Adding Regular Expression Matching to XPath in Identity Manager

One of the things I love about Identity Manager is that there is always another way to accomplish something. But that can be frustrating as well, if the way I thought I wanted to do it does not actually work the way I thought it would. As an example, I was working on a DelimText driver to feed some data to an application in a CSV file. This application needs data from the homePostalAddress attribute, but only for certain specific values of homePostalAddress.



The homePostalAddress attribute is defined in RFC 1274 (http://www.ietf.org/rfc/rfc1274.txt) as a structure containing six strings, each of which "should" be no more than 30 characters long. eDirectory implements this faithfully, as a structured attribute containing six strings. All six must be present, though they need not all contain actual data. This can make homePostalAddress especially challenging in the DelimText driver, which wants to write out attribute (string) data to a CSV file.



In the particular implementation I was working on, I needed to check each of the six strings for actual data, then write out only the components that held something non-blank. Because homePostalAddress is a structured attribute of string components, XPath was the first thing I thought of. XPath allows us to refer to an individual component of the structure and to extract its string value.



In my XDS documents, I have sample data like:




<add-attr attr-name="homePostalAddress">
<value timestamp="1239954984#7" type="structured">
<component name="string">Ellwood Blues</component>
<component name="string">1060 West Addison</component>
<component name="string">" "</component>
<component name="string">Chicago</component>
<component name="string">IL</component>
<component name="string">60611</component>
</value>
</add-attr>



and:




<add-attr attr-name="homePostalAddress">
<value timestamp="1239954984#7" type="structured">
<component name="string">Herman Munster</component>
<component name="string">1313 Mockingbird Ln.</component>
<component name="string">Mockingbird Heights</component>
<component name="string">California</component>
<component name="string">90001</component>
<component name="string">1313</component>
</value>
</add-attr>



Note how the first sample has only five useful strings, and one blank, while the second has all six strings of the structure in use.



To make things easier to work with when processing <add>, <modify>, and <instance> documents, the first thing I did was to get the value of the operation attribute (if available) into a local variable called homeAddr. This hides the difference between an <add-attr> and an <attr>, which can otherwise make this more difficult than necessary. The local variable must be of type node set, so that it maintains the structure:




<do-set-local-variable name="homeAddr" scope="policy">
<arg-node-set>
<token-op-attr name="homePostalAddress"/>
</arg-node-set>
</do-set-local-variable>



Now that we have the value in a local variable, the XPath expression to refer to the first string of this structure is:




$homeAddr/component[1]



The other five string components can be accessed the same way.



I only want the components that contain at least one non-blank character. This would be easy, I thought: all I have to do is make sure that the component string matches a simple regular expression (".*[A-Za-z0-9]+.*"). If it does not, I don't want the value.



My first attempt was to use an <if-xpath> "expression is true" condition like:




<do-if>
<arg-conditions>
<and>
<if-xpath op="true">$homeAddr/component[1]=".*[A-Za-z0-9]+.*"</if-xpath>
</and>
</arg-conditions>
<arg-actions>
[...do some actions here...]
</arg-actions>
<arg-actions/>
</do-if>



That would be great, except that in XPath "=" is a plain string equality comparison and does not support regular expression matching. The condition above is only true when the component is literally the text of the pattern.
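The difference is easy to see outside the policy engine. The following is a minimal sketch in plain ECMAScript, not IDM policy: the sample value comes from the first address above, the variable names are only for illustration, and the pattern is assumed to use "+" (one or more letters or digits).

```javascript
// Sample component value and the pattern from the text above.
var component = "Ellwood Blues";
var pattern = ".*[A-Za-z0-9]+.*";

// What "=" does: compare the value with the pattern text, character for character.
var literalEqual = (component === pattern);

// What was wanted: treat the pattern as a regular expression.
var regexMatch = new RegExp(pattern).test(component);

console.log(literalEqual); // false
console.log(regexMatch);   // true
```

The equality test only succeeds if the attribute value happens to be the pattern text itself, which is why the <if-xpath> condition never fires for real address data.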



I initially worked around this with a bunch of local variables:




<do-set-local-variable name="hpa1" scope="policy">
<arg-string>
<token-xpath expression="$homeAddr/component[1]"/>
</arg-string>
</do-set-local-variable>
<do-set-local-variable name="hpa2" scope="policy">
<arg-string>
<token-xpath expression="$homeAddr/component[2]"/>
</arg-string>
</do-set-local-variable>
<do-set-local-variable name="hpa3" scope="policy">
<arg-string>
<token-xpath expression="$homeAddr/component[3]"/>
</arg-string>
</do-set-local-variable>
<do-set-local-variable name="hpa4" scope="policy">
<arg-string>
<token-xpath expression="$homeAddr/component[4]"/>
</arg-string>
</do-set-local-variable>
<do-set-local-variable name="hpa5" scope="policy">
<arg-string>
<token-xpath expression="$homeAddr/component[5]"/>
</arg-string>
</do-set-local-variable>
<do-set-local-variable name="hpa6" scope="policy">
<arg-string>
<token-xpath expression="$homeAddr/component[6]"/>
</arg-string>
</do-set-local-variable>

<do-if>
<arg-conditions>
<and>
<if-local-variable mode="regex" name="hpa1" op="equal">.*[A-Za-z0-9]+.*</if-local-variable>
</and>
</arg-conditions>
<arg-actions>
[...do some actions here...]
</arg-actions>
<arg-actions/>
</do-if>
<do-if>
<arg-conditions>
<and>
<if-local-variable mode="regex" name="hpa2" op="equal">.*[A-Za-z0-9]+.*</if-local-variable>
</and>
</arg-conditions>
<arg-actions>
[...do some actions here...]
</arg-actions>
<arg-actions/>
</do-if>
<do-if>
<arg-conditions>
<and>
<if-local-variable mode="regex" name="hpa3" op="equal">.*[A-Za-z0-9]+.*</if-local-variable>
</and>
</arg-conditions>
<arg-actions>
[...do some actions here...]
</arg-actions>
<arg-actions/>
</do-if>
<do-if>
<arg-conditions>
<and>
<if-local-variable mode="regex" name="hpa4" op="equal">.*[A-Za-z0-9]+.*</if-local-variable>
</and>
</arg-conditions>
<arg-actions>
[...do some actions here...]
</arg-actions>
<arg-actions/>
</do-if>
<do-if>
<arg-conditions>
<and>
<if-local-variable mode="regex" name="hpa5" op="equal">.*[A-Za-z0-9]+.*</if-local-variable>
</and>
</arg-conditions>
<arg-actions>
[...do some actions here...]
</arg-actions>
<arg-actions/>
</do-if>
<do-if>
<arg-conditions>
<and>
<if-local-variable mode="regex" name="hpa6" op="equal">.*[A-Za-z0-9]+.*</if-local-variable>
</and>
</arg-conditions>
<arg-actions>
[...do some actions here...]
</arg-actions>
<arg-actions/>
</do-if>



which does work, but with six components to check, the resulting policy code was somewhat ugly. I really wanted to have a regular expression match without having to go through local variables to do it.



As I said in the introduction, the strength of IDM is that there is always another way to do something, even if it is not immediately obvious how. You can even invent your own new way to do something. So I decided I was going to make it do what I wanted in the first place.



IDM allows you to extend the functionality of the product with ECMAScript programming. You can write your own extension functions, then call them from within your policies, to do almost anything you can imagine. Better yet, some research with Google showed that ECMAScript has the regular expression handling I wanted as built-in functionality.



In my projects, each ID Vault has a Policy Library. This is used to store common policy sets used by multiple drivers. It is also used to store ECMAScript programs.



So as the first step, in IDM Designer (http://www.novell.com/coolsolutions/dirxml/designer/) I created a new ECMAScript object in my Policy Library called "Matches RegEx" and put this code in it:




/** Return true or false based on a Regular Expression match
* @param {String} s1 the string to test
* @param {String} s2 the regex string to test it with
* @type Boolean
* @return a boolean true or false
*/
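function matches(s1,s2)
{
    // No RegExp flags are used; see the note on options below.
    var options = "";

    // Compile the pattern string into a regular expression.
    var re = new RegExp(s2, options);

    // match() returns null when nothing matches, so map the result to a boolean.
    if (s1.match(re)) {
        return true;
    } else {
        return false;
    }
}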



This defines a new function called "matches". It takes two parameters: the first is the string to be tested, the second is a regular expression. The return value is a boolean, either true or false. You can test this in Designer to see that it works, with samples like:




matches("fish","[a-z]+")


which returns true, and:




matches("fish","[0-9]+")


which returns false.


I have chosen, for the moment at least, not to support any of the options to the RegExp() constructor. Supporting options like case-insensitive tests is left as an exercise for the reader.
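For readers who want a starting point for that exercise, here is one possible sketch. The function name matchesWithOptions and its third parameter are hypothetical and not part of the original script, and I have not verified how IDM passes an omitted argument from XPath, so treat the default handling as an assumption.

```javascript
/** Hypothetical variant of matches() that accepts RegExp flags.
* @param {String} s1 the string to test
* @param {String} s2 the regex string to test it with
* @param {String} options RegExp flags such as "i" (assumed optional)
* @type Boolean
* @return a boolean true or false
*/
function matchesWithOptions(s1, s2, options)
{
    // Treat a missing or empty third argument as "no flags".
    var flags = options ? String(options) : "";
    var re = new RegExp(s2, flags);

    // test() returns a boolean directly.
    return re.test(String(s1));
}

console.log(matchesWithOptions("FISH", "[a-z]+"));      // false
console.log(matchesWithOptions("FISH", "[a-z]+", "i")); // true
```

Keeping this as a separate function leaves the two-argument matches() untouched, so policies that already call it do not need to change.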



Once this has been created and saved into the Designer project, go to the Driver that is going to use it, open the Driver Properties page, go to Driver Configuration, and on the ECMAScript tab add the "Matches RegEx" script.



Now that the Driver has been linked to its ECMAScript from the Library, it is time to make use of it. Returning to my policy from earlier, I can now do what I wanted to do in the first place, using my new ECMAScript extension. In the policy, declare the ECMAScript namespace (es):




<policy xmlns:es="http://www.novell.com/nxsl/ecmascript">



Then, the <if-xpath> expression can be:




<do-if>
<arg-conditions>
<and>
<if-xpath op="true">es:matches(string($homeAddr/component[1]),".*[A-Za-z0-9]+.*")</if-xpath>
</and>
</arg-conditions>
<arg-actions>
[...do some actions here...]
</arg-actions>
<arg-actions/>
</do-if>



This uses the XPath string() function to convert $homeAddr/component[1] to a string value. Then the string and the regular expression are passed to the matches() function, which does the regular expression compare and returns true or false. All that remained then was to repeat this for the remaining five components. The resulting policy looks like:




<do-if>
<arg-conditions>
<and>
<if-xpath op="true">es:matches(string($homeAddr/component[1]),".*[A-Za-z0-9]+.*")</if-xpath>
</and>
</arg-conditions>
<arg-actions>
[...do some actions here...]
</arg-actions>
<arg-actions/>
</do-if>
<do-if>
<arg-conditions>
<and>
<if-xpath op="true">es:matches(string($homeAddr/component[2]),".*[A-Za-z0-9]+.*")</if-xpath>
</and>
</arg-conditions>
<arg-actions>
[...do some actions here...]
</arg-actions>
<arg-actions/>
</do-if>
<do-if>
<arg-conditions>
<and>
<if-xpath op="true">es:matches(string($homeAddr/component[3]),".*[A-Za-z0-9]+.*")</if-xpath>
</and>
</arg-conditions>
<arg-actions>
[...do some actions here...]
</arg-actions>
<arg-actions/>
</do-if>
<do-if>
<arg-conditions>
<and>
<if-xpath op="true">es:matches(string($homeAddr/component[4]),".*[A-Za-z0-9]+.*")</if-xpath>
</and>
</arg-conditions>
<arg-actions>
[...do some actions here...]
</arg-actions>
<arg-actions/>
</do-if>
<do-if>
<arg-conditions>
<and>
<if-xpath op="true">es:matches(string($homeAddr/component[5]),".*[A-Za-z0-9]+.*")</if-xpath>
</and>
</arg-conditions>
<arg-actions>
[...do some actions here...]
</arg-actions>
<arg-actions/>
</do-if>
<do-if>
<arg-conditions>
<and>
<if-xpath op="true">es:matches(string($homeAddr/component[6]),".*[A-Za-z0-9]+.*")</if-xpath>
</and>
</arg-conditions>
<arg-actions>
[...do some actions here...]
</arg-actions>
<arg-actions/>
</do-if>



The <if-xpath> token now has Regular Expression support, and the policy code is much simpler and easier to follow.
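To see what the six checks do to real data, the logic can be tried in any ECMAScript interpreter. This is only a sketch: the array stands in for the six components of the first sample document, the variable names are for illustration, and the pattern is assumed to use "+" (one or more letters or digits).

```javascript
// Same logic as the matches() function defined earlier in this article.
function matches(s1, s2)
{
    var re = new RegExp(s2, "");
    return s1.match(re) ? true : false;
}

// Component values copied from the first sample document.
// The third one is the blank placeholder, quote-space-quote.
var homeAddr = ["Ellwood Blues", "1060 West Addison", "\" \"", "Chicago", "IL", "60611"];
var pattern = ".*[A-Za-z0-9]+.*";

// Keep only the components that pass the same test the policy applies.
var kept = [];
for (var i = 0; i < homeAddr.length; i++) {
    if (matches(homeAddr[i], pattern)) {
        kept.push(homeAddr[i]);
    }
}

console.log(kept.join(",")); // Ellwood Blues,1060 West Addison,Chicago,IL,60611
```

The blank third component is dropped and the other five survive, which is the result the policy is meant to produce for this record.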



Comment List
  • AJC is an ecmascript library that has shipped with some IDM drivers since at least IDM 3.6
    In IDM 4.x this is packaged as NOVLLIBAJC-JS

    There is an equivalent regex match function contained in this ecmascript library.

    syntax is the same as outlined in this cool solution, however the function is called match not matches.
  • if you like minimalistic code like me, try the following :-)

    function matches(txt,re)
    {
    return Boolean(txt.match(re));
    }