XPATH and the context node


Note: I took another swing at explaining this concept, which may be helpful as well. You can find the new article at: "Another Attempt at Explaining the Context Node".

Novell Identity Manager allows you to use a number of approaches to manipulate events that occur in the Identity Vault and in the connected systems.

With the original release of Identity Manager, when it was still called DirXML, the only real option was XSLT. XSLT is still available and some of the drivers still use it. XSLT is very powerful when you need to convert a document from one XML-based dialect to another, which is why the SOAP and Delimited Text drivers use it. The SOAP driver converts the XDS dialect that Identity Manager uses to the SOAP document dialect (SPML, DSML, or other) that you are trying to communicate with, and back again in the other direction, depending on whether the event is on the Subscriber channel (XDS -> SOAP) or the Publisher channel (SOAP -> XDS).

The Delimited Text driver converts XDS event documents to comma separated values (CSV) and vice versa using XSLT.

The SAP HR driver requires XSLT in order to format a funny-looking Query document that is used to get the relationship between people (technically Persons, but I find that plural of the word Person funny looking), Jobs, Positions, and Organizations within the HR system. The query uses nodes that the DTD supports but that are hard, if not impossible, to build any other way, so XSLT is probably your best approach to using them.

With NSure Identity Manager 2.0 we got DirXML Script, an XML-based language that is used to manipulate events. With each subsequent release of Identity Manager, more features have been added to DirXML Script: new verbs, tokens, actions, and conditions, as well as interesting enhancements to existing tokens and functions. You can read more about those new features at:

One thing that has stayed the same is that when it comes time to select parts of the document, do math, or call out to external Java classes, you need to use XPATH.

XPATH is a tricky thing. Some parts of it are really easy (like some of the string and math functions), other parts are pretty straightforward, and things like selecting can be simple in principle but surprisingly tricky in practice.

One source of information about XPATH is the W3C Recommendation that defines it, XML Path Language (XPath). (It is a W3C Recommendation rather than an RFC.)

The thing is that the specification does not really provide examples, and most web resources on XPATH are confusing since they focus on using XPATH for HTML document manipulation, and it is not obvious how they might apply to XDS event documents.

Other than that, there are a couple of good sites with some examples:



I have been trying to write some articles about interesting XPATH tidbits, you can read more at:

We still need more I think, so here goes another one.

In the article about using XPATH for doing math versus selecting a node in a node set (Some thoughts on XPATH in Novell Identity Manager) I discussed a couple of ways you can use XPATH. What seems to be most confusing to people is where we are starting from, also known as: what is the current context node?

What illustrated this for me was using the XPATH Simulator in Designer. I kept trying to use it, and nothing ever seemed to work. I was sure I had a valid XPATH selection string, and it should have selected something in the sample document, but I just could not get it to do it. Finally, I walked through the XML document in the object view, and selected the <output> node, and suddenly results appeared!

The issue all along had been that what matters most when selecting with XPATH is the context node, that is, where we are starting from. There are a number of selection methods that let you say "anywhere this node occurs" (such as a leading //), but rarely is that sufficiently fine-grained for us in an Identity Manager world.
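To make the Simulator experience above concrete, here is a minimal sketch in Python, using the standard library's xml.etree.ElementTree as a stand-in for the engine's XPATH processor (which it is not). The document and names are made up for illustration. The same relative path is evaluated from two different context nodes:

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down XDS response used only for illustration.
DOC = r"""
<nds dtdversion="3.5">
  <output>
    <instance class-name="User" src-dn="\ACME\People\jsmith">
      <attr attr-name="Surname"><value type="string">Smith</value></attr>
    </instance>
  </output>
</nds>
"""

root = ET.fromstring(DOC)      # context node: <nds>
output = root.find("output")   # context node: <output>

# One relative path, two different starting points.
RELATIVE_PATH = "instance/attr/value"

from_root = root.findall(RELATIVE_PATH)      # <nds> has no <instance> child
from_output = output.findall(RELATIVE_PATH)  # <output> does

print(len(from_root))    # 0
print(len(from_output))  # 1
```

The path did not change between the two calls. Only the starting point did, and that alone decides whether anything is selected.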

For example, a node selection of @src-dn is one of the most common ones I find myself using. That is, select the value of the XML attribute called src-dn on the current context node. The part that is so important is the "of the current context node".

For example, if you query for some objects, store the returned node set in a local variable, and then want to loop through them and read out each src-dn, then of course the XPATH select statement "@src-dn" will not be useful. It will select the same thing every time: the src-dn of the originating event.

Thus inside your loop you would select "$current-node/@src-dn", which leverages a built-in local variable holding the current node of whatever looping structure you are in the middle of. We thereby set the context to the local variable that comes from the loop, and what it looks like as XML depends entirely on what the node set you are looping through looks like.
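The difference between "@src-dn" and "$current-node/@src-dn" inside a loop can be sketched roughly as follows. This is Python with ElementTree, not DirXML Script, and the event, the query result, and the src_dn helper are all hypothetical. The loop variable plays the part of $current-node:

```python
import xml.etree.ElementTree as ET

# Hypothetical originating event and query result, for illustration only.
EVENT = ET.fromstring(r'<modify class-name="Group" src-dn="\ACME\Groups\Staff"/>')
QUERY_RESULT = ET.fromstring(r"""
<output>
  <instance class-name="User" src-dn="\ACME\People\alice"/>
  <instance class-name="User" src-dn="\ACME\People\bob"/>
</output>
""")


def src_dn(context_node):
    """Rough equivalent of evaluating '@src-dn' against context_node."""
    return context_node.get("src-dn")


# Context never moves off the event: the same DN comes back every pass.
wrong = [src_dn(EVENT) for _ in QUERY_RESULT.findall("instance")]

# Context follows the loop variable, like $current-node/@src-dn.
right = [src_dn(current_node) for current_node in QUERY_RESULT.findall("instance")]

print(wrong)
print(right)
```

The first list repeats the event's DN once per instance. The second yields one DN per returned object, which is what you wanted in the first place.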

The two most common node sets you will encounter in Novell Identity Manager for doing this sort of task are probably the results of a Query token or the result returned from the use of one of the tokens, Source Attribute, Attribute, or Destination Attribute. (For more on the difference between the three attribute tokens, please read:

The different attribute options in Identity Manager

More thoughts on Source/Destination/Operation attribute tokens in Identity Manager ).

Prior to Identity Manager 3.5 there was no Query token, and we had to use the Java Command Processor. (For more information see: The Query token in Identity Manager, and Examples of using the ParseDN Token in Identity Manager.)

The Java Command Processor is still available for use, but not really needed any longer, since the three attribute tokens (Source Attribute, Attribute, and Destination Attribute) and the Query token very nicely wrap it in a trivial-to-use interface. That is probably my favorite part of a new Identity Manager version release: seeing something that was a little tricky in the past now wrapped in an easy-to-use interface. (Unique Name is another example, wrapping a reasonable amount of logic based around the query functions to generate, and then test for, unique names in a directory.)

Regardless of which of the four options you choose to use, the engine in the background will generate a query document. It will ask the source or destination, depending on what you chose, to find all objects of the class you specify (or don't specify) that match some condition (or no match specified), and return a list of attributes (or all attributes if none specified). There are all sorts of interesting ways to tune this: setting the number of returned values (query-ex functionality, see: The Query token in Identity Manager), limiting the subtree to return a subset of objects, or specifying a full DN to return a single object's values.

The Source Attribute, Attribute, and Destination Attribute tokens basically do a query for a specified attribute on a specified DN, with no match criteria. None are needed, since we already know the DN: it is the current object's src-dn or dest-dn, or, in the case of the Source and Destination Attribute tokens, an object DN or association value that we specify.

The query document will look something like:

<nds dtdversion="3.5" ndsversion="8.x">
  <source>
    <product version=" ">DirXML</product>
    <contact>Novell, Inc.</contact>
  </source>
  <input>
    <query class-name="Organizational Unit" dest-dn="com\acme\People\GA\CN\" scope="entry">
      <read-attr attr-name="Object Class"/>
    </query>
  </input>
</nds>

Here we have a standard DirXML XDS document. It is a Query for a specific object class (Organizational Unit) against the destination system (which happens to be eDirectory in this case, coming from an SAP HR example), for a specific DN (the value of dest-dn), reading back the Object Class. (The reason for querying Object Class is that it is a mandatory attribute; even Unknown objects have an object class of Unknown. See this article for more discussion on the issue: XXX comment on Father Ramons article about Object Class).
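To see how the pieces of such a query fit together (where to look, how deep, which class, which attributes), here is a rough sketch that assembles a document of the same shape in Python. The build_query helper is purely hypothetical. The engine builds these documents for you, and this is not an Identity Manager API:

```python
import xml.etree.ElementTree as ET


def build_query(dest_dn, scope, class_name=None, read_attrs=()):
    """Assemble an XDS-style query document (illustrative helper only)."""
    nds = ET.Element("nds", {"dtdversion": "3.5", "ndsversion": "8.x"})
    query = ET.SubElement(
        ET.SubElement(nds, "input"),
        "query",
        {"dest-dn": dest_dn, "scope": scope},
    )
    # The class filter is optional, mirroring "or don't specify" above.
    if class_name is not None:
        query.set("class-name", class_name)
    # One <read-attr> per attribute we want returned.
    for name in read_attrs:
        ET.SubElement(query, "read-attr", {"attr-name": name})
    return nds


doc = build_query(
    r"com\acme\People\GA\CN",
    "entry",
    class_name="Organizational Unit",
    read_attrs=["Object Class"],
)
query = doc.find("input/query")
print(ET.tostring(doc, encoding="unicode"))
```

A scope of "entry" with a full DN asks about exactly one object, which is the shape the attribute tokens generate behind the scenes.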

The response can look something like this:

<nds dtdversion="3.5" ndsversion="8.x">
  <source>
    <product version=" ">DirXML</product>
    <contact>Novell, Inc.</contact>
  </source>
  <output>
    <instance class-name="Organizational Unit" qualified-src-dn="dc=com\O=acme\OU=People\OU=GA\OU=CN" src-dn="\ACME-DEV-AUTH\com\acme\People\GA\CN" src-entry-id="74056">
      <attr attr-name="Object Class">
        <value timestamp="1207772827#7" type="string">Organizational Unit</value>
        <value timestamp="1207772827#8" type="string">ndsLoginProperties</value>
        <value timestamp="1207772827#9" type="string">ndsContainerLoginProperties</value>
        <value timestamp="1207772827#10" type="string">Top</value>
      </attr>
    </instance>
    <status level="success"></status>
  </output>
</nds>

In this case, if we were using either the Destination Attribute token or the Query token, we might have generated this query and be planning on storing its result in a local variable that we set to be of type "nodeset" (as opposed to the default type of "string"). Let's say we called the local variable CHINA-OU; then we can start playing around with some XPATH.

Now is where it depends on how you generated the query. Let's talk about the Query token first.

The first simple example is:

$CHINA-OU/@src-dn to get the object DN in the destination data store. Now this is a silly example, since we provided it to the query token in the first place, but hey, the point is still valid and would work in other circumstances.

Then we could test to see if we got any values back, and thus if the object we are looking for actually exists. In that case, you might use a condition that tests if the following XPATH statement is true:

$CHINA-OU/attr[@attr-name="Object Class"]/value

That tests for the attr node, in the CHINA-OU node set, that has an XML attribute attr-name equal to Object Class, and then tests to see if there is a value node under it. If there is, it returns true; if there isn't (no value returned), it is false.

Thus you can test for the existence of an object.

The key point to take out of this is that we did not specify an XPATH path (was that redundant or what? Like RAM memory?) that was complex, like:

$CHINA-OU/nds/output/instance/attr[@attr-name="Object Class"]/value

It was sufficient to know that the context node for this type of operation, in Identity Manager's view of XPATH, is the <instance> node that sits under <output> in the returned document.
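Here is a small sketch of that existence test, again in Python with ElementTree rather than in policy. It assumes, as described above, that the node set held in CHINA-OU consists of the <instance> nodes. The object_exists helper and the trimmed response are mine, for illustration:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response shown earlier (timestamps omitted).
RESPONSE = ET.fromstring(r"""
<nds dtdversion="3.5" ndsversion="8.x">
  <output>
    <instance class-name="Organizational Unit" src-dn="\ACME-DEV-AUTH\com\acme\People\GA\CN">
      <attr attr-name="Object Class">
        <value type="string">Organizational Unit</value>
        <value type="string">Top</value>
      </attr>
    </instance>
    <status level="success"/>
  </output>
</nds>
""")

# Stand-in for the CHINA-OU local variable: just the <instance> nodes.
china_ou = RESPONSE.findall("output/instance")


def object_exists(node_set):
    """True if any node in the set has an Object Class value under it."""
    return any(
        node.findall("attr[@attr-name='Object Class']/value") for node in node_set
    )


exists = object_exists(china_ou)
missing = object_exists([])  # an empty node set: nothing came back

# The long path finds nothing when evaluated from an <instance> node,
# because <instance> has no <nds> child.
full_path_hits = [
    hit
    for node in china_ou
    for hit in node.findall("nds/output/instance/attr/value")
]

print(exists, missing, len(full_path_hits))
```

The short relative path works because we are already standing on the instance node. The long path fails for exactly the same reason.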

If we had used the Destination Attribute token and set a local variable to a node set of the results, what we would have in the node set would actually be just the set of <value> nodes. I do not think we would be able to get the @src-dn out of the document, since we would really only have the value nodes in memory.

Thus we could use a For Each loop on this node set (that is, when you use the for-each action, the node set is the local variable CHINA-OU), and inside the loop you could use XPATH to select things like $current-node/text() to get the value of each node as you loop through it, or perhaps test in an if-then whether $current-node[@type="string"] is true, to get the string values only.

This example seems kind of silly, but it actually has utility when looking at Group memberships in systems that do not maintain referential integrity (like Lotus Notes). In those cases, the driver and engine will try to convert the member names coming from the destination data store into DNs in eDirectory, which would be @type="dn". But since the destination could not care less what value is stored in that field (it is basically a free-form string field in Lotus Notes/Domino), the values that come back may not be valid DNs in eDirectory, or may not be associated yet. You might need to pull out those values to do something with them (like storing them in a multi-valued, string-syntax attribute on the group object), or maybe you just want to clean them up, in which case you would strip them from the operation document with Strip by XPATH expression.
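A rough sketch of that kind of sorting by @type, with made-up member values and Python standing in for the for-each and if-then actions:

```python
import xml.etree.ElementTree as ET

# Hypothetical Member attribute as it might come back from the destination.
ATTR = ET.fromstring(r"""
<attr attr-name="Member">
  <value type="dn">\ACME\People\alice</value>
  <value type="string">Bob Jones/Sales/Acme</value>
  <value type="dn">\ACME\People\carol</value>
</attr>
""")

# What an attribute token hands back: only the <value> nodes.
member_values = list(ATTR)

# Roughly $current-node[@type="string"] and $current-node/text() in a loop.
strings = [v.text for v in member_values if v.get("type") == "string"]
dns = [v.text for v in member_values if v.get("type") == "dn"]

print(strings)
print(dns)
```

The string-typed leftovers are the ones you would store elsewhere or strip out, depending on what you are trying to do.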

The case of a Query token can be more interesting, where you can return information about multiple objects, for multiple attributes, of which some might be multi-valued attributes. In order to take advantage of all the data you retrieved and get all the bits and pieces you want out of the results, you might need to have three nested for-each loops.

For a Query document that looks like:

<nds dtdversion="3.5" ndsversion="8.x">
  <source>
    <product version=" ">DirXML</product>
    <contact>Novell, Inc.</contact>
  </source>
  <input>
    <query class-name="User" dest-dn="com\acme\People" scope="subtree">
      <search-class class-name="User"/>
    </query>
  </input>
</nds>

This would return all User objects in the People.acme.com container in the tree, so it could be one or it could be thousands. Use with care! If you return too many values, be aware that you may run out of Java heap memory. (For thoughts on how much memory a node set takes up, look at this article: More thoughts on the size of a node set in Identity Manager. Suffice it to say, expect about 10K a node to be on the safe side.) You can check how much Java heap you have allocated, free, and the maximum values by looking at the examples in this article: Reading and Displaying the Value of Java Heap in Identity Manager Rules.

Now what you will get back is a node set with an <instance> node for each object it found.
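As a back-of-the-envelope check on that 10K figure, here is a tiny sketch. Treating each returned object as one "node" is my simplification here, and the 5,000-user count is made up:

```python
KB_PER_NODE = 10  # the rough "about 10K a node" rule of thumb from above


def estimate_node_set_mb(node_count, kb_per_node=KB_PER_NODE):
    """Rough heap estimate in MB for a node set of node_count nodes."""
    return node_count * kb_per_node / 1024


estimate = estimate_node_set_mb(5000)  # hypothetical 5,000 returned users
print(round(estimate, 1))  # 48.8
```

So a subtree query that matches a few thousand users can plausibly eat tens of megabytes of heap, which is worth comparing against what the engine's JVM actually has free.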

Inside the <instance> node, there will be an XML attribute src-dn="TREE\O\OU\OU\ObjectName", and then underneath the <instance> node there will be a series of <attr> nodes with an XML attribute attr-name="Attribute Name". Underneath the <attr> node will be either a single <value> node or many <value> nodes at the same level, for multi valued attributes.

You can see how you would need to use three nested for-each loops to get at everything in the query results.

As before, we set a node set local variable called BIGGER-QUERY to the result of the query shown above, so our first for-each loop's node set will be the local variable BIGGER-QUERY. (We could have used the XPATH $BIGGER-QUERY, which would be the same thing.)

Now in this loop, we can do things like set a local variable CURRENT-USER to the XPATH $current-node/@src-dn to store the user's DN for use inside all the nested loops.

This first for-each loop would run through all the <instance> nodes, one per User object found, until they are all done.

Then we would have another for-each loop inside the first one, to loop through the <attr> nodes of the current instance, that is, a node set of XPATH $current-node/attr.

In this loop $current-node is now that loop's own current node, an <attr> node, so we can test whether it is the attribute we want with $current-node[@attr-name="Some Attribute"] and possibly do something with the value.

If the attribute is multi valued, then there will be several <value> nodes inside the <attr> node, so we would need a third for each loop to work through that set as well.

The nodeset for our third for each loop would be XPATH of $current-node/value and we could test again like above for @type="string" or the like. Perhaps we want to build a text string to send in an Audit event or an email or something else like that.
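Put together, the three nested loops look roughly like this. It is sketched in Python with a made-up result set. Each Python loop variable corresponds to what $current-node would be at that level in policy:

```python
import xml.etree.ElementTree as ET

# Hypothetical query result: the <instance> nodes held in BIGGER-QUERY.
BIGGER_QUERY = ET.fromstring(r"""
<output>
  <instance class-name="User" src-dn="\ACME\acme\People\alice">
    <attr attr-name="Surname"><value type="string">Adams</value></attr>
    <attr attr-name="Telephone Number">
      <value type="teleNumber">555-0100</value>
      <value type="teleNumber">555-0101</value>
    </attr>
  </instance>
  <instance class-name="User" src-dn="\ACME\acme\People\bob">
    <attr attr-name="Surname"><value type="string">Baker</value></attr>
  </instance>
</output>
""").findall("instance")

rows = []
for instance in BIGGER_QUERY:                 # loop 1: one pass per object
    current_user = instance.get("src-dn")     # like $current-node/@src-dn
    for attr in instance.findall("attr"):     # loop 2: like $current-node/attr
        for value in attr.findall("value"):   # loop 3: like $current-node/value
            rows.append((current_user, attr.get("attr-name"), value.text))

for row in rows:
    print(row)
```

Note that the user's DN is saved in the outer loop. Once the inner loops start, the current node has moved down to an attr or value node, and the src-dn is no longer reachable with a simple @src-dn.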

Hopefully this example shows pretty simply how the context node matters, and everything is based on where it currently is, when using XPATH.

I think I will need to continue on this thread, and possibly take another swing at it with more examples. Hmm.

