Open Call - Where does a Node set context differ from a string context?


Here is the Package file attached with examples from the article.


CIS-NODE-STR_0.0.1.20141015085602.zip

One of the cool things I like about IDM is that I am constantly learning new things, new tricks, new features. There are some minor things in Designer that I only found out about after using it for years. I expect there are behaviors of the engine still to be learned as well.

What is interesting is how much of this stuff is undocumented. Mostly this is because they are edge cases, or interestingly strange behaviors that do not normally occur. Or else the bulk of things to document is so great that 100% coverage is unlikely to ever occur. But since we have Cool Solutions, I can write about stuff I find that is missed.

One very interesting thing I have been meaning to write about for a while is the difference in how 'things' behave in a nodeset vs string context.

That does not seem like a clear concept, but if you are used to IDM it should be fairly straightforward. Mostly, when you do XPATH or use local variables, you need to know whether the variable, or the target of your XPATH, is a nodeset or a string.

Obviously, XPATH differs depending on whether you have a string or a nodeset. A nodeset of XML you can walk by specifying nodes, with predicates and whatnot. A string you can use the various string functions on (contains(), substring-before(), etc.). Either way, you need to know what your target looks like before you try to use or manipulate it.

Well, what if you can define a target and get different results depending on whether you store it in a nodeset variable or in a string variable? (Heck, how about examples of an object type variable? That would be cool!)

My plan for this article is that I will write up as many examples as I can think of, and build them all into a Package I will attach to the article. If you have additional examples, contact me (email if you can guess it, my full name using GMail, or leave a comment) and I will try to add in your example, and update the Package with an example of the difference that you can run through Simulator and play with.

Take a look at the first rule "[CIS] First example - list GCV" in the Packaged Subscriber Event Transform policy "CIS-NODE-STR-Examples by Geoffrey" for the examples I will be working through below.

The simplest example I can think of is a List type Global Configuration Value (GCV). This consists of multiple values, shown in the GUI. If you set a string type local variable to it, you get the values separated by the specified separator value. (When you define a List type GCV there is an XML attribute item-separator=","). However, when you set a nodeset type local variable to it, you get a nodeset of values that you can loop over or do tricks with.

What is even more interesting is that if you do not XML Serialize the nodeset, but just try to use it again as a string (say in a Trace token), you actually only get the first value! I personally expected to get all the values concatenated together, but in fact you get only the first.

Here is my first example rule:

<rule>
  <description>[CIS] First example - list GCV</description>
  <comment xml:space="preserve">Run this through Simulator and look at the output.</comment>
  <comment name="author" xml:space="preserve">Geoffrey Carman</comment>
  <comment name="version" xml:space="preserve">1</comment>
  <comment name="lastchanged" xml:space="preserve">Oct 14, 2014</comment>
  <conditions>
    <and/>
  </conditions>
  <actions>
    <do-trace-message disabled="true">
      <arg-string>
        <token-text xml:space="preserve">No conditions, since testing in Simulator.</token-text>
      </arg-string>
    </do-trace-message>
    <do-set-local-variable name="STRING" scope="policy">
      <arg-string>
        <token-global-variable name="gcv.example.list"/>
      </arg-string>
    </do-set-local-variable>
    <do-set-local-variable name="NODESET" scope="policy">
      <arg-node-set>
        <token-global-variable name="gcv.example.list"/>
      </arg-node-set>
    </do-set-local-variable>
    <do-trace-message>
      <arg-string>
        <token-text xml:space="preserve">Look at the output as a STRING:

</token-text>
        <token-local-variable name="STRING"/>
        <token-text xml:space="preserve">

and now as a NODESET. See what I mean?

</token-text>
        <token-xml-serialize>
          <token-local-variable name="NODESET"/>
        </token-xml-serialize>
        <token-text xml:space="preserve">

But even better! Watch what happens when you do not XML Serialize the nodeset!

</token-text>
        <token-local-variable name="NODESET"/>
      </arg-string>
    </do-trace-message>
  </actions>
</rule>


Then here is the output:

Generic Null :Look at the output as a STRING:

ItemTheFirst,ItemTheSecond,ItemTheThird

and now as a NODESET. See what I mean?ItemTheFirstItemTheSecondItemTheThird

But even better! Watch what happens when you do not XML Serialize the nodeset!

ItemTheFirst
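
If you want every value rather than just the first, one option is to loop over the nodeset. This is only a minimal sketch, assuming the NODESET variable from the rule above is still in scope (the "List item: " label is my own addition); it should trace each item of the list on its own line:

```xml
<!-- Sketch: loop over the list GCV nodeset, one trace line per value -->
<do-for-each>
  <arg-node-set>
    <token-local-variable name="NODESET"/>
  </arg-node-set>
  <arg-actions>
    <do-trace-message>
      <arg-string>
        <token-text xml:space="preserve">List item: </token-text>
        <!-- current-node holds the value for this pass of the loop -->
        <token-local-variable name="current-node"/>
      </arg-string>
    </do-trace-message>
  </arg-actions>
</do-for-each>
```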



If you think that is odd, consider the case of Structured GCVs! They are even wilder. A Structured GCV is a very cool construct. It allows you to define a GCV that has numerous GCVs inside it, and then you can define multiple instances of that set of GCVs. A common use case is the SAP UM fan out configuration. (I think it was added to help support that driver, in fact.) An SAP UM (User Management) driver needs a couple of pieces of information to connect: host name of course, but also some SAP specific values like the Logical System name. Thus in fan out mode, you need a way to list the same set of four things, plus maybe a name, multiple times. So you define a Structured GCV that consists of the four GCVs you need, and once it is defined, there is a plus sign in the GCV editor that lets you add a new instance of that set. You need to connect to 8 different SAP UMs? No problem, make 8 instances and there you go.

So what does this look like in the string vs nodeset context? Like the List GCV, you need to specify a separator to delimit the values. But in addition you specify an instance separator to delimit the sets of values. Now imagine you had a List GCV inside there: yet another separator. You can quickly run out of special characters if you are not careful.

This differs a bit from the List GCV: as a nodeset you get the actual XML of the <instance> nodes (but not the <template> node, which defines what GCVs are contained, which is probably a good thing). The rule to test this is basically the same; I just changed the name of the GCV to the structured one.
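
For reference, the two Set Local Variable actions in that second rule would look roughly like this. This is a sketch reconstructed from the trace shown later in the article, not copied from the Package, so treat the details as an approximation:

```xml
<!-- String context: values and instances flattened with their separators -->
<do-set-local-variable name="STRING" scope="policy">
  <arg-string>
    <token-global-variable name="gcv.example.structured"/>
  </arg-string>
</do-set-local-variable>
<!-- Nodeset context: one <instance> node per instance of the structured GCV -->
<do-set-local-variable name="NODESET" scope="policy">
  <arg-node-set>
    <token-global-variable name="gcv.example.structured"/>
  </arg-node-set>
</do-set-local-variable>
```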

Output of that test example in the package looks something like this:
	Generic Null :          Arg Value: "Look at the output as a STRING:

First Example,mySAP.domain.com,1234,PRDCLNT100.Second Example,mySAP.domain.com,4321,QASCLINT200

and now as a NODESET. See what I mean?

<instance>
<definition display-name="Name" name="gcv.example.structured.name" type="string">
<description></description>
<value xml:space="preserve">First Example</value>
</definition>
<definition display-name="Hostname" name="gcv.example.structured.hostname" type="string">
<description></description>
<value xml:space="preserve">mySAP.domain.com</value>
</definition>
<definition display-name="Port" name="gcv.example.structured.port" type="string">
<description></description>
<value xml:space="preserve">1234</value>
</definition>
<definition display-name="Logical System Name" name="gcv.example.structured.lsname" type="string">
<description></description>
<value xml:space="preserve">PRDCLNT100</value>
</definition>
</instance><instance>
<definition display-name="Name" name="gcv.example.structured.name" type="string">
<description></description>
<value xml:space="preserve">Second Example</value>
</definition>
<definition display-name="Hostname" name="gcv.example.structured.hostname" type="string">
<description></description>
<value xml:space="preserve">mySAP.domain.com</value>
</definition>
<definition display-name="Port" name="gcv.example.structured.port" type="string">
<description></description>
<value xml:space="preserve">4321</value>
</definition>
<definition display-name="Logical System Name" name="gcv.example.structured.lsname" type="string">
<description></description>
<value xml:space="preserve">QASCLINT200</value>
</definition>
</instance>

But even better! Watch what happens when you do not XML Serialize the nodeset!

First ExamplemySAP.domain.com1234PRDCLNT100".


One thing to keep in mind: if you loop over this nodeset variable, the starting point is the <instance> node. So you could For Each over Local Variable NODESET in my example and process XPATH like $current-node/definition[@name="gcv.example.structured.port"]/value to get the Port number, for example. (Note that there is no need for an instance step in there, as the <instance> node becomes the current context inside the loop. Note also the @ in the predicate, since name is an XML attribute of the <definition> node.) I added a simple For Each to my second example rule that looks like this:

<do-for-each>
  <arg-node-set>
    <token-local-variable name="NODESET"/>
  </arg-node-set>
  <arg-actions>
    <do-trace-message>
      <arg-string>
        <token-xpath expression='$current-node/definition[@name="gcv.example.structured.port"]/value'/>
      </arg-string>
    </do-trace-message>
  </arg-actions>
</do-for-each>


You can see in trace that it shows the two different port number values. (Trace here is boring, it just traces out two numbers, but the example shows how it can be done.)
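
You do not always need a loop. As a sketch, XPATH can also pick one instance by the value of one of its members and then read another member from that same instance. The variable name SECOND-PORT is made up for illustration, and this assumes the NODESET variable holds the structured GCV as in the second rule. With the sample data shown above I would expect this to come back as 4321:

```xml
<!-- Sketch: find the instance whose Name is "Second Example", then read its Port -->
<do-set-local-variable name="SECOND-PORT" scope="policy">
  <arg-string>
    <token-xpath expression='$NODESET[definition[@name="gcv.example.structured.name"]/value="Second Example"]/definition[@name="gcv.example.structured.port"]/value'/>
  </arg-string>
</do-set-local-variable>
```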

There is another category of nodeset vs string differences, specifically in how Argument Builder content is treated. When you are in a Set Local Variable token and you click the editor button on the last line (the "Specify String" line), you get the Argument Builder sub window. The same is true for Specify Nodeset (in a nodeset variable case, or in a For Each token's Specify Nodeset line). But here the interpretation of the values in that editor changes.

When it is a string context, all the values are concatenated together. That is, you are building a long string, one element at a time.

In a nodeset context, each noun/verb token set is treated as one element of the nodeset. Thus you could not easily build a string to be one value of a nodeset there, you would have to have built it in a previous Set Local Variable token, and then include it here with a Local Variable noun token.
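
To make that concrete, here is a small sketch using the same two text tokens in each context. The variable names and values are made up for illustration. As I understand the behavior described above, the first variable should end up as the single string "ContainerAContainerB", the second as a nodeset of two values, and the third shows the workaround of building a string first and then including it as one element:

```xml
<!-- String context: tokens are concatenated into one value -->
<do-set-local-variable name="AS-STRING" scope="policy">
  <arg-string>
    <token-text xml:space="preserve">ContainerA</token-text>
    <token-text xml:space="preserve">ContainerB</token-text>
  </arg-string>
</do-set-local-variable>
<!-- Nodeset context: each token becomes its own element of the nodeset -->
<do-set-local-variable name="AS-NODESET" scope="policy">
  <arg-node-set>
    <token-text xml:space="preserve">ContainerA</token-text>
    <token-text xml:space="preserve">ContainerB</token-text>
  </arg-node-set>
</do-set-local-variable>
<!-- Workaround: include the prebuilt string as a single element, next to another value -->
<do-set-local-variable name="MIXED-NODESET" scope="policy">
  <arg-node-set>
    <token-local-variable name="AS-STRING"/>
    <token-text xml:space="preserve">ContainerC</token-text>
  </arg-node-set>
</do-set-local-variable>
```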

My favorite use case for this is the For Each case, where you perhaps need to check results from more than one query. Perhaps you need to loop over the results of queries for users in ContainerA and also ContainerB, but since they are sibling containers and you wish to ignore the ContainerC sibling, you cannot just start at the parent. In the For Each token, you could have a Query noun token with a base DN pointing at ContainerA, and then a second Query token with a base DN pointing at ContainerB.

Then the results from each are considered part of the list to be looped over. Or perhaps you had two different object classes to look at, but the Query token only supports one class at a time.

If you tried that in a string context, you would get the text values concatenated together, which is probably not what you want, and the loop would run only once, since the whole value would count as a single node.
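
A sketch of that two query For Each might look like the following. The container DNs, the User class, and the choice of the source datastore are all assumptions for illustration; adjust them to your own tree and channel:

```xml
<do-for-each>
  <arg-node-set>
    <!-- First query: users under ContainerA (hypothetical DN) -->
    <token-query class-name="User" datastore="src" scope="subtree">
      <arg-dn>
        <token-text xml:space="preserve">acme\users\ContainerA</token-text>
      </arg-dn>
    </token-query>
    <!-- Second query: users under ContainerB (hypothetical DN) -->
    <token-query class-name="User" datastore="src" scope="subtree">
      <arg-dn>
        <token-text xml:space="preserve">acme\users\ContainerB</token-text>
      </arg-dn>
    </token-query>
  </arg-node-set>
  <arg-actions>
    <!-- Each pass sees one <instance> node from either query -->
    <do-trace-message>
      <arg-string>
        <token-xpath expression="$current-node/@src-dn"/>
      </arg-string>
    </do-trace-message>
  </arg-actions>
</do-for-each>
```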

Another odd example would be the case of a nodeset of string values vs a nodeset of XML. An XML nodeset you are all familiar with. You know the common case, you want the DN of an object, so you Set Local Variable QUERY to Query for the object and store it as a nodeset. Then you Set Local Variable SRC-DN as a string to XPATH of $QUERY/@src-dn. The QUERY variable has the <instance> node that the query returned in it so you can do XPATH upon it.
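
As a sketch, that common pattern looks something like this. The base DN, class, and datastore are hypothetical placeholders. Note that if the query returns more than one <instance>, the string conversion of $QUERY/@src-dn should hand back only the first one, which is the same first value behavior seen with the List GCV:

```xml
<!-- Nodeset variable: holds the <instance> node(s) the query returned -->
<do-set-local-variable name="QUERY" scope="policy">
  <arg-node-set>
    <token-query class-name="User" datastore="src" scope="subtree">
      <arg-dn>
        <token-text xml:space="preserve">acme\users</token-text>
      </arg-dn>
    </token-query>
  </arg-node-set>
</do-set-local-variable>
<!-- String variable: the src-dn attribute of the (first) returned instance -->
<do-set-local-variable name="SRC-DN" scope="policy">
  <arg-string>
    <token-xpath expression="$QUERY/@src-dn"/>
  </arg-string>
</do-set-local-variable>
```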

You can also have a nodeset that just holds a set of string values, like in the first example's List GCV nodeset. But they are not XML so how can you use XPATH on them? Well you can do positional stuff.

In my first rule example, the $NODESET variable traces out as:

Generic Null :      Action: do-set-local-variable("NODESET",scope="policy",arg-node-set(token-global-variable("gcv.example.list"))).
Generic Null : arg-node-set(token-global-variable("gcv.example.list"))
Generic Null : token-global-variable("gcv.example.list")
Generic Null : Token Value: {"ItemTheFirst","ItemTheSecond","ItemTheThird"}.


That curly brace notation in trace is trying to show the nodeset contents (top level). To contrast with an XML nodeset, let's see what it looks like in the second rule's Structured GCV example:

Generic Null :      Action: do-set-local-variable("NODESET",scope="policy",arg-node-set(token-global-variable("gcv.example.structured"))).
Generic Null : arg-node-set(token-global-variable("gcv.example.structured"))
Generic Null : token-global-variable("gcv.example.structured")
Generic Null : Token Value: {<instance>,<instance>}.


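In that first example with a nodeset of values, you could use the XPATH of $NODESET[2] to get the second value. The predicate [2] is actually a shortcut notation for [position()=2], and that seems to matter when you want to use a variable for the position. Maybe something like XPATH of: $NODESET[$POS-VALUE]

I often have issues getting that to work, but if it does not, I find that $NODESET[position()=$POS-VALUE] almost always works. The likely reason is that XPath only applies the position shortcut when the predicate evaluates to a number. If POS-VALUE is a string type variable, the predicate is treated as a true/false test instead, and any non-empty string counts as true, so every node matches. Comparing against position() forces the numeric comparison.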
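
Here is a small sketch of the variable position case, assuming the NODESET variable from the first rule and a string type POS-VALUE variable. The number() form is an alternative I would expect to behave the same way, since it turns the string into a number before the predicate is evaluated:

```xml
<do-set-local-variable name="POS-VALUE" scope="policy">
  <arg-string>
    <token-text xml:space="preserve">2</token-text>
  </arg-string>
</do-set-local-variable>
<!-- Explicit comparison against position() -->
<do-trace-message>
  <arg-string>
    <token-xpath expression="$NODESET[position()=$POS-VALUE]"/>
  </arg-string>
</do-trace-message>
<!-- Alternative: convert the string to a number so the [n] shortcut applies -->
<do-trace-message>
  <arg-string>
    <token-xpath expression="$NODESET[number($POS-VALUE)]"/>
  </arg-string>
</do-trace-message>
```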

What did surprise me is that if you try to cast the nodeset of strings to a string, you only get the first value. I really expected all the values concatenated together. I was reviewing a policy in the Office 365 driver that relied on this trick, and I was sure it was broken, but I ran it through Simulator and lo and behold, it did work. That is the sort of thing I worry will one day change, so I would personally not rely on it. (Someone who reads and understands specifications will no doubt post the part of the XML Path Language specification that explains why it does this, but that stuff is all Greek to me! For what it is worth, XPath 1.0 is a W3C Recommendation rather than an RFC, and it does define the string value of a nodeset as the string value of its first node in document order, which matches what I saw.)
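
If you would rather not lean on the implicit conversion, you can say what you mean. This is a sketch, assuming the NODESET variable from the first rule and an engine version that includes the Join token; the variable names FIRST-VALUE and ALL-VALUES are made up for illustration:

```xml
<!-- Explicitly take the first value instead of relying on the cast -->
<do-set-local-variable name="FIRST-VALUE" scope="policy">
  <arg-string>
    <token-xpath expression="$NODESET[1]"/>
  </arg-string>
</do-set-local-variable>
<!-- Or join every value in the nodeset into one comma separated string -->
<do-set-local-variable name="ALL-VALUES" scope="policy">
  <arg-string>
    <token-join delimiter=",">
      <token-local-variable name="NODESET"/>
    </token-join>
  </arg-string>
</do-set-local-variable>
```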

That is four examples of differences. Now, lucky reader, it is your turn. Please suggest an example of your own. Feel free to email me, or post it as a comment. (Don't do stupid forum or CS messaging, I prefer almost anything else).

If it is a good one, I will try and add an example to the Package I will attach to this article, in a new build, so others can see it, and update the article to include it. I know there are more examples, but that is all I can think of at this exact moment, so I look forward to your help in expanding this topic.



