Trying to understand the Managed System Gateway driver in IDM 4 - Part 4

In Identity Manager 4.0 Novell has introduced a number of new features. There are four new driver configurations, two for applications (Salesforce.com and Sharepoint) and two for IDM itself to use, the Managed System Gateway driver and the Data Collection Service driver.

The Managed System Gateway driver is primarily used by the Reporting module to get information about users out of IDM and into the Reporting database. This is somewhat analogous to how the Identity Audit extension policies that were added to drivers are used to get Identity information into the Sentinel database.

As with many things in IDM 4, this is totally new stuff, and will take some time to get used to. You can read more about the changes between the various IDM versions in these articles:






One of the main new features is Packages, which is critical to all of this working, since there are packages that add support for the Reporting module to each of the many driver configurations. In fact the same approach is used for the Identity Audit extensions as well. This is different from the past, where the policies were stored centrally in Libraries and linked into each driver as needed. Now with Packages the content is actually duplicated in many places, but upgrades are easier than under the previous model. I have been working on a series on Packages in Designer 4 that you can use to gain some insight:











The Managed System Gateway (MSG) driver is one interesting critter. It is doing so many funky and interesting things that it is worth discussing the low level functionality. After all, if you do not know what it is supposed to be doing, how would you know what it is not doing when it is not working? Most connected system drivers are pretty traditional. That is, an event comes out of the application or eDirectory as an XDS document (which is the shim's job: convert the application's events into XDS, and convert XDS into things the application understands), which is then processed in the flow.


  • In the first article Trying to understand the Managed System Gateway driver in IDM 4 - Part 1 of this series I started looking at how it builds up a cache of various interesting things. It stores them as driver scoped local variables. That means they are available while the driver is running, and on every restart they need to be recalculated. We talked about the SERVER_INTERFACE and DRIVER_DN variables. These are set in the Input Transform, in a series of policy objects.

  • In the second article Trying to understand the Managed System Gateway driver in IDM 4 - Part 2 I started looking at one of the most audacious things I have ever seen a driver policy try to do! Read back another driver's DirXML Script policy, and figure out what it is doing! I was deeply astonished when I realized what it was trying to do! In fact my first reaction was, no way are they that crazy. Well apparently they are. I got through how they detect that a User is being managed, and how they detect that attribute based matching is happening.



In this article I would like to discuss some shortcomings of that approach, and move on to how the queries are handled from these cache variables.

In the previous article we saw that the Managed System Gateway (MSG) driver reads back the Matching policies in both channels, and looks for references to User objects. These can be direct references (If class-name = User, or a match on User via regular expressions, like User|Group) or indirect ones (if there is no explicit reference to the User object, then it must be indirectly referenced, perhaps in a case where there is only a single matching rule that applies to all objects). Then it looks for Find Matching Object tokens, and looks at the read-attr nodes to understand what attribute is being used to identify the object.

This would probably have issues with a couple of common cases I have seen and in fact implemented. For example, if you had a test for if class name equals Group, and then another matching rule that looked for if class name not equal to Group, then you would have an indirect match, but the attribute used might be for the Group as opposed to the User object. This would probably be pretty common, and might not be too big a deal depending on exactly how the Reporting application uses the matching data. I was informed that the notion was to provide a way for the Reporting application to report on associated objects, of course (using the code added to the driver via a Package), but also to report on the unassociated users, and possibly suggest a reason why they might not have matched. If you have correctly inferred that this driver is matching on the Full Name attribute, and you find unmatched users, you might check the database to see whether other users with the same full name exist (a duplication case) or do not exist anywhere else (did not match). This gives some additional useful insight into the connected systems.
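
To make the Group versus not-Group pitfall concrete, here is a rough Python sketch. It is not the MSG driver's actual logic, just a naive stand-in for the direct/indirect idea described above. The policy text, the classify and match_attrs helpers, and the attribute names are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical matching policy: one rule for Group objects, and one rule
# for "anything that is not a Group" (which in practice means Users).
POLICY = """
<policy>
  <rule>
    <description>Match groups on CN</description>
    <conditions><and>
      <if-class-name mode="nocase" op="equal">Group</if-class-name>
    </and></conditions>
    <actions>
      <do-find-matching-object scope="subtree">
        <arg-match-attr name="CN"/>
      </do-find-matching-object>
    </actions>
  </rule>
  <rule>
    <description>Match everything else on Full Name</description>
    <conditions><and>
      <if-class-name mode="nocase" op="not-equal">Group</if-class-name>
    </and></conditions>
    <actions>
      <do-find-matching-object scope="subtree">
        <arg-match-attr name="Full Name"/>
      </do-find-matching-object>
    </actions>
  </rule>
</policy>
"""

def classify(rule):
    """Naive check: 'direct' only if a class-name test names User explicitly."""
    for test in rule.iter("if-class-name"):
        names = (test.text or "").split("|")
        if test.get("op") == "equal" and "User" in names:
            return "direct"
    return "indirect"

def match_attrs(rule):
    """Attribute names used by Find Matching Object in this rule."""
    return [arg.get("name") for arg in rule.iter("arg-match-attr")]

rules = ET.fromstring(POLICY).findall("rule")
report = [(classify(rule), match_attrs(rule)) for rule in rules]

# Neither rule names User, so both count as indirect references, and the
# Group rule's CN gets reported alongside the real User attribute.
indirect_attrs = [attrs for kind, attrs in report if kind == "indirect"]
```

With this policy both rules come back as indirect, so anything consuming the result has no way to tell that CN only ever applies to Groups.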

If you did your matching somewhere other than in the Matching policy set, then of course this rule would fail. (As we noted in the previous couple of articles, the policy set that a specific Policy object is linked to is determined by the DirXML-Policies attribute, which stores a DN (the policy object), an interval (a numerical representation of each of the possible policy set locations) and a level (the ordering of the various linked policies in this policy set).) However that is a pretty rare edge case these days. In the DirXML 1.1a days it was a pretty common approach, but ever since Nsure Identity Manager 2.0 it has not really been an issue.

Overall this is a very ambitious process to have attempted, and it looks like they struck a nice balance of complexity and elegance. I am impressed.

Now on to the rest of the interesting things this driver can do.

Now that we have the Server Cache, the Driver Cache, the Managed System Cache, and the Rule Cache data all initialized, how is it utilized?

Well, this is also a very interesting approach. When the MSG driver shim needs to get some of this information it uses a standard XDS (Novell's XML dialect for Identity Manager) query document, but the main trick is that the class-name XML attribute is set to some API call name.

Here is an example coming from the shim on the Publisher channel, of the driver querying for data in the Server cache. We get a query document with the class-name of SERVER_INTERFACE, asking for the value, of the search-attr named protocol, whose node component has the value NCP.


[11/17/10 16:29:30.751]:Managed System Gateway Driver :
<nds dtdversion="3.5" ndsversion="8.x">
<source>
<product build="4.0.0" instance="Managed System Gateway Driver" version="4.0.0">Identity Manager Managed System Gateway Driver</product>
<contact>Novell, Inc.</contact>
</source>
<input>
<query class-name="SERVER_INTERFACE" scope="subtree">
<search-class class-name="SERVER_INTERFACE"/>
<search-attr attr-name="protocol">
<value>NCP</value>
</search-attr>
<read-attr/>
</query>
</input>
</nds>



In the Publisher event transformation Policy set there is a set of rules, one per type of query, to make debugging easier, that transforms the query, by performing the query against the appropriate cache variable and getting back the requested data. Or perhaps more correctly, selects via XPATH the correct result from the cache variable.

For the other cases, of reading out of the cache it is a bit simpler, though each has its complexities. For example, reading out of the RULE_CACHE variable uses a simple XPATH of:
$RULE__CACHE/cache/instance[@class-name=$api-name][association=$guid]



This means in the RULE_CACHE variable, under the <cache> node, find the <instance> node whose class-name is the api-name (which in this case is MANAGED_SYSTEM_MATCHING) and whose association value (which is the driver GUID we stored before in the driver instance cache) matches the GUID the query came in with. (Note that the SERVER_INTERFACE query does not have an association, so this will clearly not work as shown there).
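
If you want to play with that selection outside of the engine, here is a small Python sketch using the standard library's ElementTree, which handles this pair of predicates. The cache contents are invented for the example (this article does not show the real RULE_CACHE layout beyond the node names in the XPATH), the GUIDs are placeholders, and lookup is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Made-up cache contents: two drivers, each with a matching entry.
RULE_CACHE = ET.fromstring("""
<cache>
  <instance class-name="MANAGED_SYSTEM_MATCHING">
    <association>11111111-1111-1111-1111-111111111111</association>
    <attr attr-name="matching"><value>Full Name</value></attr>
  </instance>
  <instance class-name="MANAGED_SYSTEM_MATCHING">
    <association>22222222-2222-2222-2222-222222222222</association>
    <attr attr-name="matching"><value>CN</value></attr>
  </instance>
</cache>
""")

def lookup(cache, api_name, guid):
    """Select instances by class-name attribute and association child text."""
    # The parsed root already is the <cache> node, so the path starts
    # at instance rather than at cache/instance.
    path = f"instance[@class-name='{api_name}'][association='{guid}']"
    return cache.findall(path)

hits = lookup(RULE_CACHE, "MANAGED_SYSTEM_MATCHING",
              "22222222-2222-2222-2222-222222222222")
matched_on = hits[0].find("attr/value").text

# A query with no usable association simply selects nothing.
no_hits = lookup(RULE_CACHE, "MANAGED_SYSTEM_MATCHING", "")
```

The second call shows why a query without an association cannot use this expression: the association predicate filters every instance out.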

For the specific case of the SERVER_INTERFACE api query, it is a bit trickier. The Publisher channel event transform policy object NOVLIDMMSGWB-pub-etp-DispatchServerQuery is slightly more complex than the NOVLIDMMSGWB-pub-etp-DispatchRuleQuery example.

First this rule needs to get the prefferredProtocol out of the query doc with this XPATH:
./search-attr[@attr-name='protocol']/value[1]/text()



I think the leading period is superfluous but that is not a big deal. Find the <search-attr> node whose attr-name is protocol, and then get the string from the first value node. Not having seen enough examples, I am not sure if the predicate [1] is even needed, since that would depend on whether they ever send a multivalued search node. However it still works fine. The predicate [1] is actually shorthand notation for [position()=1] but the short way is so much nicer.
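
Here is a quick way to convince yourself of both points, sketched in Python with the standard library's ElementTree. The second <value> node is my own addition to show what the [1] predicate does; the query in the trace only carries one value. ElementTree does not support text(), so the sketch reads the .text property of the selected node instead.

```python
import xml.etree.ElementTree as ET

# The query from the trace, plus a hypothetical second value.
QUERY = ET.fromstring("""
<query class-name="SERVER_INTERFACE" scope="subtree">
  <search-class class-name="SERVER_INTERFACE"/>
  <search-attr attr-name="protocol">
    <value>NCP</value>
    <value>LDAPS</value>
  </search-attr>
  <read-attr/>
</query>
""")

# With and without the leading "./" the same node is selected.
with_dot = QUERY.find("./search-attr[@attr-name='protocol']/value[1]")
without_dot = QUERY.find("search-attr[@attr-name='protocol']/value[1]")

# The [1] predicate only matters once there is more than one value.
first_only = QUERY.findall("search-attr[@attr-name='protocol']/value[1]")
all_values = QUERY.findall("search-attr[@attr-name='protocol']/value")

protocol = with_dot.text
```

With a single value, as in the real trace, first_only and all_values would hold the same one node.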

There is a local variable named remote that the driver sets up if it is running under the Remote Loader, as opposed to locally in an eDirectory instance. This has a consequence here: when running locally in an eDirectory instance it can make an NCP connection, but when running remotely it looks like they should be using LDAPS.

So depending on which situation you are in, it might return LDAPS, even if you asked for NCP. Regardless of the protocol choice, it basically selects the data from the cache variable with the XPATH of:

$SRVR__CACHE/cache/instance[@class-name=$api-name]//value[component[@name='protocol']/text()=$protocol]



That is, in the SRVR_CACHE variable, under the <cache> node, find the <instance> node whose class-name is the queried api-name (set as a variable a moment earlier), and then select any (//) <value> nodes that have a component named protocol whose string value is equal to the requested protocol name in the query doc. (The $protocol variable was of course also set a moment ago in our XPATH example for prefferredProtocol).
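
Here is roughly what that selection, plus the remote override, looks like as a Python sketch. A few loud assumptions: the NCP entry is the one from the trace in this article, but the LDAPS entry (address and port 636) is made up for illustration, the "remote always means LDAPS" rule is my simplification of the behaviour described above, and select_interfaces and as_dict are hypothetical helper names. ElementTree cannot express the nested predicate directly, so the inner test is done in a loop.

```python
import xml.etree.ElementTree as ET

SRVR_CACHE = ET.fromstring("""
<cache>
  <instance class-name="SERVER_INTERFACE" src-dn="CN=idv,OU=servers,O=system">
    <association>BD9194F1-001A-2549-5C8A-BD9194F1001A</association>
    <attr attr-name="interface">
      <value type="structured">
        <component name="protocol">NCP</component>
        <component name="address">172.17.5.111</component>
        <component name="port">524</component>
      </value>
      <value type="structured">
        <component name="protocol">LDAPS</component>
        <component name="address">172.17.5.111</component>
        <component name="port">636</component>
      </value>
    </attr>
  </instance>
</cache>
""")

def select_interfaces(cache, api_name, requested, remote):
    """Return the <value> nodes whose protocol component matches."""
    # Assumption: a remote driver cannot use NCP, so answer with LDAPS
    # no matter which protocol the query asked for.
    protocol = "LDAPS" if remote else requested
    hits = []
    for instance in cache.findall(f"instance[@class-name='{api_name}']"):
        for value in instance.iter("value"):  # same idea as //value
            components = value.findall("component[@name='protocol']")
            if any(c.text == protocol for c in components):
                hits.append(value)
    return hits

def as_dict(value):
    """Flatten a structured value into a name-to-text dictionary."""
    return {c.get("name"): c.text for c in value.findall("component")}

local = as_dict(select_interfaces(SRVR_CACHE, "SERVER_INTERFACE", "NCP", False)[0])
remote = as_dict(select_interfaces(SRVR_CACHE, "SERVER_INTERFACE", "NCP", True)[0])
```

Both calls ask for NCP, but only the local one gets the NCP interface back.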

The rest of the complexity of this particular rule, is actually nicely commented, which I am quite happy to see, using traced messages, to indicate that the connection the driver can make depends on how it is running. That is, if running remote it needs to use LDAPS to connect, not NCP, and in fact it tries to send the right response, even if asked for the incorrect information. Very nicely done.

I did notice that a neat trick someone showed me has been used in this driver. I love the comments field on a rule object, and use it very heavily. In fact, when I did a SOAP driver, I found that pasting sample XML documents from each stage of the process into the rules converting them from one format to another was a huge help. In the Input and Output transform, you need to convert between the two XML dialects, XDS (Novell's) and your SOAP dialect. As you develop that, you can test it with Simulator in Designer, but that needs a sample document, which you might have saved somewhere, or not. By pasting the actual XML into the comment field, you get to see the before and after when you look at the rule, which makes it much easier to understand what it is doing when you do not have a trace handy. But even better, if you copy the XML back out, you can paste it into Simulator and test the rule you are building or, even more importantly, the issue you are trying to understand. That is, if you come back to a rule a year later and something has changed and it is not working, you have a copy of what did work. You can compare it to what you are getting now, see the difference, figure out how to handle it, and test it in Simulator. Very useful. If you are interested in more information on that SOAP driver discussion you can read about it in these articles:












Thus the data it needs to return is the following snippet:

<instance class-name="SERVER_INTERFACE" src-dn="CN=idv,OU=servers,O=system">
<association>BD9194F1-001A-2549-5C8A-BD9194F1001A</association>
<attr attr-name="interface">
<value type="structured">
<component name="protocol">NCP</component>
<component name="address">172.17.5.111</component>
<component name="port">524</component>
</value>
</attr>
</instance>



But this is still in the Publisher channel, and eDirectory is not going to respond intelligibly to this query anyway, since the object class does not exist, so you would get back an empty response document. The policy then adds the cache result to the query document in the operation-data node. The engine tracks that node: it strips it off just as the document is submitted either to the shim or to the engine (I think it may just ignore it on the engine side, but it definitely strips it before it hits the shim, and you will see a line in Dstrace noting that it has been stripped), and once the document is returned, the engine re-adds the tracked operation-data to the result document.

This is meant to be an approach to allow you to pass data between channels quite easily.

The query after the Publisher event transform looks something like this:

<nds dtdversion="3.5" ndsversion="8.x">
<source>
<product build="4.0.0" instance="Managed System Gateway Driver" version="4.0.0">Identity Manager Managed System Gateway Driver</product>
<contact>Novell, Inc.</contact>
</source>
<input>
<query class-name="SERVER_INTERFACE" scope="subtree">
<search-class class-name="SERVER_INTERFACE"/>
<search-attr attr-name="protocol">
<value>NCP</value>
</search-attr>
<read-attr/>
<operation-data api-name="SERVER_INTERFACE">
<instance class-name="SERVER_INTERFACE" src-dn="CN=idv,OU=servers,O=system">
<association>BD9194F1-001A-2549-5C8A-BD9194F1001A</association>
<attr attr-name="interface">
<value type="structured">
<component name="protocol">NCP</component>
<component name="address">172.17.5.111</component>
<component name="port">524</component>
</value>
</attr>
</instance>
</operation-data>
</query>
</input>
</nds>



You can see the entire <instance> node is returned inside the <operation-data> node.
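
As a rough illustration of that step, here is a Python sketch of hanging a cached <instance> off the query inside an <operation-data> node. It only mimics the shape of the document above. attach_operation_data is a hypothetical helper, not anything in the driver, and the cached instance is trimmed down to keep the example short.

```python
import copy
import xml.etree.ElementTree as ET

QUERY = ET.fromstring("""
<query class-name="SERVER_INTERFACE" scope="subtree">
  <search-class class-name="SERVER_INTERFACE"/>
  <search-attr attr-name="protocol">
    <value>NCP</value>
  </search-attr>
  <read-attr/>
</query>
""")

CACHED = ET.fromstring("""
<instance class-name="SERVER_INTERFACE" src-dn="CN=idv,OU=servers,O=system">
  <association>BD9194F1-001A-2549-5C8A-BD9194F1001A</association>
</instance>
""")

def attach_operation_data(query, cached_instance, api_name):
    """Append <operation-data> holding a copy of the cached instance."""
    op_data = ET.SubElement(query, "operation-data", {"api-name": api_name})
    # Copy rather than move, so the cache itself is left untouched.
    op_data.append(copy.deepcopy(cached_instance))
    return query

attach_operation_data(QUERY, CACHED, QUERY.get("class-name"))
child_tags = [child.tag for child in QUERY]
payload = QUERY.find("operation-data/instance/association").text
```

The operation-data node ends up as the last child of <query>, just like in the trace.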

The engine should return an empty result document for the query, since of course there are no such objects known as SERVER_INTERFACE in eDirectory. But as I read through the trace there does not seem to be any response document. Or at least it is not shown in trace, though there is a status event (not traced out as XML, but when the rules in the Output transform fire, they say they are applying to Status #1).

Regardless of how that little bit of machinery works, there is a response of sorts, and it includes the operation-data node. Then we reach the Output transform, where there is a great rule that simply clones the contents of the operation-data node to a sibling of the current context (the <instance> node, which is where XPATH when used in IDM Policy starts its context), and then strips out the empty response document and cleans up the operation-data node. That leaves us with what looks like a returned value from eDirectory with our results, except of course it was read out of the cache. Pretty cool approach I must say!

<nds dtdversion="3.5" ndsversion="8.x">
<source>
<product version="4.0.0">DirXML</product>
<contact>Novell, Inc.</contact>
</source>
<output>
<instance class-name="SERVER_INTERFACE" src-dn="CN=idv,OU=servers,O=system">
<association>BD9194F1-001A-2549-5C8A-BD9194F1001A</association>
<attr attr-name="interface">
<value type="structured">
<component name="protocol">NCP</component>
<component name="address">172.17.5.111</component>
<component name="port">524</component>
</value>
</attr>
</instance>
</output>
</nds>
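
To show the idea of that Output transform rule in a form you can run, here is a Python sketch. Since the trace does not show the response as XML, the input document here is purely my guess at its shape (a <status> carrying the re-added <operation-data>), and promote_operation_data is a hypothetical helper. The real rule is DirXML Script and may well work differently.

```python
import copy
import xml.etree.ElementTree as ET

# Assumed shape of the response once the engine re-adds operation-data.
RESPONSE = ET.fromstring("""
<nds dtdversion="3.5" ndsversion="8.x">
  <output>
    <status level="success">
      <operation-data api-name="SERVER_INTERFACE">
        <instance class-name="SERVER_INTERFACE" src-dn="CN=idv,OU=servers,O=system">
          <association>BD9194F1-001A-2549-5C8A-BD9194F1001A</association>
          <attr attr-name="interface">
            <value type="structured">
              <component name="protocol">NCP</component>
              <component name="address">172.17.5.111</component>
              <component name="port">524</component>
            </value>
          </attr>
        </instance>
      </operation-data>
    </status>
  </output>
</nds>
""")

def promote_operation_data(doc):
    """Clone operation-data contents into <output>, then drop the wrapper."""
    output = doc.find("output")
    for status in list(output.findall("status")):
        op_data = status.find("operation-data")
        if op_data is None:
            continue  # leave ordinary status documents alone
        for child in list(op_data):
            output.append(copy.deepcopy(child))
        output.remove(status)  # the placeholder response is no longer needed
    return doc

promote_operation_data(RESPONSE)
output_tags = [child.tag for child in RESPONSE.find("output")]
port = RESPONSE.find("output/instance/attr/value/component[@name='port']").text
```

After the call the <output> node holds only the <instance>, which is the same shape as the document shown above.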



Someone was being very clever with this driver, which just makes it much more fun to learn from and work through. Hope you are enjoying it as much as I am. Stay tuned for more in part 5 of this series.
