Data Collection Service Driver Walkthrough - Part 5

With Novell Identity Manager 4.0 there are a number of new features available. You can read more about those features in these articles:

There are four new drivers: two are for newly supported connected systems ( and SharePoint), and two are service drivers needed by the Reporting module.

These two drivers are the Managed System Gateway (MSG) driver, and Data Collection Service (DCS) driver, the first of which you can read about in this series of articles:

The Data Collection Service driver is the second half of that pairing, and will be the topic of this series. Both of these drivers exist so the Reporting module can get enough information about the system to report on it. The MSG driver is focused more on providing information about how the drivers are configured (heck, it even tries to infer the matching rule criteria by reading the rules out of the objects), while the DCS driver is focused more on collecting events about objects for storage in the Reporting database.

In the first article Data Collection Service Driver Walkthrough - Part 1 I started looking at how it builds a cache variable and got through most of the work it does to get the correct IP Address of the server running the Managed System Gateway driver.

In the second article Data Collection Service Driver Walkthrough - Part 2 I finished working through how the cache is built.

In the third article Data Collection Service Driver Walkthrough - Part 3 I looked at how queries out of the cache are handled, and what the filter for this driver sends to the shim.

In the fourth article Data Collection Service Driver Walkthrough - Part 4 I finished up one last rule in the Input Transform that handles some error cases and started discussing the Subscriber channel.

In this article I will work through more of the Subscriber Event Transformation policies.

In the last article I was talking about how the registration cache that is maintained is dirtied when certain things change, specifically the Managed System Gateway (MSG) driver or this driver itself. The policy in the Subscriber event transform flags a variable to dirty the cache, but it also does something interesting that I did not notice in my review of the MSG driver: it actually writes an attribute out to the shim to tell it the cache has been dirtied. I went back and checked, and I did not miss it; the MSG driver does not do this, which is interesting in and of itself! To do this it sets the destination attribute updateRegistration on the MSG driver object, and adds a src-dc XML attribute with the DN of the Managed System Gateway driver.

The last rule in the Subscriber Event Transformation policy set is NOVLIDMDCSB-sub-etp-EnrichEvent. This rule is pretty simple: it tries to maintain the XML attribute cached-time. You may notice that when events start to pile up in Identity Manager (IDM) they will have an XML attribute cached-time. This indicates when the event happened, as opposed to 'now' when it is being processed. This is especially useful for a reporting database, which wants to know about all object changes and, just as importantly, when those objects changed.

This seems like splitting hairs at first, but recall that you might stop a driver, let events pile up for days or weeks, and then when you next start the driver they process out of the queue. That queue is the infamous TAO file of so many discussions. It usually only comes up in passing as part of an article, but this is a good article on how you might see the contents of that cache: Driver Cache Stats in the IDM 3.6 iManager Plugins

Thus the processing of the event 'now' in this context might be heavily skewed from when the event actually happened in eDirectory, if the driver had been down for meaningful amounts of time.

When a modify for an un-associated object comes in, you get what is known as a synthetic add. That is, while the event was a simple modify event, before IDM can modify the object in the connected system (or eDirectory if we were talking about the Publisher channel) it has to create it first, thus we get an add event being synthesized from the modify event. To do this the engine reads back the information from the source system, everything that is in the filter, and then generates a new add event. This is not entirely rebuilt from whole cloth but is pretty close. Some things will carry over, but others will not. As it turns out, the driver authors noticed that one of the things dropped in such an event is the cached-time XML attribute. Thus we have this rule.

It uses an operation property to store the cached-time data and later we will see the policy will set the XML attribute based on the operation data. Operation data is very useful, and as we have seen in the articles on this and the MSG driver, is heavily leveraged to carry payloads around. This is a nice simple example of how you might use it.
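To make the idea concrete, here is a rough model of it in Python. This is not the DirXML Script from the driver, just a sketch of the stash-and-restore pattern; the function names, the plain dict standing in for operation data, and the timestamp format are all my own assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def stash_cached_time(event, op_data):
    """Copy @cached-time from the incoming event into the operation data."""
    cached = event.get("cached-time")
    if cached is not None:
        op_data["cached-time"] = cached


def restore_cached_time(event, op_data):
    """Put @cached-time back on an event that lost it (e.g. a synthetic add)."""
    if event.get("cached-time") is None and "cached-time" in op_data:
        event.set("cached-time", op_data["cached-time"])


# A modify as it might sit in the queue; the timestamp value is made up.
modify = ET.fromstring('<modify class-name="User" cached-time="20110301120000.000Z"/>')
op_data = {}  # hypothetical stand-in for the event's operation data
stash_cached_time(modify, op_data)

# The synthesized add does not carry @cached-time over from the modify.
synthetic_add = ET.Element("add", {"class-name": "User"})
restore_cached_time(synthetic_add, op_data)
print(synthetic_add.get("cached-time"))
```

The point of the design is that the operation data travels with the operation even when the event document itself is regenerated, so it is a safe place to park anything you will want back later.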

Very often on a client's project I will want to email error events to myself to stay on top of the situation. When you do that, you usually react to a status document in the Input Transform. The problem is that there is no source DN or destination DN in that status document. As a consequence, I very often do the exact same thing as this rule, using an operation property to carry the object's DN, and then the operation type. This way, when I get an LDAP error of unwilling to perform in the Active Directory driver, I know it was a modify on the com\acme\users\jsmith object.
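Sketched in Python, the trick looks something like the following. Again this is only a model of the pattern, not real policy: the function names and the dict used for operation data are hypothetical, and a real status document carries more than a level and a message.

```python
import xml.etree.ElementTree as ET


def remember_operation(event, op_data):
    """Subscriber side: stash the object's DN and the operation type."""
    op_data["src-dn"] = event.get("src-dn")
    op_data["operation"] = event.tag


def describe_error(status, op_data):
    """Input Transform side: build an email line for an error status, else None."""
    if status.get("level") != "error":
        return None
    reason = (status.text or "").strip()
    return "{} on {} failed: {}".format(
        op_data.get("operation", "unknown"),
        op_data.get("src-dn", "unknown"),
        reason,
    )


event = ET.fromstring(r'<modify class-name="User" src-dn="com\acme\users\jsmith"/>')
op_data = {}  # hypothetical stand-in for the operation data carried with the event
remember_operation(event, op_data)

# The status that comes back has no DN of its own, only the error text.
status = ET.fromstring('<status level="error">unwilling to perform</status>')
message = describe_error(status, op_data)
print(message)
```

Without the stashed values, the email would only say that something, somewhere, was unwilling to perform.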

One last comment about this rule, and that is a kudos to the author, since they put in the comments a simple one liner, "@cached-time is lost on synthetic adds" which really helps as it tells me much of what I needed to know as to WHY this rule is doing what it is doing. So thank you guys at Novell! I really appreciate when you do this! Keep it up and do it more often please!

That wraps up the Event Transformation policy set. Like many service drivers, this one does not have most of the policy set options populated with rules. This driver actually has some policies on the Subscriber channel, which is interesting. Next up is the Create rule. There is no Matching rule, as there is little sense matching into a reporting database. If the object does not exist, then it is not associated, so create it new. Though even as I say that, the association value is going to be the object GUID, and that would be appropriate to match on. I wonder what is going on there and why they do not bother matching? It would be interesting to test this with a deleted association, to see what happens in the driver and how the shim itself would handle a match.

The Creation policy set here has a couple of required attributes. The first two are kind of odd, since they are actually things that schema pretty much requires and would be very hard to skip. The Create rule requires a GUID and an Object Class. Well, to be fair, schema requires an Object Class and you pretty much cannot generate an object in eDirectory without one. To the point that if you do not have one, it becomes Unknown by default, since you must have a valid object class.

As for GUID, that too is a somewhat mandatory attribute, and if the object does not have one, then something bigger is wrong with your eDirectory. The GUID is not the same sort of mandatory attribute as Object Class, since you cannot really specify a GUID value. Well, it turns out IDM can, so be careful not to sync it, or else you could end up with matched GUIDs, but in general you never set a GUID explicitly. eDirectory is expected to generate it for you in the background. However, since the filter might be modified to block either of these two attributes, and that could cause bad things to happen in the Reporting module's database, maybe requiring them explicitly is not such a big deal.
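The required-attribute check itself is simple enough to model in a few lines of Python. This is only a sketch of the logic as I read it, with the add event reduced to a dict of attribute values; the function names and the sample values are mine.

```python
REQUIRED = ("GUID", "Object Class")


def missing_required(add_attrs):
    """Return the required attribute names that are absent or empty."""
    return [name for name in REQUIRED if not add_attrs.get(name)]


def create_rule(add_attrs):
    """Veto the add when a required attribute is missing, else let it continue."""
    missing = missing_required(add_attrs)
    if missing:
        return ("veto", missing)
    return ("continue", [])


# Sample values are placeholders, not real directory data.
complete = {"GUID": "placeholder-guid", "Object Class": "User"}
filtered = {"Object Class": "User"}  # as if GUID had been removed from the filter

print(create_rule(complete))
print(create_rule(filtered))
```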

Finally there is a filter on DirXML-Resource objects. This is a neat object class added in Identity Manager 3.5 or so, and was initially used for a couple of things like Mapping tables, and ECMA Script objects. Basically the DirXML-Resource class is modified by the DirXML-ContentType attribute, to indicate to Designer which editor to open. Ultimately it just stores raw XML or text, but how that is displayed depends on the content type. This object class supports a number of different DirXML-ContentTypes, like:
Mapping Tables
Single Sign On (SSO) Credential Repository
Single Sign On (SSO) Application
Raw Text
ECMA Script object
DS Object
Package Prompt (new for IDM 4 Packages)
Filter Extension (new for IDM 4 Packages)
Entitlement Configuration

This is a nice model as it means new object types can be easily added by just writing a user interface to manage the underlying XML.

It is the last type of Resource that this driver is interested in, the Entitlement Configuration. With the release of Roles Based Provisioning, an additional abstraction layer was added on top of Entitlements. Now we have Resources, which are usually mapped one to one with Entitlements. As a friend said, "Entitlements are for computers, Resources are for people" (Thanks Mike W.!). There needs to be a mapping of the various entitlements a driver maintains to resources, and to understand what entitlements the driver provides, the RBPM module expects a specific object in each driver to be available. The later driver configurations generate this object on every driver restart, so it is always up to date. You can read more about the policies that generate this object in the following article: Converting Entitlements to Resources, more details

That article was written to explain what is happening in the policies, whereas John Dasilva and Volker Scheuber had posted an article telling you HOW to add the policies to your driver. Two different perspectives, equally important! You need to understand what is going on, and why it is doing that, before you can do any meaningful troubleshooting.

As this object reflects the current state of Entitlements within a driver, and this would be of great interest to the reporting module, you can see why the driver is eventing on this object class. Of course it would be wasteful to pass in events from all the various different objects that share the same object class but do totally unrelated things. Thus we veto DirXML-Resource objects if DirXML-ContentType is not available or is not equal to text/vnd.novell.idm.entitlementConfiguration+xml.
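As a quick model of that veto test in Python: only DirXML-Resource objects are examined, and they survive only when the content type matches. I am assuming here that the content type string is written with "+xml" at the end, and the non-matching content type in the example is a made-up placeholder.

```python
# Assumed spelling of the content type, with "+xml" as the suffix.
ENTITLEMENT_CONFIG = "text/vnd.novell.idm.entitlementConfiguration+xml"


def should_veto(class_name, content_type):
    """Veto DirXML-Resource events unless they are Entitlement Configurations."""
    if class_name != "DirXML-Resource":
        return False  # this particular test only concerns DirXML-Resource
    # A missing content type (None) is vetoed just like a non-matching one.
    return content_type != ENTITLEMENT_CONFIG


print(should_veto("DirXML-Resource", ENTITLEMENT_CONFIG))
print(should_veto("DirXML-Resource", "text/x-some-other-type"))  # placeholder type
print(should_veto("DirXML-Resource", None))
print(should_veto("User", None))
```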

Personally I think that watching Mapping table data changing would be interesting to report on, specifically if you had configuration data in such a table. After all most of the time Mapping tables are used to hold configuration data, like perhaps placement data. For example, a User with a Department value of Finance, might thus be placed in a Finance container. Perhaps into a Finance group as well.

I was at a client who had a series of 170 rules to handle all the cases of some entitlement value. Each one tested if the value equals X, then added the user to group Y. Worst of all, it was a single valued attribute. I explained, and showed sample code, how to do it in 2 rules. Two rules because one rule is needed to add to a group and the second to remove them from the old group, though I could have done it all in one if I had chosen to do so.
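The shape of the two-rule approach is easy to show in Python, even though the real thing would be a mapping table lookup in policy. This is a sketch only: the table contents, the group DNs, and the function name are invented for illustration.

```python
# Hypothetical mapping table: attribute value -> group DN.
MAPPING = {
    "Finance": "cn=Finance,ou=Groups,o=acme",
    "Sales": "cn=Sales,ou=Groups,o=acme",
}


def group_changes(old_value, new_value, table):
    """Return (groups_to_add, groups_to_remove) for a single valued attribute change."""
    add = table.get(new_value)
    remove = table.get(old_value)
    if add == remove:
        return [], []  # same group either way, nothing to do
    return ([add] if add else []), ([remove] if remove else [])


# "Rule one" is the add half, "rule two" is the remove half.
to_add, to_remove = group_changes("Finance", "Sales", MAPPING)
print(to_add, to_remove)
```

Adding a 171st case then means adding a row to the table, not writing another rule.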

That sort of thing would be good to track in Reporting. When changes to such a mapping table are made, you would need to know that to understand group or OU placement rules at any specific time, as well as the user's current state, to get the full picture. This is probably an enhancement request though.

A fun twist on that approach is to have a driver watch for any changes to the mapping table object, and if there are any, to reevaluate all the placements or memberships. That is a really classy approach, such that when you update the mapping table it goes out and fixes all users to match. Much trickier to write, but really useful if this is going to happen. An example where I did that was in a GroupWise driver, where you can set GroupWise client settings via the driver. We used groups to apply these settings in Active Directory. The group needed the XML definition in the Info node (yes I know, this is a SQL injection style security hole) and if one of these groups were to change, we would take the member list and go reset the values to the new ones, thus keeping everything in sync nicely. Much easier than having to go fix it by hand when you decide to push a setting change. Of course you could just remove and re-add the Users to the group as well, but that is more manual than we wanted.
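The reevaluation step can be modeled roughly like this in Python. It is a sketch of the idea, not what we actually deployed: the user list, department values, group names, and function name are all hypothetical, and a real driver would query the directory for the affected users rather than hold them in a dict.

```python
def reevaluate(users, old_table, new_table):
    """Return the users whose mapped group changed, with what to add and remove."""
    fixes = {}
    for user, value in users.items():
        before = old_table.get(value)
        after = new_table.get(value)
        if before != after:
            fixes[user] = {"add": after, "remove": before}
    return fixes


# Hypothetical data: user -> Department value, and two versions of the table.
users = {"jsmith": "Finance", "bjones": "Sales"}
old_table = {"Finance": "FinanceGroup", "Sales": "SalesGroup"}
new_table = {"Finance": "AccountingGroup", "Sales": "SalesGroup"}

print(reevaluate(users, old_table, new_table))
```

Only the users touched by the changed row get fixed up, which matters when the table change event fires against a tree with many thousands of users.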

Anyway, that rounds up the Create Rule. On to the Command Transformation rule in the next article.

