Common Mistakes Newcomers to IDM Make - Part 6

Novell Identity Manager is a big product with lots of moving parts. It can be quite daunting to get started working with it. Just as bad, the deeper you dig, the more complexity you discover. Now this is not necessarily a bad thing, it is just something you have to deal with. There is a learning curve.

This series is my attempt to lower that curve. First off, I have an article listing what you should read to get over the first hump in the learning curve. If you have not already read it, take a few minutes and do so; it will pay off tenfold as you work through IDM: FAQ: Getting started with IDM, what should you read?

On top of the learning curve, there are some reasonably well known common problems that beginners in the product will run into. I figured that trying to write them down can only help. If nothing else, Google ought to find it, if you search for it.

Thus I started this series, with the first article that you can read here: Common Mistakes Newcomers to IDM Make - Part 1

In that article I covered the concepts:

  • Using the forums

  • Basic articles to read to understand the engine

  • Default DN formatting

  • Verb vs Noun tokens in Argument Builder

  • Time Conversions

  • Built in variables (Local, global, and error)

  • Active Directory driver and SSL

In the second article, Common Mistakes Newcomers to IDM Make - Part 2 I covered examples of:

  • Getting a driver started (migration)

  • Designer vs iManager versions of the project

  • Case sensitivity issues when specifying shim class

  • Move token, specify destination container

  • The Attribute tokens

In the third article Common Mistakes Newcomers to IDM Make - Part 3 I covered examples of:


  • Using the Remote Loader instead of Local

  • Restarting eDir after a code change

  • Tracing to a file per driver

In the fourth article Common Mistakes Newcomers to IDM Make - Part 4 I covered examples of:

  • Per Replica attributes

  • DirXML-Associations

In the fifth article Common Mistakes Newcomers to IDM Make - Part 5 I covered examples of:

  • Variable interpolation with $ vs $$ vs ~~ vs XPATH

  • Regular Expressions

On to part 6!

For this article, I wanted to talk about the first of two concepts in the filter that seem simple but are, to be fair, pretty poorly documented: Optimize Modify and Merge Authority.

Optimize Modify is quite a useful feature, and is based on an interesting feature of eDirectory. eDirectory is a single lock writer database, which means only one thread can write to the eDirectory database at any time. This seems like it would be a huge bottleneck in terms of performance, and to some degree it is.

However, it is nowhere near as bad as you might initially think. After all, we are now living in an era of laptops with 2, 4, or 8 cores and seriously multithreaded applications. (I mean heck, the PS3 game system has 7 CPUs that can run different threads). And we all know that disk has not progressed anywhere near as fast as CPU performance and speed, nor RAM performance and speed. (On a side note, Designer loads REALLY fast off an SSD, and I would not give up my Solid State Disk for just about anything right now! It is the best investment in peace of mind you can make on your work or personal computer! An SSD is so much faster than a spinning disk, it is totally night and day!)

So limiting eDirectory with potentially billions of objects (millions is much more common these days than you would think!) to a single write at a time seems quite painful. But this is not so bad, since after all eDirectory is mostly read (LDAP Search for example) events instead of Write events. In fact it is hugely unbalanced in terms of reads and writes in any typical installation. If you have a use case that is lopsided the other way, it may be that eDirectory is NOT the best choice for you!

Identity Manager runs inside eDirectory's memory space (hence, on 32 bit eDirectory, the 2GB process limit that the combined eDir DIB cache size and Java heap size have to fit within) and thus is very dependent upon eDirectory. As you can imagine, the performance of Identity Manager is quite dependent upon the write performance of eDirectory.

Even worse, Identity Manager is one of those things that does write to the database quite often. The designers of Identity Manager were quite clever folk, and they implemented a number of things to try and make this less of a problem. They could not fix the single writer lock issue, but they could minimize its effect.

As a bizarre accommodation of this process, if you have a multi valued attribute and it has more than 25 values in any single instance of the attribute, then you will get a system generated index. This is because of how eDirectory works when adding a new value. It has to check whether there is any existing value of the same attribute that matches, and if there is, it throws an error: -614, DUPLICATE VALUE.

You can read more about this in the article: System Generated Indexes in eDirectory

This article is interesting, since you cannot delete system generated indexes, and if you ever wondered why eDirectory was suddenly making these system generated indexes on you, it can help explain it.

Thus every time you write to an attribute that has a value, a query is actually done by eDirectory to validate this value is not already there. The overhead in doing that, once you cross a threshold of 25 values was considered by the engineers working on eDirectory to be greater than the cost of maintaining an index, when compared to the performance benefit the index would provide.
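As a rough mental model of that duplicate check, here is a toy Python sketch. It is not how eDirectory actually stores or indexes data; the class and method names are invented for illustration, and only the -614 error code and the threshold of 25 come from the text above. It contrasts a linear scan of the existing values with a lookup in an index that has to be maintained on every write:

```python
# Toy model only: eDirectory's real storage engine is not implemented this way.
DUPLICATE_VALUE = -614  # eDirectory error code mentioned above
INDEX_THRESHOLD = 25    # value count after which an index gets created


class DuplicateValueError(Exception):
    def __init__(self, attr, value):
        super().__init__(f"{DUPLICATE_VALUE}: duplicate value {value!r} on {attr}")
        self.code = DUPLICATE_VALUE


class MultiValuedAttribute:
    """Hypothetical stand-in for one multi valued attribute."""

    def __init__(self, name):
        self.name = name
        self.values = []   # existing values, in insertion order
        self.index = None  # created once the threshold is crossed

    def add_value(self, value):
        # Every write starts with a query: is this value already there?
        if self.index is not None:
            exists = value in self.index   # cheap lookup, index must be kept current
        else:
            exists = value in self.values  # linear scan, cost grows with value count
        if exists:
            raise DuplicateValueError(self.name, value)
        self.values.append(value)
        if self.index is not None:
            self.index.add(value)          # the ongoing cost of maintaining the index
        elif len(self.values) > INDEX_THRESHOLD:
            self.index = set(self.values)  # more than 25 values: build the index
```

The point of the sketch is only the trade-off: below the threshold the scan is cheap enough, and above it the extra bookkeeping of an index costs less than scanning on every write.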

This is additionally relevant in the case of Optimize Modify, since at some level, this extra work is what Optimize Modify is trying to prevent from happening.

When an attribute is flagged as Optimize Modify and a modify event for the attribute comes through the engine on its way to eDirectory, the engine looks at the current state of the object in eDirectory, compares to the change being proposed, and decides if there is actually any work to do.

This sort of event can come from the Publisher channel of most drivers, or from the use of the source attribute tokens (Add or set value) in the Subscriber channel.

This is actually relied upon to a great extent by some drivers (SAP HR, and some SOAP services come to mind) where it seems like every event on the Publisher channel is the entire object, which might be 20 or 30 different attributes. When the event is being submitted to eDirectory, the engine checks the current values in eDirectory, compares them to the proposed changes in the event document, and reports the actual changes needed. The result can be that a single attribute out of 30 is changing, that every attribute on the object is being modified, or, more confusingly, that NOTHING is being changed.

In that latter case you will see in trace a message that Optimize modify discard transaction or the like, which is telling you that in the end, nothing happened.
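The comparison described above can be sketched in a few lines of Python. This is a simplified illustration, not the engine's actual code: the function name and the dictionary representation of an object are assumptions, and the real engine works on add-value and remove-value operations in an XML event document rather than on whole value sets.

```python
def optimize_modify(current, proposed):
    """Return only the proposed changes that would actually alter the object.

    current:  dict of attribute name -> set of values now in eDirectory
    proposed: dict of attribute name -> set of values the event wants to set
    (Both shapes are hypothetical, chosen to keep the sketch short.)
    """
    needed = {}
    for attr, new_values in proposed.items():
        if current.get(attr, set()) != set(new_values):
            needed[attr] = set(new_values)
    return needed


current = {"Surname": {"Smith"}, "Given Name": {"Jane"}, "Title": {"Analyst"}}

# A driver that sends the whole object on every event:
event = {"Surname": {"Smith"}, "Given Name": {"Jane"}, "Title": {"Manager"}}
changes = optimize_modify(current, event)        # only Title differs

# The same object sent again with nothing different:
no_op = optimize_modify(current, dict(current))  # empty, so nothing to write
```

An empty result corresponds to the discard case: there is no work to do, so the event stops there.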

Probably the most common example of that happening is with the Active Directory driver, when a password is changed. The Active Directory driver has a very confusing pattern of behavior, which happens because Active Directory does not track who the last modifier was when something changes in Active Directory. This is unlike eDirectory which has some operational attributes (aka you cannot write to them, the system manages them) like creatorsName and modifiersName for who created and who last touched the object. (You can imagine that trying to write to modifiersName would be silly, especially if YOU try to change it to someone else. Well that is wrong, since you are the last modifier, trying to change it to a different last modifier...)

Thus every event sent to Active Directory via the driver on the Subscriber channel will loop back on the Publisher channel a few seconds later. This is because Active Directory cannot tell that the driver just made the change, and so cannot discard it. (This is something the Subscriber channel on all drivers will do. If your driver writes something to eDirectory on the Publisher channel, it will not generate an event on that same driver's Subscriber channel. This is good, or else you could get into all sorts of horrible loops. But it has a negative effect: you might need to make a change in DriverA's Pub channel and have DriverB's Sub channel watch for it to tweak the event further, so that DriverA can then respond on its Sub channel. The benefits of loopback protection very much outweigh the downsides.)

Well, this loopback event in Active Directory is usually handled out of the box by Optimize Modify. If the attributes are bidirectionally synchronizing, this just works. A change is made in eDirectory by something (maybe HR? Maybe some other system) and it is sent to Active Directory, and then a few seconds later Active Directory sends it back to eDirectory on the Publisher channel.

Well, as it is ready to write to eDirectory, the engine does a quick query, checks the current values, and realizes there is no work needed to be done, and the event loop dies right there. Phew. This would get really hairy quickly otherwise.
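To make that round trip concrete, here is a minimal Python simulation of it. Everything in it is an assumption for illustration (plain dictionaries standing in for the two directories, a made-up publish_to_edir function, a single valued phone number); it only shows why the echoed event finds nothing left to do.

```python
def publish_to_edir(edir_obj, event, optimize=True):
    """Apply a Publisher channel event; return True if a write happened."""
    changes = {attr: value for attr, value in event.items()
               if not optimize or edir_obj.get(attr) != value}
    if not changes:
        return False  # nothing left to do: the loop ends here
    edir_obj.update(changes)
    return True


edir_user = {"Telephone Number": "555-0100"}
ad_user = {}

# Subscriber channel: a change made in eDirectory is sent to Active Directory.
edir_user["Telephone Number"] = "555-0199"
ad_user["telephoneNumber"] = edir_user["Telephone Number"]

# A few seconds later Active Directory reports the same change back.
echo = {"Telephone Number": ad_user["telephoneNumber"]}
wrote = publish_to_edir(edir_user, echo)  # False: the echo is discarded

# With the comparison switched off, the echo would be written again.
wrote_unoptimized = publish_to_edir(dict(edir_user), echo, optimize=False)
```

In this toy version the unoptimized write is harmless, but it is an extra write against a single writer database, and for something like a password it would be a second change rather than a no-op.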

As I noted, in the case of passwords this happens on every change. You will notice this if you watch trace in an Active Directory driver on password changes.

The entire password change process out of eDirectory is itself complicated and you would be advised to consider reading these articles that try to explain what is happening, and specifically focusing on how nspmDistributionPassword is used to do password changes:

But this step of Optimize Modify nulling out the looping back change is a critical part, specifically in the password case, where a requirement of uniqueness could quickly compromise everything if Optimize Modify was not available. That is, you would loop it back and write it to eDir, which would error with an NMAS error of: 16048 0xFFFFC153 NMAS_E_ENTRY_EXISTS or the like, since this password has been used before (why, just a few seconds ago) and is not a unique password change.

So if this is all so great and useful, benefiting performance, why is there any confusion? Well as I said above, when an attribute simply flows in both directions on both channels, (Pub Sync and Sub Sync flags in the filter) it is pretty straightforward. Where you get into trouble is when an attribute is being modified in the flow, and will be different in each system.

Thus perhaps you have a multi valued attribute in eDirectory (let's say Telephone Number) and you are mapping that to a single valued attribute in Active Directory, perhaps by flattening it (concatenate all the values into one? Comma separate them? Take only the first?). When it gets sent to Active Directory it no longer looks the same, so you need to figure out some way to make sure that this compare does not happen (perhaps by not Pub synchronizing it? Perhaps by splitting it back into a multi valued attribute in your ITP?). But you can see how this adds complexity to the process.
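Here is a small Python sketch of that flattening problem. The comma separator, the function names, and the sample phone numbers are all assumptions for illustration; in a real driver this logic would live in policy, not Python.

```python
SEPARATOR = ","  # assumed delimiter; pick one that cannot appear in a value


def flatten(values):
    """Subscriber side: many eDirectory values -> one Active Directory value."""
    return SEPARATOR.join(values)


def split(flat):
    """Publisher side (for example in the ITP): one value -> many values."""
    return [v for v in flat.split(SEPARATOR) if v]


edir_values = ["555-0100", "555-0199"]
ad_value = flatten(edir_values)

# Compared as-is, the echoed value does not match what eDirectory holds,
# so the comparison sees a real change.
looks_changed_raw = {ad_value} != set(edir_values)

# Split back into separate values first, and the comparison finds no work to do.
looks_changed_split = set(split(ad_value)) != set(edir_values)
```

Without the split, the looped back event would look like a genuine modification and the flattened string would be written into the multi valued attribute in eDirectory, which is exactly the situation you want to design around.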

There is actually a slightly more advanced and complicated downside as well. If you have say a loopback driver that actually lets the data loop through, you need to have an actual, non-optimized change allowed through, if you want it to get past the Publisher Event Transform.

If Optimize Modify decides that nothing has changed, the event will not get passed to the Publisher Command Transformation rule. The problem with this is that if you are expecting to process the event in the Publisher's Command Transform, you do not get the opportunity.

Again, this can be worked around by picking some attribute in the filter and turning off Optimize Modify to get it past the filter, and then dealing with the consequences of the various possible values in policy. I.e., if you turn off Optimize Modify on an attribute, the responsibility for ensuring that the attribute value being set does not currently exist rests upon your policy.

The filter in Identity Manager turns out to be very powerful and, as you can see, some of the simple seeming items have a great deal of effect on the end result.

Now, having complicated the issue, let's try to simplify it. In general, Optimize Modify is a good thing! Leave it on, unless you have a need to turn it off.

When you are modifying attributes in the flow that synchronize in both directions it is important to consider optimize modify as part of the process.

The good news is that Optimize Modify's results will be shown in DSTrace so you can see it happen. Event discarded due to optimization is one possible message, and you will see it looking at eDirectory to decide what to do based upon Optimize Modify.

If you are not familiar with DSTrace, please look at part 3 of this article series: Common Mistakes Newcomers to IDM Make - Part 3

And then consider reading this great series on DSTrace by a guy at Novell Technical Services, which is really the best reference on DSTrace I have seen.

That's about it for now in this article. If you have any ideas for other things that have confused you as you get started with Identity Manager, let me know (Comment on any of my articles, send a message via Cool Solutions, post in the forums, whatever) and if I think I can write something useful about the topic, I will do it.

I have one item left on my list of ideas for this series, describing what Merge Authority means. The help on it is really lousy, and it would be good to see it properly written out.

Please pass on any additional ideas!

