Novell Identity Manager is a very complex product and monitoring it has always been a challenge.
You can approach monitoring it in so many different ways that it is hard to find a nice generic solution that just works. If you search the Cool Solutions community for Cool Tools you will find several monitoring ideas and solutions. Here are some I found in a moment's searching.
You can see that some are standalone applications that you run, some are plugins for existing monitoring tools like Nagios. In principle you could use SNMP to monitor many things about the drivers via the SNMP interface to eDirectory as well.
What is nice is that this gives you a fair bit of flexibility in terms of making it work with your existing management infrastructure. If you are a Nagios shop, then the choice is obvious: look at the plugins. If you are an HP OpenView shop, look for SNMP ways to approach the problem.
The Nagios plugins actually lead into the point I am aiming for. They are basically wrappers that call dxcmd, the DirXML Command toolkit. This is a Java tool included on all platforms that Novell Identity Manager runs on. (Since Identity Manager runs as a Java application in the eDirectory memory space, it is safe to say it is available on all platforms that eDirectory runs on as well. I like to summarize that as every platform that matters except Mac servers. BSD people, go find someone else to bother!)
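To make the wrapper idea concrete, here is a minimal sketch in Python of the shape such a plugin might take. This is not the code of any of the actual plugins. The Nagios exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are standard, but the numeric driver states and the idea that dxcmd hands the state back as its exit code are assumptions on my part that you should verify against your own version. The function names are made up for illustration.

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# ASSUMPTION: numeric driver states as reported by dxcmd. Verify these
# against the documentation for your Identity Manager version.
DRIVER_STATES = {0: "stopped", 1: "starting", 2: "running", 3: "stopping"}


def nagios_status(driver_state):
    """Map a numeric driver state to a (Nagios exit code, message) pair."""
    name = DRIVER_STATES.get(driver_state)
    if name is None:
        return UNKNOWN, f"UNKNOWN - unrecognized driver state {driver_state}"
    if name == "running":
        return OK, "OK - driver is running"
    if name == "stopped":
        return CRITICAL, "CRITICAL - driver is stopped"
    # Starting and stopping are transitional, so only warn.
    return WARNING, f"WARNING - driver is {name}"


def check_driver(command):
    """Run a dxcmd command line (a list of arguments) and translate the result.

    ASSUMPTION: the command reports the driver state as its exit code.
    """
    try:
        result = subprocess.run(command, capture_output=True, timeout=60)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return UNKNOWN, f"UNKNOWN - could not run dxcmd: {exc}"
    return nagios_status(result.returncode)
```

The dxcmd command line is deliberately left to the caller, since I do not want to assert specific options here. A real plugin would print the message and exit with the returned code so Nagios can pick it up.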
Dxcmd is menu driven (simplistic but functional menus) and lets you see the state of all your drivers, stop or start them, and much more. Its most powerful function is examining the contents of the cache (aka the TAO file), but only while the driver is stopped. The APIs require this, alas. It would be super cool if they updated the APIs to allow this while the driver is running, but right now you will get a 641 error if you try with the driver running. There are some nice articles on this topic already at:
Alas, the interface for viewing events does not lend itself to easy use. You basically have to walk through the cache file, event by event, unless you know the exact offset, which is pretty much impossible to calculate.
This means that although it is 'possible' to look at the cache contents, it is basically impractical on any scale.
Well, one point I missed, because I had not actually tried it yet, was the Identity Manager Dashboard. Boy, did I miss a big one there!
This is most obvious when you restart eDirectory while you still have the plugins open in a browser window. While the web page still responds to clicks, it has to wait to reconnect before it can display any data for you.
Anyway, with the new plugins, there is a new icon at the far right of the toolbar across the top of the page. It shows the Identity Manager view, which hides the side panel that is normal in iManager. Initially I thought this was silly, since I usually need to pop back and forth between items in iManager, so why hide the bar? It turns out we need the screen real estate to show all the Identity Manager stuff, so I guess it makes sense.
There is a new option called Identity Manager Dashboard.
Look at the image for a second and I expect you will be pleased.
This image shows the drivers in the Driverset I selected to monitor. For each driver it shows a green or red light for its current status. It also shows us the TAO file, which I spent a fair bit of time explaining how to determine in a previous article, http://www.novell.com/communities/node/2206, in a much easier view. Even better, this basically makes my previous article useless, since all the details and more are shown in this view.
We get to see the current size of the TAO file. What is kind of neat is that it shows us the absolute size of the TAO file (which is all we would be able to see by looking via the file system) but, more importantly (and previously unavailable), it also shows us how much of that absolute size is currently unprocessed. As I have discussed in previous articles, the TAO file grows as event after event is appended to the end of the file, and it does not shrink until the entire file is completely processed. So it is possible to have a 10 MB TAO file with just one event left in it, one that is perhaps stuck or still processing. The file does not shrink along the way because reducing its size is a much harder programming task, and it is probably not worth the effort and CPU time when the nature of an Identity Manager event is to finish fairly quickly and empty out of the TAO.
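The difference between the absolute size and the unprocessed size is easy to see in a toy model. The sketch below is only my illustration of the behavior described above, not the real TAO file format or engine code. The class name is made up, and resetting the size to zero once the queue drains is a simplifying assumption.

```python
class CacheFileModel:
    """Toy model of an append-only event cache (NOT the real TAO format)."""

    def __init__(self):
        self.file_size = 0   # bytes on disk, which is all the file system shows
        self.pending = []    # sizes of the events not yet processed

    def append(self, event_size):
        """A new event is always written to the end of the file."""
        self.pending.append(event_size)
        self.file_size += event_size

    def process_next(self):
        """Process the oldest event; the file shrinks only when none remain."""
        if not self.pending:
            return
        self.pending.pop(0)
        if not self.pending:
            # ASSUMPTION: the file resets to empty once fully processed.
            self.file_size = 0

    @property
    def unprocessed(self):
        """Bytes still waiting to be processed."""
        return sum(self.pending)


cache = CacheFileModel()
for _ in range(10):
    cache.append(1_000_000)      # ten events of 1 MB each
for _ in range(9):
    cache.process_next()         # nine of them complete

print(cache.file_size, cache.unprocessed)   # 10000000 1000000
cache.process_next()                        # the last event finally completes
print(cache.file_size, cache.unprocessed)   # 0 0
```

With nine of ten events done, the file system still reports the full 10 MB, while only 1 MB is actually waiting.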
What this view shows us is how much is left to go in the file, which honestly is all we really care about. This is great, as previously we had no way of finding this out. Next it shows us the events in the queue, listed by class of event. Technically there are more event types than are listed in the view, but the reality is that the ones shown are the big, most common ones, and the rest get lumped under Custom. That is perfectly fine, since things like Trigger are relatively rare.
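The per-type tally with a catch-all bucket can be sketched in a few lines of Python. The set of types listed individually below is my assumption for the sake of the example, since the real Dashboard may show a different set. The function name is hypothetical.

```python
from collections import Counter

# ASSUMPTION: the event types the Dashboard lists individually.
# The real set may differ.
DISPLAYED_TYPES = {"add", "modify", "delete", "rename", "move"}


def summarize_queue(event_types):
    """Count queued events per type, lumping everything else under 'custom'."""
    return Counter(t if t in DISPLAYED_TYPES else "custom" for t in event_types)


queue = ["add", "modify", "modify", "trigger", "delete", "modify", "trigger"]
summary = summarize_queue(queue)
print(dict(summary))   # {'add': 1, 'modify': 3, 'custom': 2, 'delete': 1}
```

The two trigger events end up in the custom bucket, which matches how the rarer event types are handled in the view.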
What I do notice missing, and am curious about, is modify-password. Technically, modify-password is rarely an event in a TAO file. If you read the article http://www.novell.com/communities/node/1474/password-transformation-rule-sets you will see that a password change in eDirectory is almost always a modify event on the Subscriber channel for the attribute nspmDistributionPassword, which the Command Transform rules in most drivers see and convert to a modify-password event. Coming in the other way, on the Publisher channel, an application sends a modify-password event via the driver shim (remote or local), and the Command Transform rules see that, check the configuration set via Global Configuration Values, and process it into a modify event on nspmDistributionPassword. So I suppose modify-password is never really seen in eDirectory, and thus would never be sitting in the TAO cache file.
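To illustrate the two conversions, here is a rough Python sketch that works on simplified event documents. The real work is done by policy rules in the Command Transform, not by Python. The element and attribute names are my simplified recollection of what the events look like, so treat them as assumptions. The Global Configuration Value checks are left out entirely, and both function names are made up.

```python
import xml.etree.ElementTree as ET

PASSWORD_ATTR = "nspmDistributionPassword"


def to_modify_password(modify):
    """Subscriber direction: turn a <modify> of the password attribute into
    a <modify-password>. Returns None if the event carries no new password."""
    for attr in modify.findall("modify-attr"):
        if attr.get("attr-name") == PASSWORD_ATTR:
            value = attr.find("add-value/value")
            if value is None:
                return None
            out = ET.Element("modify-password", dict(modify.attrib))
            ET.SubElement(out, "password").text = value.text
            return out
    return None


def to_modify(modify_password):
    """Publisher direction: turn a <modify-password> into a <modify> of the
    password attribute."""
    out = ET.Element("modify", dict(modify_password.attrib))
    attr = ET.SubElement(out, "modify-attr", {"attr-name": PASSWORD_ATTR})
    value = ET.SubElement(ET.SubElement(attr, "add-value"), "value")
    value.text = modify_password.findtext("password")
    return out


# ASSUMPTION: a simplified event, not a complete real-world document.
event = ET.fromstring(
    '<modify class-name="User">'
    '<modify-attr attr-name="nspmDistributionPassword">'
    '<add-value><value>s3cret</value></add-value>'
    '</modify-attr></modify>'
)
converted = to_modify_password(event)
print(ET.tostring(converted, encoding="unicode"))
```

Either way, what sits on the eDirectory side is the plain modify of nspmDistributionPassword, which is consistent with modify-password not having its own slot in the view.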
Regardless, it is a very powerful view into what is going on in your Identity Manager solution. There are two menus you can see, Refresh and Actions. Actions allows you to modify the display by hiding Disabled drivers. (Disabled drivers are pretty boring since they are not really caching events, so why show them? It is nice having an option to easily hide them from the view.) You can also control the number of columns. My example driver set has a lot of drivers, 30 plus, so I set it to show 4 columns across. (I have a 1920x1200 screen, nyah nyah! Go hi res or go home!) The default view is two columns wide, which is probably more appropriate for a typical screen. Hopefully as the snapins continue to mature we will get more options. I personally would like to see an option to hide stopped drivers, but that can be a dangerous option. That is, if you are relying on this to see your environment and stopped drivers are hidden, then you will not easily see when a driver stops inappropriately.
The other menu is Refresh, which allows you to set the page to update at regular intervals. Being a web page, of course, it needs to handle refresh events differently than a local application would.
Where this is most powerful is when you have a reasonable number of drivers in your Identity Manager solution, and there is a complex flow of events.
Let's say your JDBC driver gets events from your HR system and modifies a couple of object classes. Then you have a Loopback driver watching for some of those events (let's say generating a random password when a new user is created from HR, and sending it in an email to the user, or maybe his manager, or maybe the helpdesk). Then once the password is set, someone adds an entitlement for some service, based on some attribute HR set, so some other driver needs to act on it.
Sometimes you know that a backlog in the Loopback driver means passwords for new users are not being set, and a backlog in the JDBC driver means something else is stuck. This way, you get a single page view to see how many events are backed up and which driver has what stuck.
This is a great tool to give to the Helpdesk, so that they can get a feel for where the slowdown in the processing of events might be happening, and it gives you a place to start troubleshooting.
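If you wanted to act on the same numbers yourself, the idea boils down to something like the sketch below. The driver names, the counts, and the threshold are all invented for illustration. How you would actually get the pending counts out of the system is not shown.

```python
def backed_up(pending_by_driver, threshold=100):
    """Return (driver, pending) pairs at or above the threshold, worst first."""
    hits = [(name, n) for name, n in pending_by_driver.items() if n >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical snapshot of pending events per driver.
pending = {"JDBC": 12, "Loopback": 4500, "Entitlements": 230}

for name, count in backed_up(pending):
    print(f"{name}: {count} events waiting")
```

Here the Loopback driver floats to the top, which in the example flow would point at new-user passwords not being set.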
Depending on the complexity of your Identity Manager solution, that may be too much to expect of a helpdesk, but even so it provides a high level dashboard that has been sorely lacking for Identity Manager until now. I would personally like to thank the iManager snapin team for adding this truly excellent addition to our toolkit!
As useful as this is, coming soon with Identity Manager 3.6 are some even cooler features that are exposed via the new snapins but will require the newest engine to work. One of those features is cache browsing via the snapins, alas still requiring a stopped driver, which will hopefully be much more practical than using dxcmd is now. As I look at the image of the Dashboard, I realize I did not slide in any drivers that, with an IDM 3.5.1 engine, show the new health semaphore, which looks like a traffic light. With Actions, a new feature coming in 3.6, we will be able to define actions to take when a driver goes into a Red or Yellow state, such as restarting the driver, sending an email, or whatnot. Very powerful for maintaining a complex, multi-driver system. No doubt there are many more features that I have yet to run across, and I am greatly looking forward to the coming release of 3.6.
Oops, I realized later that with an IDM 3.5.1 engine server you can actually already use the iManager snapins to view the cache contents (still requires a stopped driver) on an event by event basis. Very cool!