Code Map refresh constantly restarts
IDM Engine 4.7.2
Our User Application code map refresh restarts repeatedly and fails more often than it completes. We have quite a few objects in our AD environment that are checked during the refresh. One suggestion from MF was to increase the LDAP timeout in the /etc/opt/novell/edir location. That seems to have helped, but we are still seeing multiple code map refreshes. Today alone we have had 17 restarts of the code map refresh. Is anyone else seeing this kind of behavior?
How many objects is "quite a few"? I've seen code map refresh work fine with 50K+ objects, and fail completely with 80K+ objects. I haven't dug into it recently, but it used to have a hard timeout of 10 minutes from the time UserApp sent the query request until it gave up and stopped waiting for the reply to come back.
I'd wager over 80K. We have tried changing the timeout in eDir. The refresh seems to restart repeatedly and then either fails or completes. It used to get stuck "in progress", but changing the timeout helped. It also seems to start at random in the middle of the morning. We specifically restarted the refresh at night, but it doesn't seem to adhere to the 24-hour interval it used to.
What David had to do when he ran into a query that took too long was gather the data into the vault ahead of time (the options are a large XML blob in an attribute, or representative objects), then intercept the CMR query and answer it with the data already there. Much simpler, and ready to go.
So you run your Job when you want, with cron-like scheduling, and gather all the data in advance. Then CMR only has to read simple things, which it can do in less than the timeout value.
In the UI, you can adjust the timeout on how long CMR runs for. That applies across all entitlement-enabled drivers. The other timeout, which you can't adjust, is how long UserApp waits for a CMR query / reply to be processed by any one driver. That was ten minutes, IIRC.
Also, you could only schedule UserApp to CMR every X hours. It starts by doing one on startup, then every X hours thereafter. I found that inconvenient.
What you'd see happen was UserApp would send the CMR query down the driver's subscriber channel. Then it would wait. The driver would query the domain, and get back 80K+ group objects. While the driver was working through this big glob of XML, UserApp would time out and stop waiting. So by the time the driver actually returned the results, UserApp no longer was there, so nothing would happen. Then it would have to do it again.
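To make that restart loop concrete, here is a rough Python model of it. It is only a sketch: the ten-minute figure is the "IIRC" number from this thread, and the function name and per-attempt durations are made up for illustration, not taken from UserApp.

```python
# Rough model of the restart loop: UserApp waits a fixed time per attempt,
# and an attempt only counts if the driver replies within that window.
USERAPP_WAIT_SECONDS = 10 * 60  # "ten minutes, IIRC" per the thread; not verified


def simulate_refresh(driver_seconds_per_attempt, wait_seconds=USERAPP_WAIT_SECONDS):
    """Return (restarts, completed) for a sequence of refresh attempts.

    driver_seconds_per_attempt: hypothetical time the driver needs on each
    attempt to query the domain and process the returned XML.
    """
    restarts = 0
    for driver_seconds in driver_seconds_per_attempt:
        if driver_seconds <= wait_seconds:
            # Reply arrived while UserApp was still waiting.
            return restarts, True
        # Reply arrived after UserApp gave up, so the work is thrown away.
        restarts += 1
    return restarts, False
```

With made-up durations of 900, 720 and 540 seconds, `simulate_refresh([900, 720, 540])` gives two restarts before a completion. If every attempt takes longer than the wait, it never completes no matter how often it restarts.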
Most of the time consumed here was in processing the returned XML from the domain.
So, like Geoff says, I replaced it.
Step one was catching a CMR in trace so I could see what it did. Step two was replacing the CMR schedule with a Job. That allowed me a real cron-like schedule to say when it was actually allowed to do this.
Step three was replacing UserApp's CMR query with my own.
So I took the Job query and used it to build an XML blob on an object. Then when UserApp did its query, I replaced it, and used the saved XML blob to reply more or less immediately.
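The shape of that idea, sketched in Python rather than in policy: a scheduled step serialises the pre-gathered values into one XML string, and the query step answers from that string without touching the domain. The element names (`cached-codemap`, `instance`, `attr`, `value`), the attribute name, and the `vault` dictionary are stand-ins chosen for illustration. They are not what the actual driver or package stores.

```python
import xml.etree.ElementTree as ET

# Stand-in for "an attribute on an object" in the vault (hypothetical).
vault = {}


def job_build_blob(groups):
    """Scheduled step: turn (dn, description) pairs into one XML blob."""
    root = ET.Element("cached-codemap")
    for dn, description in groups:
        inst = ET.SubElement(root, "instance",
                             {"class-name": "Group", "src-dn": dn})
        attr = ET.SubElement(inst, "attr", {"attr-name": "Description"})
        ET.SubElement(attr, "value").text = description
    vault["cmr-blob"] = ET.tostring(root, encoding="unicode")


def reply_from_blob():
    """Query step: parse the stored blob and return its instances."""
    blob = vault.get("cmr-blob")
    if blob is None:
        return []  # the Job has not run yet, so there is nothing to return
    return list(ET.fromstring(blob).iter("instance"))
```

The slow part (walking 80K+ groups) happens once, whenever the Job is scheduled. The reply path only parses a string that is already there.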
Intercepting the CMR query is actually quite easy. You catch it in the OTP, change the target to the __driver_identification_query__, which always returns fast (it is built into the shim code), and add an op-property with your identifier.
Then in the ITP you see the __driver_identification_result__ event with your op-property. You clone the data you need into that event by XPath, let it pass out of the ITP, and UA gets the CMR result it wants.
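Here is the tag-and-intercept flow modelled in Python, purely to show the control flow. It is a simplified sketch and not DirXML Script: the documents are flattened compared to real XDS, the marker name `cmr-intercept` is invented, and the `__driver_identification_query__` string is copied from the post above, so check the real name in a trace before relying on it.

```python
import copy
import xml.etree.ElementTree as ET

MARKER = "cmr-intercept"  # hypothetical op-property name
CHEAP_TARGET = "__driver_identification_query__"  # name as given in the thread


def otp_intercept(query, is_cmr_query):
    """Outbound side: swap the expensive query for a cheap, tagged one."""
    if not is_cmr_query(query):
        return query  # leave every other query untouched
    cheap = ET.Element("query", {"class-name": CHEAP_TARGET, "scope": "entry"})
    ET.SubElement(cheap, "operation-data", {MARKER: "true"})
    return cheap


def itp_fill(result, cached_blob):
    """Inbound side: if the reply carries the tag, clone cached data into it."""
    op_data = result.find("operation-data")
    if op_data is None or op_data.get(MARKER) != "true":
        return result  # not our reply, e.g. the ident query at driver startup
    for inst in ET.fromstring(cached_blob).iter("instance"):
        result.append(copy.deepcopy(inst))
    result.remove(op_data)  # strip the tag before the reply goes back
    return result
```

The tag is what keeps the two halves paired. Without it the inbound side could not tell the swapped CMR reply apart from an ordinary reply to the same cheap query.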
I do this all the time for custom entitlements, since often I do not want to query the destination for the data.
Oh I know, I have this implemented in a Package for eDir Groups in the IDV as entitlements, i.e. it makes IDV groups available as entitlements to grant via a resource.
I'm pretty sure that I didn't use the driver identification query. I didn't want to reply to that on driver startup. I think I replaced the CMR query with a query for the domain root object (from GCV) / object class. Probably with some op-data to tag it for recognition on the reply.
Watching the Java heap balloon as the driver tries to process CMR query and reply was kinda fun. Attach a Java debugger to the JVM and have a look. (I wonder if I still have the images from that somewhere.)
Same difference. What you do is tag the query with op-data so you can tell it is your CMR query to intercept, whether it targets the domain root as you did or the Driver ID query. It is a question of which you consider faster, and since Driver ID is internal to the shim code, with no query to the connected system, it is nice and easy to use.
It doesn't matter what you query for. Anything works as long as you can catch the query to tag it, and then catch the tagged reply to intercept it.