Troubleshooting OSP in IDM 4.5 - Part 1


Updated Dec 19, 2023: fixed a link that pointed at an old version of the community site.

With the release of the gemstone projects for IDM 4.02, and then built into IDM 4.5, NetIQ changed the method for logging in to the Identity Applications. The Identity Applications are generally considered to be the set of:

IDM Reporting (/IDMRPT)
IDM User App (/IDMProv)
IDM Home and Provisioning Dashboard (/landing and /dash)
IDM Catalog Administrator (/rra)
Self Service Password Reset (/sspr)
Access Review (/ar)

The common front end to manage logins for Single Sign On (SSO) is now NetIQ's One SSO Provider (OSP). OSP seems to have been developed originally to do SAML federation more simply than the entire infrastructure of NAM (NetIQ Access Manager). The Cloud Access, Mobile Access, and Social Access appliances, which are all federation-based products, use OSP for that purpose. With the update to the IDM front-end web interfaces, NetIQ decided to stay consistent on the SSO component. NAM is a much more capable solution that can do far more than just federation, so if you have NAM, do not think OSP is here to replace or supplant it. NAM and OSP can federate to each other (Reverse Proxy is a bit more problematic; a colleague of mine is working on getting that figured out).

In principle this means OSP should be very good at doing federation over SAML, since Cloud Access is by definition all about federation over SAML. In practice there appears to be a limitation in place that makes it harder than it needs to be. OSP also supports a Kerberos method for SSO, so that your local workstation's Active Directory login can provide you a ticket to SSO into OSP-protected applications. As I was spelunking through the configuration, I think I found where the overall OSP configuration is set and, if I read it right, the limitation is imposed purely by configuration. But without knowing the real configuration options, it would be hard to try changing those settings.

Regardless of the history, OSP is the way forward. As with any technology change we need to know what things look like when they are working so it is easier to spot when things are not working. Also seeing examples of what is not working and what to do next can be very helpful. I have tried doing articles like this for various IDM drivers over the years and that seems to be helpful. Let me thus try it for an OSP configuration as well.

You install OSP when you install the IDM Identity Apps, or when you install SSPR. Interestingly SSPR has its own login front end, and if you just drop the sspr.war file in the Tomcat deploy directory it will handle front end authentication on its own. The added benefit is that it integrates with the OSP SSO model, if that is of utility to your environment.

OSP does an LDAP bind against your directory, and my first issue was failing to log in. I had installed what I thought should work and got to the OSP login page, but every account I tried failed. Well, I know it uses LDAP to log in (because I had to specify and configure it during the installation), so let's look at the LDAP side of eDirectory.

I went to the eDirectory server specified as the LDAP source on port 8028/8030 (http/https), logged into eDirectory, went to DSTRACE, and enabled the LDAP flag. I tried a login, and dstrace showed:

12:29:35 705F7700 LDAP: Connection 0xe674700 closed
12:29:57 8FAE8700 LDAP: New cleartext connection 0xe674700 from, monitor = 0x8ead8700, index = 1
12:29:57 6D4C3700 LDAP: ( DoBind on connection 0xe674700
12:29:57 6D4C3700 LDAP: ( Bind name:cn=admin,ou=sa,o=system, version:3, authentication:simple
12:29:57 6D4C3700 LDAP: ( Rejecting unencrypted bind on cleartext port in nds_back_bind, err = 13
12:29:57 6D4C3700 LDAP: ( Sending operation result 13:"":"" to connection 0xe674700
12:29:57 8EAD8700 LDAP: Monitor 0x8ead8700 found connection 0xe674700 socket closed, er

Well there you go, I was making an LDAP clear text bind to eDirectory, which was of course configured to not allow those, since they are a potential security risk. The error "Rejecting unencrypted bind on cleartext port in nds_back_bind, err = 13" is the giveaway.
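If you have a longer trace saved to a file, a small filter can pull out which bind names were rejected this way. This is only a sketch: it assumes the trace is in the same text format as shown above, and the function name rejected_cleartext_binds is made up for illustration. It simply remembers the most recent bind name, so on a busy server with interleaved connections it could pair the wrong name with a rejection.

```shell
# Print the bind DN for every simple bind that eDirectory rejected
# because it arrived unencrypted on the cleartext port.
# Reads DSTRACE LDAP output on stdin (same format as the trace above).
rejected_cleartext_binds() {
    awk '
        /Bind name:/ {
            name = $0
            sub(/.*Bind name:/, "", name)   # drop everything up to the DN
            sub(/, version:.*/, "", name)   # drop the trailing attributes
        }
        /Rejecting unencrypted bind on cleartext port/ { print name }
    '
}
```

You would feed it the saved trace on stdin, for example rejected_cleartext_binds < /tmp/ldap-trace.log, where the file name is just a placeholder.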

Ok, that should be easy to fix, tell OSP and SSPR to use encrypted LDAP for talking to eDirectory. That has to be the most common configuration since passwords are involved. How hard can it be?

Now in theory, I should have (and could have) used to make this change. But I actually had two boxes setup. One running OSP, SSPR, and the full set of Identity Apps (minus Reporting). The second server had OSP and SSPR installed. But I was curious to see where the information was actually stored.

Now at the end of my experience I would say for sure, try and use since I also found a number of files buried all around that had IP information inside them. Interestingly, most of the OSP stuff looks like it used the variables defined higher up instead of hard coding paths, which was clever.

I was looking in this file:


And found these lines in it.

com.netiq.idm.osp.ldap.use-ssl = true
com.netiq.idm.osp.ldap.port = 636
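When there are several copies of these settings around, it helps to read a key back out of each file and compare. The helper below is a sketch under the assumption that the file uses the plain "key = value" layout shown above; get_prop is a hypothetical name, and the file path is passed in as an argument because it depends on your installation.

```shell
# Print the value of one key from a Java-style "key = value" properties file.
# Usage: get_prop FILE KEY
get_prop() {
    awk -v key="$2" '
        /^[ \t]*[#!]/ { next }              # skip comment lines
        {
            pos = index($0, "=")            # split on the first "=" only
            if (pos == 0) next
            k = substr($0, 1, pos - 1)
            v = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)  # trim whitespace around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)  # trim whitespace around the value
            if (k == key) print v
        }
    ' "$1"
}
```

Running it against each file that carries the LDAP settings, with com.netiq.idm.osp.ldap.port as the key, is a quick way to see whether they agree.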

Also looks like it is stored in this file the same way:


Next up was a common LDAP issue, since the login still failed after I made this change. I needed to get the tree CA's public key into the JVM keystore as a trusted CA source. (This is the -trustcacerts switch in keytool.) The JVM ships, like your browser, like Windows (well, technically that is Internet Explorer's keystore), like .NET, and other devices, with a whole bunch of well-known certificate authorities. This way, if a certificate is signed by one of these well-known CAs, you will trust the signing CA out of the box.

Alas, LDAP from eDirectory, https:// from iMonitor, KMO for SSL on Remote Loader connections, and other SSL services offered by eDirectory use certificates signed by the eDirectory Tree Certificate Authority. When you install the first server in your eDirectory tree, the CA is created and of course every tree is special with a unique CA certificate. (Every tree is special, every tree is precious... Anyone else imagining the horrible Monty Python skit?)

Thus you need to make sure that the various JVMs being used all trust your tree's CA. (You know there is more than one. Come on, one JVM on a machine? Ha! Like that will ever happen. Though IDM 4.5 does a pretty good job of trying.)

By now, you should be familiar with this. When you set up a Remote Loader, you probably had to get the tree CA's public key, so that your SSL connection would work. Same process.

The easiest way I know is to open iManager, connect to the tree, go to Novell Certificate Access, choose any certificate in the list that is not expired, and select it via the tick box. Then from the menu select Export. Choose the Organizational CA from the two choices. (The other is actually the SSL certificate's private and public key. You can tell, since it offers the possibility of exporting the private key. That is the giveaway that you are on the wrong cert.) Save the file as both DER and Base64, since you will sometimes need one and sometimes the other. (DER is the Base64-decoded version of the B64 file, and the converse.)
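Once you have a few exported files lying around in /tmp, it is easy to lose track of which encoding each one is. The check below is a rough sketch, assuming the Base64 export carries the usual "-----BEGIN CERTIFICATE-----" header line; cert_format is a hypothetical helper name, and anything without that header is simply reported as DER without further validation.

```shell
# Report whether a certificate file is Base64 text or binary DER.
# Base64 exports are assumed to start with a "-----BEGIN CERTIFICATE-----"
# header line; anything else is treated as DER here.
cert_format() {
    if head -n 1 "$1" | grep -q '^-----BEGIN CERTIFICATE-----'; then
        echo base64
    else
        echo der
    fi
}
```

This only looks at the first line, so it tells you the encoding, not whether the certificate inside is the right one.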

Once you have the CA's public key, it is easy to import. Keytool is your friend here. Learn to use keytool as you will be using it a LOT with OSP.

You should use keytool from the JVM you are running your apps from (Though the 1.7 JVM keytool allows for new features that 1.6 JVM supports). For OSP, that is /opt/netiq/idm/apps/jre/bin/keytool but I will just type keytool in my examples for simplicity.

keytool -keystore /opt/netiq/idm/apps/jre/lib/security/cacerts -storepass changeit -import -trustcacerts -file /tmp/cert.der 

The well known password on cacerts is 'changeit' and if that is not it, try 'default'. What else is new? Before you get upset, realize that the thing is, this keystore has no real secrets, only public keys, so who cares if you can get at the contents.
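One thing worth knowing about the command above: with no -alias given, keytool stores the certificate under its default alias, mykey, so a second import done the same way will collide with the first. A small wrapper that always takes an alias, and then lists the entry to confirm it landed, avoids that. This is a sketch, not the official procedure; import_tree_ca and the KEYTOOL variable are names I made up, and the alias you pass in is your choice.

```shell
# Import a CA certificate into a Java keystore under an explicit alias,
# then list that alias to confirm the entry exists.
# Usage: import_tree_ca KEYSTORE CERT_FILE ALIAS [STOREPASS]
# Set KEYTOOL to the full path of the keytool you want to use
# (for example the one under the JVM that runs your apps).
import_tree_ca() {
    keystore=$1
    cert=$2
    ca_alias=$3
    pass=${4:-changeit}   # falls back to the well known cacerts password
    "${KEYTOOL:-keytool}" -import -trustcacerts -noprompt \
        -alias "$ca_alias" -file "$cert" \
        -keystore "$keystore" -storepass "$pass" &&
    "${KEYTOOL:-keytool}" -list -alias "$ca_alias" \
        -keystore "$keystore" -storepass "$pass"
}
```

For the OSP JVM that might look like import_tree_ca /opt/netiq/idm/apps/jre/lib/security/cacerts /tmp/cert.der treeca, where treeca is just an example alias. Note that -noprompt skips the "Trust this certificate?" question, so check you are importing the right file first.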

As a side note, to get OSP working with SAML you can read my article on getting it working with Shibboleth as a SAML IDP, "Configuring IDM 4.5s OSP to talk to a Shibboleth IDP". There I mention that you need a private key in a second keystore (since THAT is an actual secret worthy of protection) and the public key from the IDP's metadata imported with trustcacerts. Thus you are pretty likely to need to use keytool a fair bit.

Once I imported the tree CA into the cacerts file and restarted the application ( /etc/init.d/idmapps_tomcat_init restart ), the LDAP error went away.

OSP redirect to logout URL:

In this case after I got the previous errors sorted out, every time I went to I got redirected to the logout URL. That seemed a bit odd. Luckily I had two servers I had installed this stuff on, and one seemed to work, one failed this way so I could compare and figure it out.

Started looking around on the good server and noticed:

In the path /opt/netiq/idm/apps/tomcat/webapps/sspr

[root@netiq sspr]# ll
total 24
drwxrwxr-x 2 novlua novlua 4096 Feb 4 14:22 META-INF
drwxrwxr-x 7 novlua novlua 4096 Feb 4 14:23 WEB-INF
drwxrwxr-x 2 novlua novlua 4096 Feb 4 14:22 config
-rw-rw-r-- 1 novlua novlua 1731 Sep 26 06:59 index.jsp
drwxrwxr-x 3 novlua novlua 4096 Feb 4 14:22 private
drwxrwxr-x 5 novlua novlua 4096 Feb 4 14:22 public

That is the deploy directory for the sspr.war. Looks good on this server. Now let's check the bad server, where instead I see:

[root@netiq2 sspr]# ll
total 4
drwxrwxr-x 3 novlua novlua 4096 Feb 12 12:10 public

Ya, that does not look good at all. It does not look like Tomcat deployed the WAR file for SSPR. All that is in that public folder is my customized skin work.
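The difference between the two listings boils down to whether the exploded directory has its WEB-INF and META-INF folders. A quick test for that can be sketched as below; looks_deployed is a hypothetical name, and treating those two folders as the sign of a complete deploy is my assumption based on the good server's listing.

```shell
# Succeed if DIR looks like a fully exploded web application,
# i.e. it contains both WEB-INF and META-INF directories.
# Usage: looks_deployed DIR
looks_deployed() {
    [ -d "$1/WEB-INF" ] && [ -d "$1/META-INF" ]
}
```

For example: looks_deployed /opt/netiq/idm/apps/tomcat/webapps/sspr || echo "sspr does not look deployed".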

If I go up one directory, one obvious difference is root owns the two WAR files.

total 42956
drwxrwxr-x 3 novlua novlua 4096 Feb 6 10:51 ROOT
drwxrwxr-x 6 novlua novlua 4096 Feb 12 12:10 osp
-rwxrwxr-x 1 root root 11520213 Feb 12 12:10 osp.war
drwxrwxr-x 3 novlua novlua 4096 Feb 12 12:10 sspr
-rw-rw-r-- 1 root root 32451797 Feb 12 12:10 sspr.war

That is easy enough to fix. Change the owner and group to the novlua user and novlua group.

[root@netiq2 webapps]# chown novlua:novlua osp.war
[root@netiq2 webapps]# chown novlua:novlua sspr.war

Delete the sspr and osp deploy directories and restart Tomcat. Then it started working again. Some of this was because I was applying some patches by copying them in as root and I forgot to fix the permissions.
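Since this is an easy mistake to repeat every time you patch as root, a check for stray ownership in the webapps directory can save some head scratching. The following is a sketch; owner_of and wrong_owner are made-up helper names, and novlua is simply the owner seen in the listings above, so pass whatever user runs Tomcat on your system.

```shell
# Print the owner name of a file, taken from the third column of "ls -ld".
owner_of() {
    ls -ld "$1" | awk '{ print $3 }'
}

# List entries directly under DIR whose owner is not EXPECTED_USER,
# one per line as "<owner> <path>".
# Usage: wrong_owner DIR EXPECTED_USER
wrong_owner() {
    for entry in "$1"/*; do
        [ -e "$entry" ] || continue      # skip the unexpanded glob in an empty dir
        owner=$(owner_of "$entry")
        [ "$owner" = "$2" ] || printf '%s %s\n' "$owner" "$entry"
    done
}
```

Running wrong_owner /opt/netiq/idm/apps/tomcat/webapps novlua on the bad server would have listed the two root-owned WAR files. It only looks one level deep, so it will not catch bad ownership further down inside an exploded directory.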

One thing I said above is worth mentioning. I noted that since it did not look right I was able to guess what might be wrong. But how did I know what is right and what is wrong? That is why I like articles like this, so I can show you what I think is right, and you can see how it differs from your environment. They get dated quickly as versions change, but with some thought, this can still help.

Continuing on that thread, how do you know SSPR is working? What should it look like? With Tomcat as the web application server, the output from User App, SSPR, and other logging is in the catalina.out file from Tomcat.

That file is usually found as /opt/netiq/idm/apps/tomcat/logs/catalina.out.

As of the initial release of IDM 4.5 there is a bug in SSPR that logs everything, no matter how you set the logging level, which is nice to see, but probably more than you want to see on a regular basis.

There is lots to see in the log, and it is mostly readable and understandable.

2015-02-12 13:04:44,784 [localhost-startStop-1] INFO  org.apache.catalina.startup.HostConfig- Deployment of web application archive /opt/netiq/idm/apps/tomcat/webapps/sspr.war has finished in 19,432 ms

There will be a line like this for each WAR file that gets deployed. 19 seconds seems like a long time, but as you read it, you realize that the first time SSPR loads it compiles an indexed list of words in the password exclusion list.
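To get a feel for what normal looks like on your own server, you can pull the deploy time for every WAR out of the log in one pass. This is a sketch that assumes the log line format shown above and paths without spaces; deploy_times is a hypothetical name.

```shell
# Print "<path> <milliseconds>" for each finished deployment in a Tomcat log.
# Reads catalina.out on stdin; assumes paths contain no spaces.
deploy_times() {
    awk '/Deployment of web application/ && /has finished in/ {
        ms = $(NF - 1)
        gsub(/,/, "", ms)       # strip the thousands separator: 19,432 -> 19432
        print $(NF - 5), ms
    }'
}
```

For example: deploy_times < /opt/netiq/idm/apps/tomcat/logs/catalina.out. Comparing the numbers between a working server and a suspect one is often more telling than any single value.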

2015-02-12 13:05:12,280 [localhost-startStop-1] INFO  org.apache.catalina.startup.HostConfig- Deployment of web application directory /opt/netiq/idm/apps/tomcat/webapps/ROOT has finished in 3,136 ms

ROOT.war is a sort of empty WAR that renders the page when you go to the server with no directory in the URL. You see an Apache Tomcat info page. Interestingly, if you need to host a file, you can drop it into the ROOT folder in the webapps path referenced above.

2015-02-12 13:05:12,283 [main] INFO  org.apache.coyote.http11.Http11Protocol- Starting ProtocolHandler ["http-bio-8080"]
2015-02-12 13:05:12,310 [main] INFO org.apache.coyote.http11.Http11Protocol- Starting ProtocolHandler ["http-bio-8443"]
2015-02-12 13:05:12,318 [main] INFO org.apache.coyote.ajp.AjpProtocol- Starting ProtocolHandler ["ajp-bio-8009"]
2015-02-12 13:05:12,332 [main] INFO org.apache.catalina.startup.Catalina- Server startup in 129598 ms

This 129 seconds is a lot but it was the first load, and all the work that needs to be done the first time is more than a normal restart.
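Because catalina.out keeps growing across restarts, it can be handy to list every startup time it contains and see how the first load compares with later ones. A minimal sketch, assuming the "Server startup in N ms" format shown above; startup_times is a made-up name.

```shell
# Print the millisecond value from every "Server startup in N ms" line,
# oldest first. Reads catalina.out on stdin.
startup_times() {
    sed -n 's/.*Server startup in \([0-9][0-9]*\) ms.*/\1/p'
}
```

For example: startup_times < /opt/netiq/idm/apps/tomcat/logs/catalina.out | tail -n 5 shows the last five restarts.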

On my bad server, the log showed a start in 771ms, which is far too fast to be normal and was another giveaway.

Tried copying in the WAR file from other server, still a 692ms deploy. But then I noticed in the catalina.out:

2015-02-12 13:21:58,755 [ContainerBackgroundProcessor[StandardEngine[Catalina]]] INFO  org.apache.catalina.startup.HostConfig- Undeploying context [/sspr]
2015-02-12 13:21:59,376 [ContainerBackgroundProcessor[StandardEngine[Catalina]]] ERROR org.apache.catalina.startup.ExpandWar- [/opt/netiq/idm/apps/tomcat/webapps/sspr/public/resources/themes/acme] could not be completely deleted. The presence of the remaining files may cause problems

Remember that this was probably caused by the root ownership of the WAR files, so the delete during the undeploy did not work. So I deleted the sspr directory and recopied the file in, which causes Tomcat to redeploy it, and now I see:
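That ExpandWar error is worth searching for directly, since it names exactly which path Tomcat could not clean up. A sketch of such a search, assuming the message format shown above; undeploy_leftovers is a hypothetical name.

```shell
# Print each path Tomcat reported it could not completely delete
# while undeploying. Reads catalina.out on stdin.
undeploy_leftovers() {
    sed -n 's|.*\[\(/[^]]*\)] could not be completely deleted.*|\1|p'
}
```

For example: undeploy_leftovers < /opt/netiq/idm/apps/tomcat/logs/catalina.out | sort -u gives you the list of directories to inspect for ownership or permission problems.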

2015-02-12 13:22:51,219 [localhost-startStop-3] INFO  org.apache.catalina.startup.HostConfig- Deploying web application archive /opt/netiq/idm/apps/tomcat/webapps/sspr.war
2015-02-12 13:22:57,538 [localhost-startStop-3] WARN password.pwm.config.ConfigurationReader- configuration settings have been modified since the file was saved using the Configuration Editor
2015-02-12 13:22:57,566 [localhost-startStop-3] DEBUG password.pwm.PwmApplication- successfully initialized default console log4j config at log level INFO
2015-02-12T13:22:57Z, INFO , pwm.PwmApplication, created directory /opt/netiq/idm/apps/tomcat/webapps/sspr/WEB-INF/logs

And you can see it load the config:

2015-02-12 13:23:39,789 [SSPR-E7D76C4259077A52-ContextManager timer] DEBUG password.pwm.ContextManager- configuration file was loaded from /opt/netiq/idm/apps/tomcat/webapps/sspr/WEB-INF/SSPRConfiguration.xml

That is about it for now. You can see there are lots of fun errors, and how some of them are pretty understandable. I hope this helps someone out when they run into these problems. I have been collecting these messages and have more to share in future articles in this series.

