Getting Started with OSP - Part 3

In the first two articles in this series I discussed some of the basics of getting OSP installed and configured so that the various web components of NetIQ Identity Manager work. OSP is the front end authentication module that supports name/password, Kerberos, and SAML federation. Once you are logged in via OSP, the various web modules use OAuth tickets issued by OSP to ensure you are properly authenticated.

To get OSP working, there are a number of issues that are not entirely obvious but very important. You need to have the URL perfectly correct. You need the proper certificates trusted in all the proper locations. There are two keystores: the main JVM keystore called cacerts, and the OSP keystore you specify in configupdate.sh when you configure IDM/OSP. Both of those should trust the eDirectory Tree CA so LDAP works. Both of those should trust the OSP public key. (The OSP keystore has the private key, so it of course trusts itself.) To trust the public key you either trust the public key of the cert directly (if self-signed) or the certification chain (the Certificate Authority (CA) that signed it, and any intermediate CAs as well).
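As a rough sketch of that trust setup, the JDK's keytool can import a CA certificate into each of the two keystores. The helper name trust_cert, the DRY_RUN switch, the alias edir-ca, and the certificate file name are assumptions made up for illustration; the keystore paths in the comments are the ones that show up in the OSP startup log later in this article, and your passwords will be whatever you chose at install time.

```shell
#!/bin/sh
# Sketch: import one CA certificate into a keystore with the JDK keytool.
# trust_cert and DRY_RUN are illustrative names, not part of OSP or IDM.
trust_cert() {
    keystore=$1; storepass=$2; cert_alias=$3; certfile=$4
    # Build the keytool command line as the positional parameters.
    set -- keytool -importcert -noprompt -trustcacerts \
        -alias "$cert_alias" -file "$certfile" \
        -keystore "$keystore" -storepass "$storepass"
    if [ -n "${DRY_RUN:-}" ]; then
        # Show the command instead of running it, with the password masked.
        printf '%s\n' "$*" | sed 's/-storepass .*/-storepass ********/'
    else
        "$@"
    fi
}

# Example calls (hypothetical alias, file, and password variables):
# trust_cert /opt/netiq/idm/apps/osp_sspr/osp/osp.jks "$OSP_PASS" edir-ca /tmp/edir-ca.der
# trust_cert /opt/netiq/idm/apps/jre8/lib/security/cacerts "$CACERTS_PASS" edir-ca /tmp/edir-ca.der
```

Running it once with DRY_RUN=1 set is a cheap way to check the paths and alias before touching either keystore.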

There is another important detail that I have discussed before in articles here:
https://www.netiq.com/communities/cool-solutions/configuring-idm-4-5s-osp-talk-shibboleth-idp
https://www.netiq.com/communities/cool-solutions/troubleshooting-osp-idm-4-5
https://www.netiq.com/communities/cool-solutions/troubleshooting-osp-idm-4-5-part-2
https://www.netiq.com/communities/cool-solutions/troubleshooting-osp-sspr-part-3
https://www.netiq.com/communities/cool-solutions/troubleshooting-sso-user-application-4-02/

In the olden days, User Application would ask for a username and password at login, in its own login box, and would then authenticate you both to the User Application session and to eDirectory in the back end, so that all queries done in forms could happen with your permissions. This was a clever idea, since it meant that eDirectory privileges could be used to control what is seen in web forms. You already have the users and the permissions model, so why not leverage them?

However, when User Application started allowing SSO (in, I think, version 4.0 or 4.01), it required a different way to authenticate the user to eDirectory, since there was no longer a username/password available to do the login. The fairly clever solution they came up with was to use an NMAS module that supports SAML. That is, eDirectory is configured to trust User Application, and if User Application says this guy is who he says he is (since OSP logged him in via username/password, Kerberos, or SAML to some other IDP), then eDirectory will take User Application's word for it.

This is a bit confusing as now you potentially have a web of trust going on here.

OSP trusts the SAML IDP (Identity Provider), so the user logs in to an external web site (NAM, Shibboleth, something else) and the IDP vouches for the user to OSP.

User Application trusts OSP (which trusts the SAML IDP), so when the user goes to User Application or another web module, OSP vouches for the user to User Application.

eDirectory trusts User Application (which trusts OSP, which trusts the SAML IDP), so when the user logs in to User Application they also log in via SAML to eDirectory, and User Application vouches for the user to eDirectory.

As you can see, there is an interesting web of trust here, and in order to debug issues that arise you need to understand how all these trusts work.

Probably the most important thing to understand here is that each of those trusts has a timeout value. If they are different, then you are at the mercy of the shortest timeout to bring down the entire edifice. See my other articles listed above for notes on where each timeout is usually configured.
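To make the "shortest timeout wins" point concrete, here is a tiny sketch that takes a set of name=seconds pairs and prints the one that will expire first. The helper name shortest_timeout and all of the names and values passed to it are made up for illustration; they are not defaults from any product.

```shell
#!/bin/sh
# Sketch: given name=seconds pairs, print the pair with the smallest value,
# that is, the timeout that will end the session first.
shortest_timeout() {
    for pair in "$@"; do
        # Emit "<seconds> <original pair>" so sort can order by the number.
        printf '%s %s\n' "${pair#*=}" "$pair"
    done | sort -n | head -n 1 | cut -d' ' -f2
}

# Hypothetical values, in seconds:
shortest_timeout osp=3600 userapp=1200 edir-saml=1800
# prints: userapp=1200
```

In this made-up case the session would die after 20 minutes no matter how generous the other two settings are.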

My recommendation, if you are installing IDM with SAML federation for login via OSP, is to first install the basics for OSP: Java, Tomcat, and OSP with SSPR. Then configure OSP and get SAML federation working. Once you have done that, you are ready to install User Application and the other web applications.

The reason I do it in this order is that there are a myriad of little things that can go wrong, and it is easiest to get OSP running and patched with as little else installed as possible first. User Application adds a whole new level of complications to troubleshoot, and it is much easier to troubleshoot OSP without it there, at first. For example, every time you restart Tomcat (which you will be doing many times while you troubleshoot, after each change, to test if it is working), with only OSP and SSPR installed it takes only a second or two to restart. With User Application, Dash, Landing, Catalog Access, and maybe Reporting installed, it can take 10-20 seconds to restart, making everything slower to troubleshoot.

If you want to see some good examples of troubleshooting OSP, check out the articles I listed above:
https://www.netiq.com/communities/cool-solutions/configuring-idm-4-5s-osp-talk-shibboleth-idp
https://www.netiq.com/communities/cool-solutions/troubleshooting-osp-idm-4-5
https://www.netiq.com/communities/cool-solutions/troubleshooting-osp-idm-4-5-part-2
https://www.netiq.com/communities/cool-solutions/troubleshooting-osp-sspr-part-3

In those examples I worked through the issues needed to get Shibboleth working with OSP and showed a bunch of error messages I ran into and explained the issues.

At the OSP level, there is some additional logging that can be enabled, but it takes a Tomcat restart to go into effect. Look in the /opt/netiq/idm/apps/tomcat/bin directory and there is a setenv.sh script file. This sets a bunch of system variables that OSP uses. As part of the file, it specifies the start commands that will be used to start Java and Tomcat for OSP and all the other web applications installed. Tomcat allows you to pass in parameters when you start the program. (Well, maybe it is actually Java that allows you to pass in parameters that Tomcat can read.)

There is a parameter with the value of INFO that needs to be set to a higher trace level.

The bottom of that file looks mostly like this. You can see a bunch of environment settings are managed here in one place, which is quite nice. First up is JAVA_OPTS, which is how you specify the amount of RAM to use for Java. At the end of that line is a MaxPermSize setting, which as of JVM 1.8 is no longer supported (the permanent generation was removed), so you will see a warning every time you restart Tomcat that the option is being ignored. If this matters to you, just remove the -XX:MaxPermSize setting entirely; it is no longer needed in JVM 1.8.

JAVA_OPTS="-Xms1024m -Xmx1024m -XX:MaxPermSize=512m "
export JAVA_OPTS
export CATALINA_OPTS="-Dcom.netiq.ism.config=/opt/netiq/idm/apps/tomcat/conf/ism-configuration.properties -Dcom.netiq.osp.ext-context-file=/opt/netiq/idm/apps/osp_sspr/osp/osp-conf.jar -Dcom.netiq.idm.osp.logging.level=INFO -Dcom.netiq.idm.osp.client.host=corp-idmadm101.mlb.org -Dcom.netiq.idm.osp.tenant.logging.naudit.enabled=false -Dcom.netiq.idm.osp.logging.file.dir=${CATALINA_BASE}/logs -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8 -Djavax.xml.transform.TransformerFactory=org.apache.xalan.processor.TransformerFactoryImpl -Didmuserapp.logging.config.dir=/opt/netiq/idm/apps/tomcat/conf/ -Dextend.local.config.dir=/opt/netiq/idm/apps/tomcat/conf/"


In the CATALINA_OPTS section there is a logging level setting that ships as INFO. Changed to ALL, it looks like this:
-Dcom.netiq.idm.osp.logging.level=ALL

This is the one that controls how much OSP logs. If you look at the Apache definition of trace levels for log4j (https://logging.apache.org/log4j/2.x/manual/customloglevels.html), you will see that the order, from least to most verbose, is basically:

OFF
FATAL
ERROR
WARN
INFO
DEBUG
TRACE
ALL

OSP ships with it set to INFO, which already logs a fair amount and is sometimes useful. ALL is such crazy logging it is unbelievable, but at times it is helpful. It will enumerate every single certificate in the known keystores it uses (and the JVM comes with 90 or more certificates), which seems like a waste of space until you are troubleshooting whether it was able to find the certificate you included or not. In that case, it is very helpful indeed.
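If you would rather script the change than hand-edit the file, something along these lines should work. This is only a sketch: set_osp_log_level is a hypothetical helper of my own naming, it assumes the setting appears in the file exactly as shown above, and it keeps a .bak copy next to the original so you can back the change out.

```shell
#!/bin/sh
# Sketch: rewrite the OSP logging level inside a setenv.sh-style file.
# set_osp_log_level is a hypothetical helper, not something OSP ships.
set_osp_log_level() {
    file=$1; level=$2
    # Only accept the log4j level names listed above.
    case $level in
        OFF|FATAL|ERROR|WARN|INFO|DEBUG|TRACE|ALL) ;;
        *) echo "unknown level: $level" >&2; return 1 ;;
    esac
    # Keep a backup, then write the edited copy back over the original
    # (redirecting into the file preserves its owner and permissions).
    cp -p "$file" "$file.bak" || return 1
    sed "s/\(-Dcom\.netiq\.idm\.osp\.logging\.level=\)[A-Z]*/\1$level/" \
        "$file.bak" > "$file"
}

# Example call (Tomcat still needs a restart afterwards):
# set_osp_log_level /opt/netiq/idm/apps/tomcat/bin/setenv.sh ALL
```

Remember to set it back to INFO the same way once you are done troubleshooting.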

Once you do that, you start seeing much more information in the OSP log file. This file is a sibling of catalina.out in the /opt/netiq/idm/apps/tomcat/logs directory and rolls every day with a new date stamp so you can see what happened in the past. What I am trying to say is that while I call the file name osp.log, it is really named:
osp-idm.2016-07-21.log

and the archived ones (logrotate seems to be rolling and compressing them) would look like:
osp-idm.2016-07-20.log.160721101307.archive.gz
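Since the name changes every day, a couple of small helpers can save some typing. This is a sketch only: osp_log_path and show_osp_log are names I made up, and the osp-idm prefix is simply what the file names above show for this install.

```shell
#!/bin/sh
# Sketch: build the dated OSP log name and read a live or archived log.
# osp_log_path and show_osp_log are hypothetical helpers for illustration.
osp_log_path() {
    # $1 = log directory, $2 = date as YYYY-MM-DD (defaults to today)
    printf '%s/osp-idm.%s.log\n' "$1" "${2:-$(date +%Y-%m-%d)}"
}

show_osp_log() {
    # Print a log file, decompressing it first if it is a .gz archive.
    case $1 in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac
}

# Example: page through today's log
# show_osp_log "$(osp_log_path /opt/netiq/idm/apps/tomcat/logs)" | less
```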

This is a VERY verbose log file, and many of the things you see are not actually errors, just informational messages about things OSP tries in order to probe whether something is configured or not. So do not get concerned at everything you see in there. If in doubt, try asking in the support forum; there is a lot of free expertise there.

There are a few things you can derive from these files. First off, you get reminded that OSP was designed with a much more complex use case in mind. You see this all the time in the log with a message like:

[OSP]
Time: 2015-03-10T15:45:04.579-0400
Level: TRACE
Java Execution:
Class: com.novell.osp.OSPContext
Method: start
Line Number: -1
Thread: localhost-startStop-1
Message: OSP Start. Starting tenant:



This tells us that OSP is starting and starting the Tenant named 'idm'. This implies it is designed to handle more than one tenant at a time, a loftier goal than perhaps NAM's.

As it starts, we see some interesting log events:

 <KeyStore(uri.osp.xml.config.01.2011)>:
Use: Signing
KeyStore: /opt/netiq/idm/apps/osp_sspr/osp/osp.jks
Type: jks
Key Store Password: ********
Alias: osp
Key Password: ********

<KeyStore(uri.osp.xml.config.01.2011)>:
Use: Encrypting
KeyStore: /opt/netiq/idm/apps/osp_sspr/osp/osp.jks
Type: jks
Key Store Password: ********
Alias: osp
Key Password: ********

<KeyStore(uri.osp.xml.config.01.2011)>:
Use: SSL
KeyStore: /opt/netiq/idm/apps/osp_sspr/osp/osp.jks
Type: jks
Key Store Password: ********
Alias: osp
Key Password: ********

<KeyStore(uri.osp.xml.config.01.2011)>:
Use: SSLTrust
KeyStore: /opt/netiq/idm/apps/jre8/lib/security/cacerts
Type: jks
Key Store Password: ********
Alias:



As it starts, it reports a number of settings that are in place. If you have read my other articles on OSP you will know I talk about two keystores, but it looks like there are in fact four keystore entries in play: Signing, Encrypting, SSL, and SSLTrust. As it happens, the first three all use the same settings, so that just means OSP can do more than IDM is using it for, which is all good.


<oxcfg:HTTPInterface(uri.osp.xml.config.01.2011)>:

Id: 1
Enabled: true
Resolvable: true
Domain Name: myid.acme.com

Port: 443
SSL: true

Path: osp
Cookie Domain: myid.acme.com

<oxcfg:HTTPInterface(uri.osp.xml.config.01.2011)>:

Id: 2
Enabled: true
Resolvable: true
Ip Address: 10.1.1.11

Port: 443
SSL: true

Path: osp
Cookie Domain: 10.1.1.11


You may recall I have stressed how important the proper DNS name is to OSP, and here we see it loading it, checking it via name resolution (DNS, or /etc/hosts) and noting the address returned. This is actually logged in catalina.out when OSP starts.

Then we see it log the certificates it reads out of the keystores:

[OSP]
Time: 2015-12-15T17:02:36.912-0500
Level: TRACE
Java Execution:
Class: com.novell.osp.OSPKeys
Method: logKeyStore
Line Number: -1
Thread: localhost-startStop-1
Message: KeyStore: Encryption Key Store
Alias: osp
Type: X.509
Issuer DN: CN=myid.acme.com
Subject DN: CN=myid.acme.com
Serial Number : 39123466
Alias: test-edir
Type: X.509
Issuer DN: O=ACME-TREE, OU=Organizational CA
Subject DN: O=ACME-TREE, OU=Organizational CA
Serial Number : 0ABCDEFGA07AEABB0D8D73CA32EEA91A45EC7B702026F095075


This is the contents of the Encryption keystore which above we saw was the osp.jks file, so basically in this example there are only two certificates in the keystore. The osp private key we are using for OSP and Tomcat, and the eDirectory tree CA's public key. Per my notes above this is not enough, so in this example we will likely have issues.
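When the log is this noisy, it can be easier to pull out just the aliases for one keystore. The following is a sketch under the assumption that your log has the same layout as the excerpt above (a "Message: KeyStore: <name>" line, then "Alias:" lines, with the next "[OSP]" line starting a new entry); keystore_aliases is a name I made up.

```shell
#!/bin/sh
# Sketch: list the aliases OSP logged for one named keystore.
# keystore_aliases is a hypothetical helper; it relies on the log layout
# shown in this article and may need adjusting for other versions.
keystore_aliases() {
    # $1 = keystore name as logged, $2 = log file
    awk -v name="$1" '
        /^[[:space:]]*\[OSP\]/ { inside = 0 }          # a new log entry begins
        index($0, "KeyStore: " name) { inside = 1; next }
        inside && /^[[:space:]]*Alias:/ {
            sub(/^[[:space:]]*Alias:[[:space:]]*/, "")  # keep only the alias
            print
        }
    ' "$2"
}

# Example:
# keystore_aliases "Encryption Key Store" /opt/netiq/idm/apps/tomcat/logs/osp-idm.2016-07-21.log
```

If the alias you imported is not in that list, OSP did not load it from the file you think it did.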

You can see that the SSL Key Store has the same contents (it is the same file, as we saw above), which also confirms that OSP is reading the file properly.

[OSP]
Time: 2015-12-15T17:02:36.913-0500
Level: TRACE
Java Execution:
Class: com.novell.osp.OSPKeys
Method: logKeyStore
Line Number: -1
Thread: localhost-startStop-1
Message: KeyStore: SSL Key Store
Alias: osp
Type: X.509
Issuer DN: CN=myid.acme.com
Subject DN: CN=myid.acme.com
Serial Number : 39123466
Alias: test-edir
Type: X.509
Issuer DN: O=ACME-TREE, OU=Organizational CA
Subject DN: O=ACME-TREE, OU=Organizational CA
Serial Number : 0ABCDEFGA07AEABB0D8D73CA32EEA91A45EC7B702026F095075


Then it goes on to read the cacerts keystore which has 90 or so public keys for well known CAs so I won't show the trace, but this is actually helpful, since you can now see if the certificates you wanted it to load were actually loaded or not.

Next I notice that there are actually even more keystores supported by OSP: NIDP SSL, NIDP OCSP, and NIDP LDAP truststores. Again, these are all reusing the two keystores we defined, but it is instructive to know that OSP supports much more complexity if the product needed it.

[OSP]
Time: 2015-12-15T17:02:36.931-0500
Level: TRACE
Java Execution:
Class: com.novell.osp.A
Method: initialize
Line Number: -1
Thread: localhost-startStop-1
Message: ** Loaded SSL Keystore : /opt/netiq/idm/apps/osp_sspr/osp/osp.jks
Signing Keystore : /opt/netiq/idm/apps/osp_sspr/osp/osp.jks
Encryption Keystore : /opt/netiq/idm/apps/osp_sspr/osp/osp.jks
NIDP SSL Truststore : /opt/netiq/idm/apps/jre8/lib/security/cacerts
NIDP OCSP Truststore : /opt/netiq/idm/apps/jre8/lib/security/cacerts
NIDP LDAP Truststore : /opt/netiq/idm/apps/jre8/lib/security/cacerts


OSP then nicely logs the Java version. This is really helpful, since it is possible that for some kooky reason you are starting OSP in Tomcat with the wrong JVM, in which case your cacerts keystore will be the wrong one, and all sorts of issues can occur. The cacerts file is in the lib/security directory of the JVM, so if you imported a certificate into the Java 7 cacerts and then updated to Java 8, the certificate is likely to be missing from the new file. This, along with tracing out every certificate in the keystores earlier, can really help if you are having a goofy issue in this area.

It is always nice seeing a version number thrown out to be safe and validate all is as expected. Even more so, it shows that you are running Oracle Java, not perhaps an IBM Java instance, as is needed for WebSphere. (Of course OSP only is supported in Tomcat which should be using an Oracle Java instance). However, you could imagine where the path gets mixed up somehow and the server with WebSphere might also have Tomcat and the IBM JVM might load and cause issues.
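A quick way to double check which cacerts file goes with a given Java install is sketched below. find_cacerts is a made-up helper name; the only facts it leans on are that a JRE keeps the file under lib/security and that a Java 8 JDK keeps it under jre/lib/security.

```shell
#!/bin/sh
# Sketch: print the cacerts file that belongs to a given Java home.
# find_cacerts is a hypothetical helper for illustration.
find_cacerts() {
    # $1 = Java home directory (JRE or JDK)
    for candidate in "$1/lib/security/cacerts" "$1/jre/lib/security/cacerts"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no cacerts under $1" >&2
    return 1
}

# Example: compare with the truststore path OSP reports at startup
# find_cacerts /opt/netiq/idm/apps/jre8
```

If the path it prints does not match the NIDP truststore paths in the OSP log, Tomcat is probably being started with a different JVM than the one you imported your certificates into.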


[OSP]
Time: 2015-12-15T17:02:36.949-0500
Level: TRACE
Java Execution:
Class: com.novell.osp.util.net.client.OSP_SSLSocketFactory
Method: initialize
Line Number: -1
Thread: localhost-startStop-1
Message: Using SUN JSSE for JVM Vendor Version : Oracle Corporation1.8.0_65


As you can see the logging gets a little bit silly in terms of verbosity when you turn the level all the way up to eleven. While it may not actually fill your disk, it can come pretty close to it so running at this level is not a good idea in general. There are however real and useful troubleshooting cases where you might want to see this level of detail.

I have lots more interesting snippets from the logs and will walk through some more examples in the next few articles in this series. I would recommend that if you happen to find an interesting message in your work, snag it into a text file, save it and write about it here at Cool Solutions, so the next person can better understand the issue. (If you are feeling lazy, send it to me, or leave it in a comment to this article.) The more the merrier, there is a whole world of undocumented stuff here to try and understand and work with.
