Decoding and understanding iDOCs from SAP with no tools.

SAP, the HR and Finance system, supports a flexible data export model it calls iDOCs, or Intermediate Documents. The SAP web site will tell you how amazingly powerful and well designed the format is. I am sure that once you are an SAP guru and have spent the time learning everything there is to know about the system, that will be true.

Let's assume for a moment you are not an SAP guru, nor do you care to learn enough SAP to make the iDOC format blindingly obvious. This may be a reasonable assumption if you are working on an Identity Manager system that needs to integrate with the SAP HR system, using the driver from Novell. You need to know about SAP and iDOCs, but perhaps not at the same level an SAP developer would.

This driver is configured to use iDOCs, which SAP generates for events and dumps into a directory. The driver processes them in sequence to get events out of SAP and onto the Publisher channel. The Subscriber channel uses a different protocol, called BAPI, to send events back into SAP. That is a whole other can of worms, involving the JCo (SAP Java Connector) libraries and an entire other set of technologies.

Back to our poor fellow tasked with the SAP HR driver, looking at iDOCs that are not generating the events expected. What next? Well, trace is helpful, but at some point we need to understand what is in the raw file. No doubt SAP has a tool set to do this in a pretty graphical way, but odds are the SAP team will not give you access to it, and I still like to be able to read the document myself if possible. Call me crazy, but it makes me feel empowered.

Let's see what we can deconstruct about iDOCs based on an example. Here are two lines from a sample iDOC for a User, generated for the SAP-HR IDM driver configuration. (Warning: iDOCs use very long lines, running into the hundreds of characters, so they may be hard to see clearly.)
    E2P0105002 2900000100011546187000048010026021234567801050001 999912312004010100020071207USRNAME 0001USRNAME
    E2P0105002 2900000100011546187000048010026021234567801050010 999912312007010100020070926USRNAME 0010 USERNAME@ACME.COM
    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxPERNRxxxINFTSUBTxxy1234567812345678SEQ12345678USERNAMEUSERHIROIIPPFFFFRRRRUTYP

This looks like incomprehensible gibberish. (Make sure you turn off line wrapping, as these are VERY long lines!)

The syntax for the attributes in the mapping table looks something like this:

    Internet EMail Address    P0105:USRID_LONG:0010:108:241
    CN                        P0105:USRID:0001:78:30

To make life easier, I have kept the example small by showing only two lines of the iDOC, for two attributes.

The next thing we need to look at is the HRMD_A05.meta or HRMD_A06.meta file, whichever version your SAP environment is using. (JCOTest, described in the documentation, will tell you which version is in use.) For more on this topic, read the manual: it tells you that you get the 05 version with the driver, and how to generate a new one if you need to. That is pretty straightforward. On Linux the files reside in the /usr/lib/dirxml/rules/saphr directory.

Since both attributes I am talking about here are in the P0105 infotype, I can show just a snippet of the HRMD_A06.meta file. (All the infotypes are in the file, so it can be VERY long!)

    SEGMENT:P0105:E1P0105:
    P0105:PERNR:0:8
    P0105:INFTY:8:4
    P0105:SUBTY:12:4
    P0105:OBJPS:16:2
    P0105:SPRPS:18:1
    P0105:ENDDA:19:8
    P0105:BEGDA:27:8
    P0105:SEQNR:35:3
    P0105:AEDTM:38:8
    P0105:UNAME:46:12
    P0105:HISTO:58:1
    P0105:ITXEX:59:1
    P0105:REFEX:60:1
    P0105:ORDEX:61:1
    P0105:ITBLD:62:2
    P0105:PREAS:64:2
    P0105:FLAG1:66:1
    P0105:FLAG2:67:1
    P0105:FLAG3:68:1
    P0105:FLAG4:69:1
    P0105:RESE1:70:2
    P0105:RESE2:72:2
    P0105:USRTY:74:4
    P0105:USRID:78:30
    P0105:USRID_LONG:108:241

There should be enough hints in there to leave this as an exercise for the reader to solve. Just kidding, of course.
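If you would rather not count columns by eye, the meta entries are easy to load into a lookup table. What follows is only a rough sketch in Python, assuming each entry sits on its own line in the INFOTYPE:FIELD:OFFSET:LENGTH form shown above. The names FieldDef and parse_meta are my own invention for illustration, not anything that ships with the driver, and the embedded snippet is an abbreviated copy of the P0105 entries.

```python
from typing import Dict, NamedTuple


class FieldDef(NamedTuple):
    """One field entry from a .meta file (hypothetical helper type)."""
    infotype: str
    name: str
    offset: int
    length: int


# Abbreviated copy of the P0105 entries; a real file holds every infotype.
META_SNIPPET = """\
SEGMENT:P0105:E1P0105:
P0105:PERNR:0:8
P0105:INFTY:8:4
P0105:SUBTY:12:4
P0105:ENDDA:19:8
P0105:BEGDA:27:8
P0105:USRTY:74:4
P0105:USRID:78:30
P0105:USRID_LONG:108:241
"""


def parse_meta(text: str) -> Dict[str, FieldDef]:
    """Return {field name: FieldDef}, skipping SEGMENT and malformed lines."""
    fields: Dict[str, FieldDef] = {}
    for line in text.splitlines():
        parts = line.strip().split(":")
        if len(parts) != 4 or parts[0] == "SEGMENT":
            continue  # header line or something we do not understand
        infotype, name, offset, length = parts
        if not (offset.isdigit() and length.isdigit()):
            continue  # not an OFFSET:LENGTH pair
        fields[name] = FieldDef(infotype, name, int(offset), int(length))
    return fields


fields = parse_meta(META_SNIPPET)
print(fields["USRID"])
```

With the table loaded, fields["USRID"].offset and fields["USRID"].length give you the two numbers that go on the end of the schema map entry.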
The next piece of information that makes this solvable is the user's Person Number. In this case it is 12345678 (I made that up, mainly so that it stands out to the eye).

Reading the meta file, on the second line we see P0105:PERNR:0:8, which defines a field in the P0105 infotype called PERNR (Person Number?), at offset 0 with a length of 8. What this tells us is that everything up to the person number does not yet count! That means the beginning stack of characters in the iDOC for this infotype, E2P0105002 290000010001154618700004801002602, can basically be ignored. When you are trying to define a new infotype or the like, you need to remember this. The first 63 or so characters do not count. Be careful, as there are at least two variants of that number, so look at a sample and count for yourself.

OK, so now the iDOC we care about looks like the following shorter lines:

    1234567801050001 999912312004010100020071207USRNAME 0001USRNAME
    1234567801050010 999912312007010100020070926USRNAME 0010 USERNAME@ACME.COM

We know that the Person Number starts at index 0 and runs 8 characters, which makes it 12345678. Great, drop that snippet and we get a simpler doc:

    01050001 999912312004010100020071207USRNAME 0001USRNAME
    01050010 999912312007010100020070926USRNAME 0010 USERNAME@ACME.COM

Next on the list is INFTY (P0105:INFTY:8:4), the Infotype, starting at position 8 and 4 characters long, which we sort of already knew was 0105. Then we get SUBTY (P0105:SUBTY:12:4), the Subtype, starting at position 12 and running for 4 characters. This is good, as we need to know the infotype and subtype for our Schema map declaration.

I have no idea what the next two entries stand for or do, but they do not matter much:

    P0105:OBJPS:16:2
    P0105:SPRPS:18:1

Two important ones come next, the timestamps:

    P0105:ENDDA:19:8
    P0105:BEGDA:27:8

These are the End date and Begin date, which look like 9999123120040101 and 9999123120070101 from our example.
Those are actually a pair of 8 character date strings, but I was too lazy to run through this twice. A date of 9999 12 31 is infinity, or December 31, year 9999, whichever comes first, i.e. an end date of never. Then we have the begin dates of 20040101 and 20070101: Jan 1, 2004 and Jan 1, 2007 respectively.

We could keep going through this line by line, but I hope you see the point by now. What we actually care about are the last two entries in the meta file, because we want to get the User name and the Email address out of SAP. Those are the two fields we need to map in Identity Manager, so let's look at them next.

The meta says:

    P0105:USRID:78:30
    P0105:USRID_LONG:108:241

The IDM Schema map says:

    Internet EMail Address    P0105:USRID_LONG:0010:108:241
    CN                        P0105:USRID:0001:78:30

The missing value is the Subtype (the literal string 'none' is also a valid value here). The meta says that SUBTY for each record is at offset 12 with a length of 4, which tells me the subtypes are 0001 and 0010 in our two examples. In other words, the schema path is the meta entry with the Subtype inserted as the third element. That gives me the full schema path I need to set in the driver.

I am sure there is much more to know about iDOCs, but this is enough to get you rolling in the right direction.

There are commercial tools for looking at an iDOC and displaying it in a more readable fashion, but as far as I know there are no free, open source ones. If you know of any, please let me know. This way works, but is a smidgen tedious.

I heard the perfect analogy to describe the tool I am looking for: something like XMLSpy, which takes a plain text XML document and presents it in a better fashion, with nodes and a better interface. I am looking for something like that for iDOCs.
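In the meantime, the manual walk-through above is simple enough to script. The following is a minimal Python sketch, not a real tool: the 63 character header length is an assumption you should check against your own sample, the function names are made up for illustration, and the record is built synthetically from the example values so that the fixed-width padding is exact.

```python
from datetime import date, datetime
from typing import Dict, Optional

HEADER_LEN = 63  # assumption: count the header on your own sample

# (offset, length) pairs copied from the P0105 entries in the meta file.
FIELDS = {
    "PERNR": (0, 8), "INFTY": (8, 4), "SUBTY": (12, 4),
    "ENDDA": (19, 8), "BEGDA": (27, 8),
    "USRID": (78, 30), "USRID_LONG": (108, 241),
}


def build_record(values: Dict[str, str], header: str = "E2P0105002") -> str:
    """Build a synthetic, space-padded fixed-width line for demonstration."""
    width = max(off + length for off, length in FIELDS.values())
    body = [" "] * width
    for name, value in values.items():
        off, length = FIELDS[name]
        body[off:off + length] = value.ljust(length)[:length]
    return header.ljust(HEADER_LEN) + "".join(body)


def extract(line: str, name: str) -> str:
    """Slice one field out of a data line, skipping the header."""
    off, length = FIELDS[name]
    start = HEADER_LEN + off
    return line[start:start + length].strip()


def parse_sap_date(value: str) -> Optional[date]:
    """Turn YYYYMMDD into a date; 99991231 means 'never' and gives None."""
    if value == "99991231":
        return None
    return datetime.strptime(value, "%Y%m%d").date()


def schema_path(line: str, name: str) -> str:
    """Assemble P<infotype>:<field>:<subtype>:<offset>:<length>."""
    off, length = FIELDS[name]
    return f"P{extract(line, 'INFTY')}:{name}:{extract(line, 'SUBTY')}:{off}:{length}"


line = build_record({
    "PERNR": "12345678", "INFTY": "0105", "SUBTY": "0010",
    "ENDDA": "99991231", "BEGDA": "20070101",
    "USRID_LONG": "USERNAME@ACME.COM",
})
print(extract(line, "PERNR"), parse_sap_date(extract(line, "BEGDA")))
print(schema_path(line, "USRID_LONG"))  # P0105:USRID_LONG:0010:108:241
```

To try this on a real file, read each line without stripping trailing or embedded spaces, since the offsets only line up when the padding is intact, and pass it to extract in place of the synthetic line.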