Super Contributor.

XBIS and CR/LF

We're using XBIS 9.2 under Windows. We receive a text string that contains embedded CR/LF characters. Looking at the XML, the CR/LF characters are there; however, when the string is passed to COBOL, they seem to have been stripped out of the text.


Is this normal behaviour, or something that can be changed via a setting, etc?

Thanks

Tony

Knowledge Partner

RE: XBIS and CR/LF

Hi Tony,

Is this a web service (i.e. SOAP)?  Can you post the XML?

There are a number of factors that may be in play here.  If the CR/LF characters form their own text node in the XML, that is considered whitespace and whitespace is commonly stripped during an XSLT transform.  It is something that can be fixed, but the nature of the incoming XML dictates the nature of the fix.
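The distinction between a whitespace-only text node and CR/LF embedded inside real text can be sketched in a few lines. This is only an illustration: Python's ElementTree stands in for the parser, and the `order` and `comment` element names and the `whitespace_only` helper are made up for the example, not taken from the actual request.

```python
import xml.etree.ElementTree as ET

def whitespace_only(text):
    """True when a text node holds nothing but XML whitespace (space, tab, CR, LF)."""
    return bool(text) and text.strip(" \t\r\n") == ""

# Hypothetical document: a line break plus indentation sits between <order> and <comment>,
# and a separate line break sits inside the comment text.
doc = ET.fromstring("<order>\n  <comment>line 1\nline 2</comment>\n</order>")

# The text before <comment> is whitespace only, the kind a transform commonly strips.
print(whitespace_only(doc.text))

# The comment text contains a line break but is not whitespace only.
print(whitespace_only(doc.find("comment").text))
```

Only the first kind of node is a candidate for whitespace stripping in a transform; a line break inside non-blank text is ordinary character data.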


Tom Morrison
Consultant

Super Contributor.

RE: XBIS and CR/LF

Hi, Tom;

Yes, this is a web service.  The web page has a multi-line box, in which the user can press <enter> to indicate a new line.

Here's the pertinent part of the XML. The tag in question is "OrderComments"; there are CR/LF characters after "comment line 1", after "comment line 2", after "comment line 3", and after each of the digits 4, 5, 6, 7, 8 and 9.

<bis:request xmlns:bis="www.xcentrisity.com/.../request">
  <bis:content>
    <env:Envelope xmlns:xsd="www.w3.org/.../XMLSchema" xmlns:xsi="www.w3.org/.../XMLSchema-instance" xmlns:tns="localhost/.../" xmlns:env="schemas.xmlsoap.org/.../">
      <env:Body>
        <tns:ValidateOrder>
          <patientname>KimComments14</patientname>
          <orderdate>20141009</orderdate>
          <wanteddate />
          <PONumber />
          <addresses />
          <shippostalcode />
          <types>SV</types>
          <styles>SV</styles>
          <materials>P</materials>
          <treatments />
          <colors />
          <rightsphere>1.00</rightsphere>
          <rightcylinder />
          <rightdecent>2.00</rightdecent>
          <rightvertdec />
          <rightthickness />
          <leftsphere>1.00</leftsphere>
          <leftcylinder />
          <leftdecent>2.00</leftdecent>
          <leftvertdec />
          <leftthickness />
          <rightprism1 />
          <leftprism1 />
          <rightcribdiam>50</rightcribdiam>
          <leftcribdiam>50</leftcribdiam>
          <rightbasecurve />
          <rightdiameter />
          <leftbasecurve />
          <leftdiameter />
          <addon />
          <lensselection />
          <real-order-id />
          <orderaddon />
          <righttype>SV</righttype>
          <lefttype>SV</lefttype>
          <rightstyle>SV</rightstyle>
          <leftstyle>SV</leftstyle>
          <rightmaterial>P</rightmaterial>
          <leftmaterial>P</leftmaterial>
          <righttreatment />
          <lefttreatment />
          <rightcolor />
          <leftcolor />
          <shipaddress>
            <shipaddressItems />
          </shipaddress>
          <OrderComments>comment line 1 comment line 2 comment line 3 4 5 6 7 8 9 final comment line 10</OrderComments>
          <jobtype>u</jobtype>
          <leftpresent>1</leftpresent>
          <rightpresent>1</rightpresent>
          <token>9843ad15-4fb0-11e4-bfd6-00016c705cae</token>
        </tns:ValidateOrder>
      </env:Body>

Knowledge Partner

RE: XBIS and CR/LF

Do you have wrap="hard" on your <textarea> element?


Tom Morrison
Consultant

Knowledge Partner

RE: XBIS and CR/LF

Tony and I had a direct email exchange on this.  Bottom line is that, using MSXML, the soap_to_cobol.xsl transform is preserving the CR/LF as expected, so the issue is somewhere beyond that in the import process.  Tony is probably going to take this to SupportLine.


Tom Morrison
Consultant

Absent Member.

RE: XBIS and CR/LF

XML Extensions strips control characters from the value before the value is imported into a COBOL data item.  In practice that means 0x09, 0x0a and 0x0d, since these are the only control characters allowed in an XML text document.

Absent Member.

RE: XBIS and CR/LF

My regrets; the prior post I made is incorrect and should be ignored.  I misread some tracing code, which does remove TAB, LF and CR from values when tracing them.  This is only done for the trace output.  I'm still not sure why Tony is not getting these characters in COBOL.  Still reviewing ...

Knowledge Partner

RE: XBIS and CR/LF

Along this track, Bruce, would setting the attribute xml:space="preserve" override the behavior?  This goes back to Tony's original post, where he asks how this behavior can be defeated. 


Tom Morrison
Consultant

Absent Member.

RE: XBIS and CR/LF

I have verified with a simple test case that TAB, CR and LF values are not preserved by XML Extensions.  I have not yet found where this happens in the process.  The xml:space="preserve" attribute has no effect (I have long suspected this is a defect in XML Extensions).  Still looking for where this happens.  I will probably write an RPI.

Absent Member.

RE: XBIS and CR/LF

I found two answers to the whitespace issue.  Answer one is that XML Extensions intentionally strips characters less than space before copying the XML text node value to the COBOL data item; there is no setting to prevent this.  Answer two is that MSXML6 (the Windows parser) doesn't seem to return tab, line feed or carriage return for pChildNode->get_nodeValue() where pChildNode is a TEXT_NODE; MSXML6 doesn't replace them with a space but rather simply deletes them.  Libxml2 (the UNIX parser) does return whitespace characters, but then as noted in answer one, XML Extensions removes characters less than space before the transfer to the COBOL data item.
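The effect of answer one on a multi-line comment can be sketched as follows. This is not the XML Extensions source, only a minimal Python model of "remove characters less than space"; the `strip_below_space` name and the sample value are hypothetical.

```python
def strip_below_space(value: str) -> str:
    """Drop every character whose code point is below 0x20 (space)."""
    return "".join(ch for ch in value if ch >= " ")

# Hypothetical sample value; "\r\n" marks where the user pressed Enter.
comment = "comment line 1\r\ncomment line 2\r\n4\r\n5"
flattened = strip_below_space(comment)

# The lines run together because nothing replaces the removed CR/LF pairs.
print(repr(flattened))
```

Because the characters are deleted rather than replaced with a space, adjacent lines end up joined with no separator at all.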

I found that XML Extensions sets the MSXML6 preserveWhiteSpace property to TRUE when loading a stylesheet and to FALSE when loading any other document.  Changing this to TRUE for all documents did not cause pChildNode->get_nodeValue() to keep tab, linefeed and carriage return characters, even when the xml:space="preserve" attribute was specified in the XML.  (There might be a flaw in my experiment because setting preserveWhiteSpace property to TRUE (FALSE is the default) for stylesheets was specifically added to fix a problem with transform output when a stylesheet was inserting whitespace.  This fix worked for the specific case for which it was intended.)

Absent Member.

RE: XBIS and CR/LF

I determined that my MSXML6 experiment was indeed flawed.  The RM/COBOL DISPLAY on Windows doesn't show tabs, line feeds and carriage returns, but does on UNIX.  I further found that the whitespace characters are preserved regardless of either xml:space="preserve" or setting the preserveWhiteSpace property before loading a document.  This indicates that there just needs to be a setting for XML Extensions to preserve characters below space, which for XML documents are tab, line feed and carriage return.  Note that for Windows, bare line feeds and carriage returns are converted by MSXML6 to carriage return line feed pairs.

Absent Member.

RE: XBIS and CR/LF

In reviewing what might need to be done to XML Extensions for better whitespace handling, I noted that the XML specification requires that CR/LF and a bare CR be replaced by a single LF in the XML processor.  Thus, if your design depends on preserving CR characters, XML is not appropriate except in the case where replacing LF on output with CR/LF is the desired behavior.
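The end-of-line rule can be seen with any conforming parser. The sketch below uses Python's bundled expat-based ElementTree as a stand-in; it is not MSXML6 or libxml2, which may differ in detail as noted earlier in the thread. The `parsed_text` and `lf_to_crlf` helpers are hypothetical names for the illustration. It also shows that a CR written as the character reference `&#13;` is not subject to the normalization, since the rule applies to literal line ends in the document source.

```python
import xml.etree.ElementTree as ET

def parsed_text(xml_source: str) -> str:
    """Return the character data of the root element as the parser reports it."""
    return ET.fromstring(xml_source).text

def lf_to_crlf(text: str) -> str:
    """Restore CR/LF pairs on output, assuming the text holds only bare LFs."""
    return text.replace("\n", "\r\n")

# Literal CR/LF and a bare CR in the document source.
literal = parsed_text("<OrderComments>line 1\r\nline 2\rline 3</OrderComments>")

# CR written as a character reference instead of a literal character.
escaped = parsed_text("<OrderComments>line 1&#13;\nline 2</OrderComments>")

print(repr(literal))              # line ends as reported by the parser
print(repr(escaped))              # the referenced CR is reported as data
print(repr(lf_to_crlf(literal)))  # CR/LF restored for output
```

If the receiving side needs CR/LF, converting LF back to CR/LF after the import is the simpler route; relying on `&#13;` requires control over how the sender serializes the text.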
