
  • State Suggested Answer
  • Date 11 Jan 2019 9:02
  • Replies 8 replies
  • Answers 1 answer

CXML-PARSE-FILE and XML encoding

I have an XML with windows-1252 encoding.

The first row of XML file is:

<?xml version="1.0" encoding="windows-1252"?>

When CALL "C$XML" USING CXML-PARSE-FILE, filename is executed, it returns the error "Invalid XML file..."

Is there a "COBOL" way to change the encoding from windows-1252 to UTF-8?

Thanks

  • It seems like you want to overwrite the encoding that exists in this file, is that correct?
    Does the file open in a browser? If a browser shows an error, then something is wrong with the XML file. Why do you suspect the encoding is the issue? You could download the file, open it with an editor, change the encoding and save the file - if you do this, does your CALL work?
    I don't believe you can set (or reset) the existing encoding without parsing the data and making a new file.
    When you create a file using C$XML, you can use the CXML-SET-ENCODING function.

  • In reply to shjerpe:

    Exactly. I'm trying to open an XML file created by a third party in order to read it, but C$XML returns an error during CXML-PARSE-FILE. Notepad++ shows it as a valid XML file, but if I open the file with IE it is treated as a text file. Is the solution you propose to recreate it using CXML-SET-ENCODING?
  • In reply to vale:

    As an alternative, you could use XML Extensions.

    However, since you simply wish to change the encoding, I would suggest using XSLT with an identity transform. You would specify the new encoding on the <xsl:output> instruction. Example below. This transformation step could be run using CALL "SYSTEM" prior to parsing the XML document. The exact format of the command line would depend on the XSL processor you use; for Windows you can download msxsl.exe from Microsoft.

    <?xml version="1.0" ?>
    <xsl:stylesheet version="1.0"
       xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output encoding="utf-8"/>
    <!-- IdentityTransform -->
    <xsl:template match="/ | @* | node()">
       <xsl:copy>
          <xsl:apply-templates select="@* | node()" />
       </xsl:copy>
    </xsl:template>
    </xsl:stylesheet>


  • In reply to Tom Morrison:

    Hi,
    I have a similar problem. I receive XML files (containing invoice information, as required by Italian law starting from 1/1/19) that I examine using C$XML.
    There is no problem when these files are UTF-8 encoded, but when they are windows-1252 encoded, C$XML cannot parse them.
    However, I can't modify the original files, because they have fiscal validity and must be kept as received.
    Does anybody have a solution?
    Thank you
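
Since the UTF-8 files parse and the windows-1252 ones do not, one option is to check each incoming file's declared encoding before handing it to C$XML. What follows is only a minimal sketch in Python (not COBOL, and not part of ACUCOBOL). The helper name `declared_encoding` is hypothetical, and the sketch assumes the file uses an ASCII-compatible encoding named in its XML declaration. It only reads the file and never writes to it.

```python
import re
from pathlib import Path
from typing import Optional

# Matches a leading XML declaration and captures the encoding name.
# Assumption: the file is in an ASCII-compatible encoding (not UTF-16).
_DECLARATION = re.compile(
    rb'\A<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(path: Path) -> Optional[str]:
    """Return the lower-cased encoding named in the XML declaration, or None.

    Hypothetical helper for illustration; the file is opened read-only.
    """
    head = path.read_bytes()[:200]
    # Skip a UTF-8 byte order mark if one is present.
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]
    match = _DECLARATION.match(head)
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()
```

A file for which this returns `"windows-1252"` would be a candidate for a conversion step or for XML Extensions. A result of `None` means no encoding was declared, in which case XML defaults to UTF-8 (or UTF-16 with a byte order mark).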
  • In reply to gbaruzzo:

    Here is a simple XML file:

    <?xml version="1.0" encoding="windows-1252"?>
    <stuff>
    <test1>test1</test1>
    <test2>test2</test2>
    </stuff>

    Here is an ACUCOBOL program that reads that data using XML Extensions:
    IDENTIFICATION DIVISION.
    PROGRAM-ID. "testtrace".
    ENVIRONMENT DIVISION.
    DATA DIVISION.
    WORKING-STORAGE SECTION.

    01  stuff.
        05  test1 pic x(10).
        05  test2 pic x(10).

    01  pause pic x.

    Copy "lixmlall.cpy".
    PROCEDURE DIVISION.
    A.
        XML INITIALIZE.
        If Not XML-OK Go To Z.

        XML IMPORT FILE
            stuff
            "stuff.xml"
            "stuff".
        If Not XML-OK Go To Z.

        display "Test1: " test1.
        display "Test2: " test2.

        accept pause.

        stop run.

    Z.
        Perform Display-Error-Status
        XML TERMINATE
        accept pause.
        Stop Run.

    Display-Error-Status.

        copy "lixmltrm.cpy".
        copy "lixmldsp.cpy".

    END PROGRAM "testtrace".

  • In reply to shjerpe:

    Thank you for your quick answer.
    I understand I have to abandon C$XML to read these windows-1252 encoded files...
  • In reply to gbaruzzo:

    For the same problem I used a .NET module that converts XML from Windows-1252 to UTF-8.
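
The .NET module itself is not shown in this thread. As a rough illustration of the same idea, a conversion step outside COBOL might look like the Python sketch below. It writes a UTF-8 copy to a separate file and leaves the received file untouched, which matters when the original must be kept as received. The function name `to_utf8_copy`, the file names, and the `cp1252` default are assumptions made for this example.

```python
import re
from pathlib import Path

# Matches the encoding pseudo-attribute of a leading XML declaration.
_DECL_ENCODING = re.compile(
    r'(\A<\?xml\s[^>]*?encoding\s*=\s*)(["\'])[A-Za-z][A-Za-z0-9._-]*\2'
)


def to_utf8_copy(source: Path, target: Path,
                 source_encoding: str = "cp1252") -> None:
    """Write a UTF-8 copy of `source` to `target`.

    Hypothetical helper: `source` is only read, never modified.
    Raises UnicodeDecodeError if the bytes are not valid in `source_encoding`.
    """
    text = source.read_bytes().decode(source_encoding)
    # Rewrite the declaration so it agrees with the new byte encoding.
    text = _DECL_ENCODING.sub(r"\g<1>\g<2>utf-8\g<2>", text, count=1)
    target.write_bytes(text.encode("utf-8"))
```

For example, `to_utf8_copy(Path("invoice.xml"), Path("invoice-utf8.xml"))` would produce a working copy that can then be passed to CXML-PARSE-FILE, given that the thread reports UTF-8 files parse without problems.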
  • In reply to gbaruzzo:

    For now, yes. We've been made aware of this issue for Italy and are looking into whether we can change our XML parser. In the meantime, XML Extensions lets you parse the file.