Re: Unicode problem
Hi John,
> Regarding whether or not it's UTF-8, the only reason I thought it was that
> is because the person that generated the file told me it was that.
In order to proceed at all with helping you, I need the following
question answered. It's important:
Is the file you're parsing, byte-for-byte, the same as the one you
posted here?
> I have no other proof that it's UTF-8. I also know that when I used
> CCSID 1208, the file parsed correctly.
If you view the file in hex, I think you'll see that it's not UTF-8.
The easiest way to check is with the DSPF green-screen tool. Type:
DSPF 'freight_201003101542591443916.xml'
Then hit F10=Display Hex.
You should notice that it starts with 'FF FE', which is the byte-order
mark for UTF-16 LE. This is followed by '3C 00 3F 00 78 00'. You'll
notice that every alternating byte is set to 00. That's because each
character in the XML is represented by *two* bytes. So '3C 00'
represents the < character. '3F 00' represents the ? character.
The fact that the zero is in the 2nd byte (as opposed to 00 3F) tells
you that it's little-endian.
The fact that it's two bytes (or 16 bits) per character tells you that
this is a 16-bit encoding. (Not an 8-bit encoding!)
This is either UCS-2LE or UTF-16LE. There's no chance that this is UTF-8.
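If you'd rather check this off the green screen, here is a minimal Python sketch of the same reasoning: look at the byte-order mark first, then fall back to the zero-byte pattern around '<?'. The function name sniff_encoding and the sample bytes are purely illustrative (the sample is just the hex values quoted above), and this is only a rough heuristic, not a complete encoding detector.

```python
import codecs

def sniff_encoding(head: bytes) -> str:
    """Guess a Unicode encoding from the first few bytes of a file (heuristic)."""
    # UTF-32 marks must be tested first: the UTF-32LE mark (FF FE 00 00)
    # begins with the same two bytes as the UTF-16LE mark (FF FE).
    if head.startswith(codecs.BOM_UTF32_LE):
        return "UTF-32LE"
    if head.startswith(codecs.BOM_UTF32_BE):
        return "UTF-32BE"
    if head.startswith(codecs.BOM_UTF8):
        return "UTF-8"
    if head.startswith(codecs.BOM_UTF16_LE):
        return "UTF-16LE"
    if head.startswith(codecs.BOM_UTF16_BE):
        return "UTF-16BE"
    # No mark: an XML file starts with '<?', so see where the zero bytes fall.
    if head[:4] == b"<\x00?\x00":
        return "UTF-16LE"   # zero in the 2nd byte of each pair
    if head[:4] == b"\x00<\x00?":
        return "UTF-16BE"   # zero in the 1st byte of each pair
    return "8-bit or unknown"

# The bytes described above: FF FE, then 3C 00 3F 00 78 00
sample = bytes.fromhex("FFFE3C003F007800")
print(sniff_encoding(sample))
```

To try it on a real file, read the first few bytes in binary mode (open(path, "rb").read(8)) and pass them to the function.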
>
> Do you know what the CCSID is for UTF-16 (LE). If so, I can try it.
>
The CCSID for UTF-16 (aka UTF-16BE) is 1200. There is no CCSID for
UTF-16LE that works on IBM i that I'm aware of. But there is a
different API that can translate it to UTF-8 or UTF-16. I posted this
in another message last night; please read that.
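To be clear about what such a translation has to do, here is a small Python sketch of the conversion itself. This is not the IBM i API I mentioned; it's only an illustration you could run on a PC, and the function name utf16le_to_utf8 is made up for the example.

```python
import codecs

def utf16le_to_utf8(data: bytes) -> bytes:
    """Re-encode UTF-16LE bytes as UTF-8, dropping a leading byte-order mark."""
    # Strip the FF FE mark if present so it does not end up in the output.
    if data.startswith(codecs.BOM_UTF16_LE):
        data = data[len(codecs.BOM_UTF16_LE):]
    # Two bytes per character in, one or more bytes per character out.
    return data.decode("utf-16-le").encode("utf-8")

# FF FE 3C 00 3F 00 78 00 is the start of the file as seen in the hex display
print(utf16le_to_utf8(bytes.fromhex("FFFE3C003F007800")))
```

One caveat: if the XML declaration inside the file names an encoding, it would need to be updated to match after a conversion like this, since a parser may otherwise complain about the mismatch.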
I don't understand what's going on here. To proceed, I really need your
help on two of the things I asked you previously:
1) Is the file the same as it was when you posted it, or has it changed
to UTF-8?
2) Did you try HTTP_XML_CALC as the CCSID? This lets Expat handle the
translation instead of OS/400 -- and Expat does natively support
UTF-16LE. What happens when you use that?
-----------------------------------------------------------------------
This is the FTPAPI mailing list. To unsubscribe, please go to:
http://www.scottklement.com/mailman/listinfo/ftpapi
-----------------------------------------------------------------------