chinese language and poi36

Posted: Wed Dec 22, 2021 8:44 am
by lucius
Hi Scott
I have to import an Excel spreadsheet that contains some Chinese text.

I have an iSeries with a Chinese (double-byte) partition where I normally handle Chinese characters.

For the first time, I need to import an Excel file with some Chinese text into my DB.

The Latin characters are OK, but the Chinese text is not correct. I get only ??????? in the field.

This is what I'm using to read characters from Excel:
cell = SSRow_GetCell(row: 3);
desartch = String_getBytes(SSCell_getStringCellValue(cell));

Do you have any idea?

Thank you very much
Lucio
Italy

Re: chinese language and poi36

Posted: Wed Dec 22, 2021 7:16 pm
by Scott Klement
How is String_getBytes() defined in your program?

What is the job CCSID? Does it support the characters you are trying to receive?

Re: chinese language and poi36

Posted: Thu Dec 23, 2021 7:45 am
by lucius
here:

D String_getBytes...
D                 pr          1024A   varying
D                                     extproc(*JAVA:
D                                     'java.lang.String':
D                                     'getBytes')



cell = SSRow_GetCell(row: 9);
projdescr = String_getBytes(SSCell_getStringCellValue(cell));

from the PF definition:
A PROJDESCR 60O CCSID(937)


from the job:
Language identifier . . . . . . . . . . . . . . . : ENU
Country or region identifier . . . . . . . . . . : US
Coded character set identifier . . . . . . . . . : 65535
Default coded character set identifier . . . . . : 37

thank you!

Re: chinese language and poi36

Posted: Thu Dec 23, 2021 8:52 am
by Scott Klement
Your prototype for String_getBytes tells RPG to convert from the Java string format (which is in Unicode) to RPG data type A, which is EBCDIC.

Your job CCSID is 65535 -- which means "hex", otherwise known as "do not translate this data". So the computer is saying to itself, "Lucio asked me to translate to EBCDIC, but his EBCDIC CCSID is 'hex', which doesn't make sense. So I will translate to the default CCSID 37 instead."

CCSID 37 does not contain Chinese characters, so it must replace them all with x'3F' (which will appear as ? on some displays).

If you want it to read Chinese successfully, you need to use character sets that contain Chinese characters!
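
The same kind of loss can be reproduced on the Java side. The following is only a minimal sketch, not the conversion RPG performs internally: US-ASCII stands in for any code page that has no Chinese characters, and the class and method names (LossyEncodeDemo, encode, fits) are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class LossyEncodeDemo {
    // Encode text into the given charset. String.getBytes(Charset) silently
    // replaces every unmappable character with the charset's substitute byte.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // True only when every character of the text exists in the charset.
    static boolean fits(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String text = "AB\u4e2d\u6587"; // "AB" followed by two Chinese characters

        // Assumption: US-ASCII is just a stand-in for a code page
        // that does not contain Chinese characters.
        for (byte b : encode(text, StandardCharsets.US_ASCII)) {
            System.out.printf("%02X ", b);
        }
        System.out.println();

        // Checking up front shows whether a target charset can hold the text.
        System.out.println(fits(text, StandardCharsets.US_ASCII));
        System.out.println(fits(text, StandardCharsets.UTF_16BE));
    }
}
```

The Latin letters survive and the Chinese characters come back as substitute bytes, which matches the symptom of Latin text being fine while the Chinese text turns into question marks.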

Re: chinese language and poi36

Posted: Thu Dec 23, 2021 10:16 am
by lucius
Thank you Scott

I also changed the CCSID of the job to
Coded character set identifier . . . . . . . . . : 937
Default coded character set identifier . . . . . : 937
but it does not work.
I still get the ?

In your opinion, is the problem related not to the way Java reads the data, but to a mismatch between the CCSID of the original Excel file and the CCSID of the client?
Thank you
ciao
Lucio

Re: chinese language and poi36

Posted: Fri Dec 24, 2021 8:32 am
by Scott Klement
The data in Java is in Unicode. When RPG calls a Java routine like String_getBytes(), it will convert that data to EBCDIC. I thought it always converted to the job CCSID -- so if your job CCSID is set correctly, it should convert correctly. It sounds like you're telling me that it is not.

Perhaps you should forget about converting it to EBCDIC and convert it to Unicode instead.

To do that, instead of String_getBytes use a call like this:

D String_getUCS2  pr         16383C   varying            
D                                     extproc(*JAVA:     
D                                     'java.lang.String':
D                                     'toCharArray')     

Then when you need to get the data, use that instead of String_getBytes:

myUnicodeVar = String_getUCS2(SSCell_getStringCellValue(cell));

Since the output is Unicode, it should support any character.
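
To see what toCharArray() hands back on the Java side, here is a small sketch. The sample text is a made-up cell value and CharArrayDemo / toUtf16 are hypothetical names; the point is only that the method returns the string's UTF-16 code units unchanged, with no code-page conversion involved.

```java
final class CharArrayDemo {
    // toCharArray() exposes the string's UTF-16 code units as-is.
    static char[] toUtf16(String s) {
        return s.toCharArray();
    }

    public static void main(String[] args) {
        String cellText = "\u4e2d\u6587"; // hypothetical cell value: two Chinese characters

        char[] units = toUtf16(cellText);
        for (char c : units) {
            System.out.printf("%04X ", (int) c); // one hex value per code unit
        }
        System.out.println();

        // Rebuilding the string from the code units loses nothing.
        System.out.println(new String(units).equals(cellText));
    }
}
```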

Re: chinese language and poi36

Posted: Mon Dec 27, 2021 9:13 am
by lucius
Thank you Scott
We are not far from the right result!

Now I get some Chinese characters, but there seems to be a sort of "translation" between standard Chinese characters and traditional Chinese characters.
At this point, maybe it is the Excel file that was created with a different Chinese character set (?).

Chinese is a difficult subject to deal with...

Thank you

ciao

Lucio

Re: chinese language and poi36

Posted: Tue Dec 28, 2021 4:38 pm
by Scott Klement
Sorry, I'm not sure what would cause that.

If possible, I would use a Java debugger (RDi can debug Java) to see what values POI is getting out of the spreadsheet before it returns them to RPG. This should tell you where the problem is occurring.
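
If stepping through a debugger is inconvenient, a code point dump of the string gives similar information. This is a minimal sketch and not part of POI; CodePointDump and dump are hypothetical names, and the sample characters are only an illustration. Simplified and traditional forms of a character are usually separate Unicode code points (for example U+56FD and U+570B for "country"), so the dump shows which form the value actually holds before it is returned to RPG.

```java
import java.util.stream.Collectors;

final class CodePointDump {
    // Render each Unicode code point of the string as U+XXXX, space separated.
    static String dump(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Illustrative values only: the same word in two written forms.
        System.out.println(dump("\u56fd")); // simplified form
        System.out.println(dump("\u570b")); // traditional form
    }
}
```

Running the same dump on the value read from the cell and comparing it with the characters expected in the database should show whether the difference is already present in the spreadsheet or appears later in the conversion.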

Re: chinese language and poi36

Posted: Fri Feb 11, 2022 11:08 am
by lucius
ciao Scott
I'm back again....

I have the same problem with Chinese characters when WRITING out an Excel spreadsheet...

I have a file on the AS/400 with the correct Chinese characters. I can see them in a subfile or with an SQL SELECT.

But when I create the xlsx, they become strange characters... certainly not Chinese characters!

Do you have any suggestions?

Thank you very, very much for your precious help.

ciao
Lucio