What International Encodings Are Supported by Xerces-J?

Answer»

In general, the parser supports every IANA encoding name and alias (see http://www.iana.org/assignments/character-sets) that has a clear mapping to a Java encoding.
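Because support depends on a mapping to a Java encoding, a rough first check is whether the JVM itself recognizes a given name. The sketch below uses the standard `java.nio.charset.Charset.isSupported` call; the class and method names (`EncodingCheck`, `isKnownToJvm`) are made up for illustration. Note the assumption: this only asks the JVM, not the parser's own IANA-to-Java mapping, so a name the JVM accepts is not guaranteed to be accepted in an XML declaration, and vice versa.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper: checks JVM charset support only, not the parser's own table.
class EncodingCheck {

    /** Returns true if this JVM can resolve the given encoding name or alias. */
    static boolean isKnownToJvm(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a charset name.
            return false;
        }
    }

    public static void main(String[] args) {
        // A few names taken from the list of common encodings.
        String[] names = {"UTF-8", "ISO-8859-2", "euc-jp", "koi8-r", "cp1252"};
        for (String name : names) {
            System.out.println(name + " -> " + isKnownToJvm(name));
        }
    }
}
```

The results for the less common names can vary between Java runtimes, since some charsets ship only with full (non-minimal) installations.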

Some of the more common encodings are:

UTF-8
UTF-16 Big Endian, UTF-16 Little Endian
IBM-1208
ISO Latin-1 (ISO-8859-1)
ISO Latin-2 (ISO-8859-2) [Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian (in Latin transcription), Serbo-Croatian, Slovak, Slovenian, Upper and Lower Sorbian]
ISO Latin-3 (ISO-8859-3) [Maltese, Esperanto]
ISO Latin-4 (ISO-8859-4)
ISO Latin Cyrillic (ISO-8859-5)
ISO Latin Arabic (ISO-8859-6)
ISO Latin Greek (ISO-8859-7)
ISO Latin Hebrew (ISO-8859-8)
ISO Latin-5 (ISO-8859-9) [Turkish]
Extended Unix Code, packed for Japanese (euc-jp, eucjis)
Japanese Shift JIS (shift-jis)
Chinese (big5)
Chinese for PRC (mixed 1/2 byte) (gb2312)
Japanese ISO-2022-JP (iso-2022-jp)
Cyrillic (koi8-r)
Extended Unix Code, packed for Korean (euc-kr)
Russian Unix, Cyrillic (koi8-r)
Windows Thai (cp874)
Latin 1 Windows (cp1252)
cp858
EBCDIC encodings:
o EBCDIC US (ebcdic-cp-us)
o EBCDIC Canada (ebcdic-cp-ca)
o EBCDIC Netherlands (ebcdic-cp-nl)
o EBCDIC Denmark (ebcdic-cp-dk)
o EBCDIC Norway (ebcdic-cp-no)
o EBCDIC Finland (ebcdic-cp-fi)
o EBCDIC Sweden (ebcdic-cp-se)
o EBCDIC Italy (ebcdic-cp-it)
o EBCDIC Spain, Latin America (ebcdic-cp-es)
o EBCDIC Great Britain (ebcdic-cp-gb)
o EBCDIC France (ebcdic-cp-fr)
o EBCDIC Hebrew (ebcdic-cp-he)
o EBCDIC Switzerland (ebcdic-cp-ch)
o EBCDIC Roece (ebcdic-cp-roece)
o EBCDIC Yugoslavia (ebcdic-cp-yu)
o EBCDIC Iceland (ebcdic-cp-is)
o EBCDIC Urdu (ebcdic-cp-ar2)
o Latin 0 EBCDIC
o EBCDIC Arabic (ebcdic-cp-ar1)
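
As an illustration of how one of these names is used in practice, the sketch below parses a small document whose XML declaration names ISO-8859-1. It goes through the standard JAXP API (`DocumentBuilderFactory`); which parser that returns depends on the setup, and the assumption here is that it is either Xerces-J on the classpath or the JDK's bundled parser. The class and method names (`DeclaredEncodingDemo`, `parseRootText`) are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo: the parser picks the decoder from the XML declaration.
class DeclaredEncodingDemo {

    /** Parses raw XML bytes and returns the text content of the root element. */
    static String parseRootText(byte[] xmlBytes) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Passing a byte stream (not a Reader) lets the parser read the
        // encoding declaration and decode the bytes itself.
        Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><word>caf\u00e9</word>";
        // Encode with the same charset that the declaration names.
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(parseRootText(bytes));
    }
}
```

The bytes must actually be written in the encoding the declaration names; if the two disagree, the parser may report an error or produce wrong characters.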
