| 1. |
Why Does My File Encoding On Output Not Match The Encoding On Input? |
|
Answer» The default character encoding used by XMLOutputter is UTF-8, a variable-length encoding that can represent all Unicode characters. This can be changed with a call to format.setEncoding() on the Format object passed to XMLOutputter. It would be nice if XMLOutputter could default to the original encoding of a file, but unfortunately parsers don't indicate the original encoding, so you have to set it programmatically. This issue most often affects people with documents in the common ISO-8859-1 (Latin-1) encoding who use characters like ñ but aren't used to thinking about encodings. The tip to remember is that with these documents you must set the output encoding to ISO-8859-1; otherwise characters in the range 128-255 will be written as two-byte UTF-8 sequences instead of the single bytes that ISO-8859-1 uses. |
|
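The size difference described in the answer can be checked with a minimal sketch that uses only the Java standard library. The class name `EncodingDemo` and the helper `byteLength` are hypothetical names chosen for illustration; they are not part of JDOM.

```java
import java.nio.charset.Charset;

// Hypothetical demo class (not part of JDOM): compares how many bytes
// a character in the 128-255 range needs in ISO-8859-1 versus UTF-8.
class EncodingDemo {

    // Returns the number of bytes the text occupies in the named charset.
    static int byteLength(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        // \u00F1 is the letter n with tilde (code point 241), written as an
        // escape so the demo does not depend on the source file's encoding.
        String text = "\u00F1";
        System.out.println("ISO-8859-1: " + byteLength(text, "ISO-8859-1") + " byte(s)");
        System.out.println("UTF-8: " + byteLength(text, "UTF-8") + " byte(s)");
    }
}
```

Running it should report one byte for ISO-8859-1 and two bytes for UTF-8. That is the mismatch people see when a Latin-1 document is written back out with the default encoding. In JDOM itself the usual fix is to call setEncoding("ISO-8859-1") on the Format before handing it to XMLOutputter, assuming the input really was Latin-1.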