Opening and Saving Documents with Unsupported Characters
When loading a document, Oxygen XML Author reads the document prolog to determine the declared encoding. That encoding is then passed to the Java encoder, which loads support for the corresponding code chart and uses it when saving the document. When the encoding cannot be determined, Oxygen XML Author displays the Available Java Encodings dialog box, which lists all encodings supported by the Java platform.
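As a rough illustration of reading the encoding from the prolog, the following Python sketch extracts the encoding name from an XML declaration. It is not how Oxygen XML Author implements the step; the helper name declared_encoding is hypothetical, and the sketch assumes the file starts in an ASCII-compatible encoding (it does not handle UTF-16 input).

```python
import re

# Matches an XML declaration at the start of the data and captures
# the value of its encoding pseudo-attribute.
_DECLARATION = re.compile(
    rb'^<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(data: bytes):
    """Return the encoding named in the XML declaration, or None.

    Hypothetical helper for illustration only. Assumes the leading
    bytes are ASCII-compatible, so UTF-16 documents are not handled.
    """
    # Skip a UTF-8 byte order mark if one is present.
    if data.startswith(b'\xef\xbb\xbf'):
        data = data[3:]
    match = _DECLARATION.match(data)
    return match.group(1).decode('ascii') if match else None


print(declared_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><doc/>'))
print(declared_encoding(b'<doc/>'))
```

A result of None corresponds to the case where the encoding cannot be determined and the user has to choose one.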
If the opened document contains an unsupported character, Oxygen XML Author applies the policy specified for handling such errors. If the policy is set to REPORT, Oxygen XML Author displays an error dialog box with a message about the character not allowed by the encoding. If the policy is set to IGNORE, the character is removed from the document displayed in the editor panel. If the policy is set to REPLACE, the character is replaced with a standard replacement character for that encoding.
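The three policies can be approximated in Python with the standard codec error handlers. This is an analogy only: the mapping of REPORT, IGNORE, and REPLACE to strict, ignore, and replace is an assumption made for illustration, and decode_with_policy is a hypothetical helper, not part of Oxygen XML Author.

```python
# Assumed mapping from the policy names in the text to Python's
# built-in codec error handlers.
_HANDLERS = {'REPORT': 'strict', 'IGNORE': 'ignore', 'REPLACE': 'replace'}


def decode_with_policy(raw: bytes, encoding: str, policy: str) -> str:
    """Decode raw bytes, handling invalid input according to policy."""
    return raw.decode(encoding, errors=_HANDLERS[policy])


# 0xE9 is a valid character in ISO-8859-1 but is not valid UTF-8
# when followed by a space.
data = b'caf\xe9 au lait'

try:
    decode_with_policy(data, 'utf-8', 'REPORT')
except UnicodeDecodeError as error:
    print('REPORT :', error.reason)

print('IGNORE :', decode_with_policy(data, 'utf-8', 'IGNORE'))
print('REPLACE:', decode_with_policy(data, 'utf-8', 'REPLACE'))
```

With IGNORE the offending byte disappears from the decoded text, while REPLACE substitutes U+FFFD, the Unicode replacement character.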
In most cases the encoding is UTF-8. To save a file in a different encoding, change the encoding name; the application then saves the file using the new encoding.
When saving a document edited in the Text, Grid, or Design modes, if it contains characters not included in the encoding declared in the document prolog, Oxygen XML Author detects the problem and signals it to the user. The user is responsible for resolving the conflict before saving the document.
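A check of this kind can be sketched in Python by trying to encode each character with the declared encoding and collecting the ones that fail. The function name unencodable_chars is hypothetical, and the sketch only illustrates the idea of detecting the conflict; it says nothing about how the application performs the check.

```python
def unencodable_chars(text: str, encoding: str):
    """Return (index, character) pairs that encoding cannot represent."""
    problems = []
    for index, char in enumerate(text):
        try:
            char.encode(encoding)
        except UnicodeEncodeError:
            problems.append((index, char))
    return problems


# The euro sign is not part of ISO-8859-1.
for index, char in unencodable_chars('price: 5€', 'iso-8859-1'):
    print(f'offset {index}: U+{ord(char):04X} cannot be saved')
```

An empty result means the text can be saved in the declared encoding as it stands.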
When saving a document edited in the Author mode, all characters that fall outside the detected encoding will be automatically converted to hexadecimal character entities.
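The conversion can be pictured with the following Python sketch, which replaces every character the target encoding cannot represent with a hexadecimal reference of the form &#xNNNN;. The helper to_hex_char_refs is hypothetical and only approximates the behavior described above; the exact output format used by the application (for example, letter case or zero padding) is an assumption.

```python
def to_hex_char_refs(text: str, encoding: str) -> str:
    """Replace characters outside encoding with hexadecimal references."""
    pieces = []
    for char in text:
        try:
            char.encode(encoding)
            pieces.append(char)
        except UnicodeEncodeError:
            # The character has no representation in the encoding:
            # emit its code point as a hexadecimal reference instead.
            pieces.append('&#x{:X};'.format(ord(char)))
    return ''.join(pieces)


print(to_hex_char_refs('price: 5€', 'iso-8859-1'))
```

Python's built-in xmlcharrefreplace error handler does something similar but writes decimal references, which is why the loop is spelled out here.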
To edit documents written in Japanese or Chinese, change the font to one that supports the specific characters (a Unicode font). For the Windows platform, Arial Unicode MS or MS Gothic is recommended. Do not expect WordPad or Notepad to handle these encodings. Use applications such as Internet Explorer or Word to examine XML documents.
When a document with a UTF-16 encoding is edited and saved in Oxygen XML Author, the saved document has a byte order mark (BOM) that specifies the byte order of the document content. The default byte order is platform-dependent. This means that a UTF-16 document created on a Windows platform (where the default byte order is UnicodeLittle) has a different BOM than a UTF-16 document created on a Mac OS platform (where the default byte order is UnicodeBig). The byte order and the BOM of an existing document are preserved when the document is edited and saved. This behavior can be changed in Oxygen XML Author from the Encoding preferences panel.
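To make the difference between the two byte orders concrete, the following Python sketch writes the same text as little-endian and big-endian UTF-16 with an explicit BOM, then reads the BOM back. Both function names are hypothetical; the BOM constants come from Python's standard codecs module.

```python
import codecs


def encode_utf16_with_bom(text: str, little_endian: bool) -> bytes:
    """Encode text as UTF-16 with an explicit BOM in the chosen byte order."""
    if little_endian:
        return codecs.BOM_UTF16_LE + text.encode('utf-16-le')
    return codecs.BOM_UTF16_BE + text.encode('utf-16-be')


def detect_utf16_byte_order(data: bytes):
    """Return 'little', 'big', or None based on a leading UTF-16 BOM."""
    if data.startswith(codecs.BOM_UTF16_LE):
        return 'little'
    if data.startswith(codecs.BOM_UTF16_BE):
        return 'big'
    return None


little = encode_utf16_with_bom('A', little_endian=True)
big = encode_utf16_with_bom('A', little_endian=False)
print(little.hex(), detect_utf16_byte_order(little))
print(big.hex(), detect_utf16_byte_order(big))
```

The little-endian file starts with the bytes FF FE and the big-endian file with FE FF, so a reader can recover the byte order from the first two bytes and keep it when saving.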