Many character encodings exist, each describing how the characters of a specific alphabet are encoded as single-byte or multi-byte codes. The editor window's text pane displays the textual representation of the document, and the same bytes can be interpreted according to one encoding or another.
Hex Editor Neo allows you to choose the editor window's encoding from a wide set of supported encodings. Because the encoding is a property of an individual editor window, you can set a different encoding for each editor window, even if several windows represent the same document.
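To illustrate why the choice of encoding matters, here is a minimal Python sketch (not part of Hex Editor Neo) that decodes the same four bytes under three encodings. The codec names `cp1252`, `cp1251` and `cp437` are Python's names for Windows code pages 1252, 1251 and 437; treating them as equivalents of the "ANSI - Latin I", "ANSI - Cyrillic" and "OEM - United States" entries in the list is an assumption made for illustration. The helper name `decode_with` is hypothetical.

```python
from typing import Dict, Iterable


def decode_with(data: bytes, codec_names: Iterable[str]) -> Dict[str, str]:
    """Decode the same byte string under each of the given codecs."""
    return {name: data.decode(name) for name in codec_names}


if __name__ == "__main__":
    # Four bytes that map to different characters in each code page.
    sample = bytes([0xC0, 0xC1, 0xC2, 0xC3])
    for name, text in decode_with(sample, ("cp1252", "cp1251", "cp437")).items():
        print(name, text)
```

The bytes never change; only their interpretation does, which is why two editor windows on the same document can show different text.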
Below is a full list of supported encodings.
NOTE: Support for a specific encoding depends on the installed Windows code pages and fonts. If the components required for a selected encoding cannot be found, the “Encoding not supported” text is displayed instead of the document's data. Typing in the text pane is also disabled until you select another, supported encoding.
Encoding | Encoding | Encoding |
---|---|---|
Default ANSI | Default OEM | UTF-8 |
ANSI - Arabic | ANSI - Baltic | ANSI - Central European |
ANSI - Cyrillic | ANSI - Greek | ANSI - Hebrew |
ANSI - Latin I | ANSI - Turkish | Arabic - ASMO 449+, BCON V4 |
Arabic - ASMO 708 | Arabic - Transparent Arabic | Arabic - Transparent ASMO |
ISO 2022 Japanese JIS X 0201-1989 | ISO 2022 Japanese with halfwidth Katakana | ISO 2022 Japanese with no halfwidth Katakana |
ISO 2022 Korean | ISO 2022 Simplified Chinese | ISO 2022 Traditional Chinese |
ISO 6937 Non-Spacing Accent | ISO 8859-1 Latin I | ISO 8859-15 Latin 9 |
ISO 8859-2 Central Europe | ISO 8859-3 Latin 3 | ISO 8859-4 Baltic |
ISO 8859-5 Cyrillic | ISO 8859-6 Arabic | ISO 8859-7 Greek |
ISO 8859-8 Hebrew | ISO 8859-8 Hebrew | ISO 8859-9 Latin 5 |
IBM EBCDIC - Arabic | IBM EBCDIC - Cyrillic (Russian) | IBM EBCDIC - Cyrillic (Serbian, Bulgarian) |
IBM EBCDIC - Denmark/Norway | IBM EBCDIC - Denmark/Norway (20277 + Euro symbol) | IBM EBCDIC - Finland/Sweden |
IBM EBCDIC - Finland/Sweden (20278 + Euro symbol) | IBM EBCDIC - France | IBM EBCDIC - France (20297 + Euro symbol) |
IBM EBCDIC - Germany | IBM EBCDIC - Germany (20273 + Euro symbol) | IBM EBCDIC - Greek |
IBM EBCDIC - Hebrew | IBM EBCDIC - Icelandic | IBM EBCDIC - Icelandic (20871 + Euro symbol) |
IBM EBCDIC - International | IBM EBCDIC - International (500 + Euro symbol) | IBM EBCDIC - Italy |
IBM EBCDIC - Italy (20280 + Euro symbol) | IBM EBCDIC - Japanese Katakana Extended | IBM EBCDIC - Korean Extended |
IBM EBCDIC - Latin 1/Open System | IBM EBCDIC - Latin America/Spain | IBM EBCDIC - Latin America/Spain (20284 + Euro symbol) |
IBM EBCDIC - Latin-1/Open System (1047 + Euro symbol) | IBM EBCDIC - Modern Greek | IBM EBCDIC - Multilingual/ROECE (Latin-2) |
IBM EBCDIC - Thai | IBM EBCDIC - Turkish | IBM EBCDIC - Turkish (Latin-5) |
IBM EBCDIC - U.S./Canada | IBM EBCDIC - U.S./Canada (037 + Euro symbol) | IBM EBCDIC - United Kingdom |
IBM EBCDIC - United Kingdom (20285 + Euro symbol) | ISCII Assamese | ISCII Bengali |
ISCII Devanagari | ISCII Gujarati | ISCII Kannada |
ISCII Malayalam | ISCII Oriya | ISCII Punjabi |
ISCII Tamil | ISCII Telugu | MAC - Arabic |
MAC - Croatia | MAC - Cyrillic | MAC - Greek I |
MAC - Hebrew | MAC - Icelandic | MAC - Japanese |
MAC - Korean | MAC - Latin II | MAC - Roman |
MAC - Romania | MAC - Simplified Chinese (GB 2312) | MAC - Thai |
MAC - Traditional Chinese (Big5) | MAC - Turkish | MAC - Ukraine |
OEM - Arabic | OEM - Baltic | OEM - Canadian-French |
OEM - Cyrillic (primarily Russian) | OEM - Greek (formerly 437G) | OEM - Hebrew |
OEM - Icelandic | OEM - Latin II | OEM - Modern Greek |
OEM - Multilingual Latin I | OEM - Multilingual Latin I + Euro symbol | OEM - Nordic |
OEM - Portuguese | OEM - Russian | OEM - Turkish |
OEM - United States | Japanese (Katakana) Extended | Japanese (Latin) Extended and Japanese |
JIS X 0208-1990 & 0121-1990 | Korean (Johab) | Korean Extended and Korean |
Simplified Chinese | Simplified Chinese (GB2312) | Simplified Chinese Extended and Simplified Chinese |
Russian - KOI8-R | T.61 | TCA - Taiwan |
TeleText - Taiwan | Ukrainian (KOI8-U) | US/Canada and Japanese |
US/Canada and Traditional Chinese | US-ASCII (7-bit) | Wang - Taiwan |
CNS - Taiwan | Eten - Taiwan | EUC - Japanese |
EUC - Korean | EUC - Simplified Chinese | EUC - Traditional Chinese |
Europa 3 | HZ-GB2312 Simplified Chinese | IA5 German (7-bit) |
IA5 IRV International Alphabet No. 5 (7-bit) | IA5 Norwegian (7-bit) | IA5 Swedish (7-bit) |
IBM5550 - Taiwan |
The current editor window's encoding is displayed on the status bar.
The text pane displays text data according to the selected encoding. When you type new data on the keyboard (with the text pane active), the typed characters are processed according to the selected encoding.
When the editor window displays data in the Hex Words or Decimal Words view type, the UNICODE (UTF-16) encoding is selected automatically, because the text pane displays UNICODE data in these modes.
To change the current window's encoding, open the shortcut menu, select the “Encoding” item and choose an encoding from the list. The list consists of “Default ANSI”, “Default OEM”, the five most recently used encodings and the “Other” item. Selecting the “Other” item opens the full list of supported encodings.
UTF-8 is the first (and only, for now) multi-byte encoding supported by the editor.
The editor provides full support for the UTF-8 encoding. It not only displays text encoded in UTF-8, but also allows you to type new data in this encoding. When you type, the entered characters are converted on the fly, and a single entered character may occupy up to 4 bytes.
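As a rough illustration of how many bytes a typed character can occupy, the following Python sketch encodes a few characters to UTF-8 and reports the byte count. It uses only Python's built-in `str.encode`; the helper name `utf8_bytes` is hypothetical and has nothing to do with the editor's internals.

```python
def utf8_bytes(ch: str) -> str:
    """Return the UTF-8 encoding of one character as spaced hex digits."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))


if __name__ == "__main__":
    # Latin letter, accented letter, Euro sign, emoji: 1, 2, 3 and 4 bytes.
    for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
        encoded = utf8_bytes(ch)
        print(f"U+{ord(ch):04X}: {encoded} ({len(encoded.split())} bytes)")
```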
When a character occupies several bytes, a space character is displayed in all but the last cell (in Text View). The last cell displays the character itself.
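The cell layout described above can be sketched in Python as follows. This is an approximation for illustration only, assuming one text cell per byte of the document; `text_cells` is a hypothetical helper, not an editor function.

```python
from typing import List


def text_cells(text: str) -> List[str]:
    """Lay out text one cell per UTF-8 byte: spaces, then the character."""
    cells: List[str] = []
    for ch in text:
        width = len(ch.encode("utf-8"))
        # Every cell but the last one shows a space.
        cells.extend([" "] * (width - 1))
        # The last cell shows the character itself.
        cells.append(ch)
    return cells


if __name__ == "__main__":
    # The Euro sign takes three bytes, so it is preceded by two space cells.
    print(text_cells("a\u20acb"))
```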
The UTF-8 encoding defines strict rules for encoding UNICODE characters into one, two, three or four bytes. If these rules are broken and Hex Editor Neo cannot decode a character, it displays the ‘?’ character in each cell that contains invalid data.
All editor features, such as Find and Fill, work with the current editor window's encoding and are therefore capable of working with UTF-8 as well.
All encodings are available to scripts as the Encodings enumeration. Use the IDocumentView.encoding property to query or set the editor window's encoding.
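The handling of invalid data can be approximated with the Python sketch below. It relies on Python's strict built-in UTF-8 decoder to decide what is valid, and it assumes that each undecodable byte becomes its own ‘?’ cell; how Hex Editor Neo actually groups invalid bytes is not specified here. `utf8_cells` is a hypothetical name.

```python
from typing import List


def utf8_cells(data: bytes) -> List[str]:
    """Return one display cell per byte, marking undecodable bytes with '?'."""
    cells: List[str] = []
    i = 0
    while i < len(data):
        # Try the shortest sequence first: 1, 2, 3, then 4 bytes.
        for width in (1, 2, 3, 4):
            try:
                ch = data[i:i + width].decode("utf-8")
            except UnicodeDecodeError:
                continue
            # Valid character: spaces in all but the last cell.
            cells.extend([" "] * (width - 1) + [ch])
            i += width
            break
        else:
            # No valid sequence starts at this byte.
            cells.append("?")
            i += 1
    return cells


if __name__ == "__main__":
    print(utf8_cells(b"A\xe2\x82\xacB"))  # valid three-byte Euro sign
    print(utf8_cells(b"\xe2\x82"))        # truncated sequence
```

Because shorter widths are tried first, any successful decode is exactly one character, so a cell never holds more than one.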