Encodings

Many character encodings exist, each describing how the characters of a particular alphabet are encoded as single-byte or multi-byte codes. The text pane of the editor window displays a textual representation of the document, and the same bytes can be interpreted according to any of these encodings.

Hex Editor Neo allows you to choose the editor window's encoding from a wide set of supported encodings. Because the encoding is a property of an individual editor window, you can set a different encoding for each editor window, even if several windows represent the same document.
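To illustrate why the choice of encoding matters, the following Python sketch decodes the same five bytes under four different code pages. It is independent of Hex Editor Neo: the helper name view and the codec names (cp037, cp1252, cp437, cp1251) are Python's, chosen here as assumptions that roughly correspond to the EBCDIC U.S./Canada, ANSI Latin I, OEM United States and ANSI Cyrillic entries.

```python
# The same raw bytes, viewed through several single-byte encodings.
# This is an illustration only; it does not use any Hex Editor Neo API.

def view(data: bytes, encoding: str) -> str:
    """Decode data under the given encoding, replacing undecodable bytes."""
    return data.decode(encoding, errors="replace")

raw = bytes([0xC8, 0x85, 0x93, 0x93, 0x96])

for name in ("cp037", "cp1252", "cp437", "cp1251"):
    # Only the EBCDIC code page (cp037) yields readable text for these bytes.
    print(f"{name:>7}: {view(raw, name)!r}")
```

The bytes never change; only their interpretation does, which is what happens when two editor windows show the same document with different encodings.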

Below is a full list of supported encodings.

NOTE: Support for a specific encoding depends on the installed Windows code pages and fonts. If the required components cannot be found for the selected encoding, the text “Encoding not supported” is displayed instead of the document's data. Typing in the text pane is also disabled until you select another, supported encoding.

Encoding	Encoding	Encoding
Default ANSI	Default OEM	UTF-8
ANSI - Arabic	ANSI - Baltic	ANSI - Central European
ANSI - Cyrillic	ANSI - Greek	ANSI - Hebrew
ANSI - Latin I	ANSI - Turkish	Arabic - ASMO 449+, BCON V4
Arabic - ASMO 708	Arabic - Transparent Arabic	Arabic - Transparent ASMO
ISO 2022 Japanese JIS X 0201-1989	ISO 2022 Japanese with halfwidth Katakana	ISO 2022 Japanese with no halfwidth Katakana
ISO 2022 Korean	ISO 2022 Simplified Chinese	ISO 2022 Traditional Chinese
ISO 6937 Non-Spacing Accent	ISO 8859-1 Latin I	ISO 8859-15 Latin 9
ISO 8859-2 Central Europe	ISO 8859-3 Latin 3	ISO 8859-4 Baltic
ISO 8859-5 Cyrillic	ISO 8859-6 Arabic	ISO 8859-7 Greek
ISO 8859-8 Hebrew	ISO 8859-8 Hebrew	ISO 8859-9 Latin 5
IBM EBCDIC - Arabic	IBM EBCDIC - Cyrillic (Russian)	IBM EBCDIC - Cyrillic (Serbian, Bulgarian)
IBM EBCDIC - Denmark/Norway	IBM EBCDIC - Denmark/Norway (20277 + Euro symbol)	IBM EBCDIC - Finland/Sweden
IBM EBCDIC - Finland/Sweden (20278 + Euro symbol)	IBM EBCDIC - France	IBM EBCDIC - France (20297 + Euro symbol)
IBM EBCDIC - Germany	IBM EBCDIC - Germany (20273 + Euro symbol)	IBM EBCDIC - Greek
IBM EBCDIC - Hebrew	IBM EBCDIC - Icelandic	IBM EBCDIC - Icelandic (20871 + Euro symbol)
IBM EBCDIC - International	IBM EBCDIC - International (500 + Euro symbol)	IBM EBCDIC - Italy
IBM EBCDIC - Italy (20280 + Euro symbol)	IBM EBCDIC - Japanese Katakana Extended	IBM EBCDIC - Korean Extended
IBM EBCDIC - Latin 1/Open System	IBM EBCDIC - Latin America/Spain	IBM EBCDIC - Latin America/Spain (20284 + Euro symbol)
IBM EBCDIC - Latin-1/Open System (1047 + Euro symbol)	IBM EBCDIC - Modern Greek	IBM EBCDIC - Multilingual/ROECE (Latin-2)
IBM EBCDIC - Thai	IBM EBCDIC - Turkish	IBM EBCDIC - Turkish (Latin-5)
IBM EBCDIC - U.S./Canada	IBM EBCDIC - U.S./Canada (037 + Euro symbol)	IBM EBCDIC - United Kingdom
IBM EBCDIC - United Kingdom (20285 + Euro symbol)	ISCII Assamese	ISCII Bengali
ISCII Devanagari	ISCII Gujarati	ISCII Kannada
ISCII Malayalam	ISCII Oriya	ISCII Punjabi
ISCII Tamil	ISCII Telugu	MAC - Arabic
MAC - Croatia	MAC - Cyrillic	MAC - Greek I
MAC - Hebrew	MAC - Icelandic	MAC - Japanese
MAC - Korean	MAC - Latin II	MAC - Roman
MAC - Romania	MAC - Simplified Chinese (GB 2312)	MAC - Thai
MAC - Traditional Chinese (Big5)	MAC - Turkish	MAC - Ukraine
OEM - Arabic	OEM - Baltic	OEM - Canadian-French
OEM - Cyrillic (primarily Russian)	OEM - Greek (formerly 437G)	OEM - Hebrew
OEM - Icelandic	OEM - Latin II	OEM - Modern Greek
OEM - Multilingual Latin I	OEM - Multilingual Latin I + Euro symbol	OEM - Nordic
OEM - Portuguese	OEM - Russian	OEM - Turkish
OEM - United States	Japanese (Katakana) Extended	Japanese (Latin) Extended and Japanese
JIS X 0208-1990 & 0121-1990	Korean (Johab)	Korean Extended and Korean
Simplified Chinese	Simplified Chinese (GB2312)	Simplified Chinese Extended and Simplified Chinese
Russian - KOI8-R	T.61	TCA - Taiwan
TeleText - Taiwan	Ukrainian (KOI8-U)	US/Canada and Japanese
US/Canada and Traditional Chinese	US-ASCII (7-bit)	Wang - Taiwan
CNS - Taiwan	Eten - Taiwan	EUC - Japanese
EUC - Korean	EUC - Simplified Chinese	EUC - Traditional Chinese
Europa 3	HZ-GB2312 Simplified Chinese	IA5 German (7-bit)
IA5 IRV International Alphabet No. 5 (7-bit)	IA5 Norwegian (7-bit)	IA5 Swedish (7-bit)
IBM5550 - Taiwan

Working with Encodings

The current editor window's encoding is displayed on the status bar.

The text pane displays text data according to the selected encoding. When you type new data on the keyboard (with the text pane active), the typed characters are processed according to the selected encoding.

When the editor window displays data in the Hex Words or Decimal Words view type, the UNICODE (UTF-16) encoding is selected automatically, because the text pane displays UNICODE data in these modes.
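As a rough illustration of the word views, the Python sketch below treats each 16-bit word as one UTF-16 code unit. The function name words_to_text is hypothetical, and the little-endian default is an assumption; the text above does not state which byte order the editor uses.

```python
# Interpret a byte sequence as 16-bit words holding UTF-16 text.
# Assumption: little-endian byte order by default (not stated in the manual).

def words_to_text(data: bytes, byteorder: str = "little") -> str:
    """Decode whole 16-bit words as UTF-16; a trailing odd byte is ignored."""
    codec = "utf-16-le" if byteorder == "little" else "utf-16-be"
    usable = len(data) - (len(data) % 2)  # keep only complete words
    return data[:usable].decode(codec, errors="replace")

sample = "Hi!".encode("utf-16-le")
print(sample.hex(" "))        # 48 00 69 00 21 00
print(words_to_text(sample))  # Hi!
```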

To change the current window's encoding, open the shortcut menu, select the “Encoding” item and choose an encoding from the list. The list consists of “Default ANSI”, “Default OEM”, the 5 most recently used encodings and the “Other” item. Selecting the “Other” item opens the full list of supported encodings.

UTF-8 Support

UTF-8 is the first (and only, for now) multi-byte encoding supported by the editor.

The editor provides full support for the UTF-8 encoding. It not only displays text encoded in UTF-8, but also allows you to type new data in this encoding. As you type, the entered characters are converted on the fly, and a single entered character may occupy up to 4 bytes.
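The variable width of UTF-8 can be checked with a short Python sketch. The helper utf8_width and the sample characters are illustrative choices, not part of the editor.

```python
# How many bytes a single typed character occupies in UTF-8.

def utf8_width(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))

samples = [
    "A",           # U+0041, Latin capital letter A
    "\u00e9",      # U+00E9, e with acute accent
    "\u20ac",      # U+20AC, euro sign
    "\U0001F600",  # U+1F600, an emoji outside the Basic Multilingual Plane
]

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {encoded.hex(' ').upper()} ({utf8_width(ch)} byte(s))")
```

The four samples occupy 1, 2, 3 and 4 bytes respectively, which is the full range mentioned above.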

When a character occupies several bytes, a space character is displayed in all but the last cell (in Text View). The last cell displays the character itself.

The UTF-8 encoding defines strict rules for encoding UNICODE characters into one, two, three or four bytes. If these rules are broken and Hex Editor Neo cannot decode a character, it displays the ‘?’ character in each cell that contains invalid data.
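The cell layout described above (spaces in all but the last cell of a multi-byte character, and ‘?’ for invalid data) can be sketched in Python as follows. This is only a model of the documented behavior, not the editor's actual algorithm; the names expected_length and render_cells are hypothetical, and marking invalid data one byte at a time is an assumption.

```python
# Model of a UTF-8 text pane: one output cell per input byte.

def expected_length(lead: int) -> int:
    """Sequence length implied by a UTF-8 lead byte, or 0 if it cannot start one."""
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0  # continuation byte or a byte that never appears in UTF-8

def render_cells(data: bytes) -> list:
    """Return one display cell per byte of data."""
    cells = []
    i = 0
    while i < len(data):
        n = expected_length(data[i])
        chunk = data[i:i + n]
        try:
            if n == 0 or len(chunk) < n:
                raise ValueError("invalid lead byte or truncated sequence")
            ch = chunk.decode("utf-8")  # strict decoding rejects malformed sequences
        except ValueError:              # UnicodeDecodeError is a ValueError
            cells.append("?")           # mark this byte as invalid and move on
            i += 1
            continue
        cells.extend([" "] * (n - 1) + [ch])  # blanks, then the character itself
        i += n
    return cells

print(render_cells("a\u20ac".encode("utf-8")))  # ['a', ' ', ' ', '€']
print(render_cells(b"\xe2\x82"))                # ['?', '?']
```

The euro sign occupies three bytes, so it is shown in the third of its three cells; the truncated sequence in the second call yields one ‘?’ per byte.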

All editor features, such as Find and Fill, respect the current editor window's encoding and are therefore capable of working with UTF-8 as well.
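One way to picture an encoding-aware search is to encode the search text with the window's encoding and then look for the resulting bytes. The Python sketch below does exactly that; it is an assumption about the general technique, not a description of how Hex Editor Neo implements Find, and find_text is a hypothetical name.

```python
# Encoding-aware text search over raw bytes.

def find_text(data: bytes, text: str, encoding: str) -> int:
    """Return the byte offset of text in data under the given encoding, or -1."""
    try:
        pattern = text.encode(encoding)
    except UnicodeEncodeError:
        return -1  # the text cannot be represented in this encoding
    return data.find(pattern)

# "cafe" with an accented e, stored first as UTF-8 and then as Windows-1252.
document = b"caf\xc3\xa9 caf\xe9"

print(find_text(document, "\u00e9", "utf-8"))   # 3
print(find_text(document, "\u00e9", "cp1252"))  # 9
```

The same search text matches at different offsets depending on the encoding, because it corresponds to different byte patterns.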

Scripting

All encodings are available to scripts as the Encodings enumeration. Use the IDocumentView.encoding property to query or set the editor window's encoding.