ISO/IEC 2022 Character Sets

Character encodings that use the ISO/IEC 2022 mechanism include:

  • ISO-2022-JP. A widely used encoding for Japanese. It starts in ASCII and includes the following escape sequences:
    • ESC ( B to switch to ASCII (1 byte per character)
    • ESC ( J to switch to JIS X 0201-1976 (ISO/IEC 646:JP) Roman set (1 byte per character)
    • ESC $ @ to switch to JIS X 0208-1978 (2 bytes per character)
    • ESC $ B to switch to JIS X 0208-1983 (2 bytes per character)
  • ISO-2022-JP-1. The same as ISO-2022-JP, with one additional escape sequence:
    • ESC $ ( D to switch to JIS X 0212-1990 (2 bytes per character)
  • ISO-2022-JP-2. A multilingual extension of ISO-2022-JP. The same as ISO-2022-JP-1, with the following additional escape sequences:
    • ESC $ A to switch to GB 2312-1980 (2 bytes per character)
    • ESC $ ( C to switch to KS X 1001-1992 (2 bytes per character)
    • ESC . A to switch to ISO/IEC 8859-1 high part, Extended Latin 1 set (1 byte per character)
    • ESC . F to switch to ISO/IEC 8859-7 high part, Basic Greek set (1 byte per character)
  • ISO-2022-JP-3. The same as ISO-2022-JP, with three additional escape sequences:
    • ESC ( I to switch to JIS X 0201-1976 Kana set (1 byte per character)
    • ESC $ ( O to switch to JIS X 0213-2000 Plane 1 (2 bytes per character)
    • ESC $ ( P to switch to JIS X 0213-2000 Plane 2 (2 bytes per character)
  • ISO-2022-JP-2004. The same as ISO-2022-JP-3, with one additional escape sequence:
    • ESC $ ( Q to switch to JIS X 0213-2004 Plane 1 (2 bytes per character)
  • ISO-2022-KR. An encoding for Korean.
    • ESC $ ) C to switch to KS X 1001-1992, previously named KS C 5601-1987 (2 bytes per character)
  • ISO-2022-CN. An encoding for Chinese.
    • ESC $ ) A to switch to GB 2312-1980 (2 bytes per character)
    • ESC $ ) G to switch to CNS 11643-1992 Plane 1 (2 bytes per character)
    • ESC $ * H to switch to CNS 11643-1992 Plane 2 (2 bytes per character)
  • ISO-2022-CN-EXT. The same as ISO-2022-CN, with six additional escape sequences:
    • ESC $ ) E to switch to ISO-IR-165 (2 bytes per character)
    • ESC $ + I to switch to CNS 11643-1992 Plane 3 (2 bytes per character)
    • ESC $ + J to switch to CNS 11643-1992 Plane 4 (2 bytes per character)
    • ESC $ + K to switch to CNS 11643-1992 Plane 5 (2 bytes per character)
    • ESC $ + L to switch to CNS 11643-1992 Plane 6 (2 bytes per character)
    • ESC $ + M to switch to CNS 11643-1992 Plane 7 (2 bytes per character)
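
As a rough illustration of how these escape sequences appear in practice, the sketch below encodes a mixed ASCII/Japanese string with Python's built-in `iso2022_jp` codec and then lists the escape sequences found in the result. The `ESCAPES` table and the `find_escapes` helper are hypothetical names introduced here for illustration, and the table covers only the four base ISO-2022-JP sequences listed above.

```python
import re

# The four ISO-2022-JP escape sequences from the list above.
# (Illustrative table; the extended variants are not covered.)
ESCAPES = {
    b"\x1b(B": "ASCII",
    b"\x1b(J": "JIS X 0201-1976 Roman",
    b"\x1b$@": "JIS X 0208-1978",
    b"\x1b$B": "JIS X 0208-1983",
}

# Matches ESC ( B, ESC ( J, ESC $ @ and ESC $ B.
_ESCAPE_RE = re.compile(rb"\x1b(?:\([BJ]|\$[@B])")


def find_escapes(data: bytes) -> list[str]:
    """Return the names of the character sets switched to, in order."""
    return [ESCAPES[m.group()] for m in _ESCAPE_RE.finditer(data)]


text = "ABC 日本語"
encoded = text.encode("iso2022_jp")

# The stream starts in ASCII, so "ABC " is emitted with no escape sequence.
print(encoded[:4])
# The codec switches to a two-byte set for the kanji, then back to ASCII.
print(find_escapes(encoded))
# Decoding reverses the process.
print(encoded.decode("iso2022_jp") == text)
```

Because the encoding is stateful, a decoder has to track the most recent escape sequence to know whether the next bytes are one-byte or two-byte characters.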

The character after ESC (for single-byte character sets) or ESC $ (for multi-byte character sets) specifies both the type of character set and the working set to which it is designated. In the examples above, the character ( (0x28) designates a 94-character set to the G0 working set. It may be replaced by ), * or + (0x29–0x2B) to designate to the G1, G2 or G3 working set respectively.

Two of the sets above, the ISO/IEC 8859-1 and ISO/IEC 8859-7 high parts, are 96-character sets. For these, the character - (0x2D) designates to the G1 set, while . or / (0x2E or 0x2F) designates to the G2 or G3 set; the ISO-2022-JP-2 sequences above use . and therefore designate to G2. A 96-character set may not be designated to the G0 set.

There are three special cases for multi-byte codes. The sequences ESC $ @, ESC $ A, and ESC $ B were all registered before the ISO/IEC 2022 standard was finalized, so they must be accepted as synonyms for ESC $ ( @, ESC $ ( A, and ESC $ ( B, which designate to the G0 set. The longer form may also be used, and its ( may be changed to ), * or + to designate to the G1 through G3 sets.
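
The designation rules described above can be sketched as a small parser. This is a minimal illustration rather than a complete ISO/IEC 2022 implementation: the function name `parse_designation` and its return shape are assumptions made for this example, and it handles only sequences of the forms discussed in this section (one intermediate byte plus a final byte, and the three short multi-byte forms).

```python
ESC = 0x1B

# Intermediate byte -> (working set, characters per set), as described above.
# 0x28-0x2B designate 94-character sets; 0x2D-0x2F designate 96-character sets.
_INTERMEDIATES = {
    0x28: ("G0", 94), 0x29: ("G1", 94), 0x2A: ("G2", 94), 0x2B: ("G3", 94),
    0x2D: ("G1", 96), 0x2E: ("G2", 96), 0x2F: ("G3", 96),
}


def parse_designation(seq: bytes) -> tuple[str, int, bool, str]:
    """Parse one designation sequence.

    Returns (working_set, set_size, is_multibyte, final_character).
    Raises ValueError for anything this sketch does not recognise.
    """
    if len(seq) < 3 or seq[0] != ESC:
        raise ValueError("not a designation escape sequence")
    body = seq[1:]
    multibyte = body[0] == 0x24  # '$' marks a multi-byte set
    if multibyte:
        body = body[1:]
    # Special case: ESC $ @, ESC $ A and ESC $ B are synonyms for
    # ESC $ ( @, ESC $ ( A and ESC $ ( B, so they designate to G0.
    if multibyte and body in (b"@", b"A", b"B"):
        return ("G0", 94, True, chr(body[0]))
    if len(body) != 2 or body[0] not in _INTERMEDIATES:
        raise ValueError("unsupported designation sequence")
    working_set, size = _INTERMEDIATES[body[0]]
    return (working_set, size, multibyte, chr(body[1]))


print(parse_designation(b"\x1b(B"))    # ASCII, single-byte, to G0
print(parse_designation(b"\x1b$B"))    # short form, treated as ESC $ ( B
print(parse_designation(b"\x1b$)C"))   # KS X 1001, multi-byte, to G1
print(parse_designation(b"\x1b.A"))    # ISO/IEC 8859-1 high part, to G2
```

Keeping the intermediate bytes in a lookup table makes the G0 restriction visible: no table entry maps a 96-character set to G0.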

The standard also defines a way to specify coding systems that do not follow its own structure. Of particular interest, the sequence ESC % G designates the UTF-8 coding system, which does not reserve the range 0x80–0x9F for control characters.
