Character (computing)

In the C and C++ programming languages, a char is a data type whose size is exactly one byte.

It was specified to be large enough to store a character value from ASCII or other encodings. By far the most common size is 8 bits, allowing 256 different values to be stored in a char. This is twice the number of values that 7-bit ASCII defines, which allowed a large number of 8-bit extended character sets to be used, but made it impossible to store characters from Unicode and other modern sets directly in a single char. Instead, larger storage units such as wchar_t, or variable-length encodings such as UTF-8, are used. The types char16_t and char32_t were added to the C++ language (in C++11) to store 16-bit and 32-bit code units.
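The arithmetic above can be checked with a short C sketch. The function names char_value_count and utf8_encoded_length are hypothetical, chosen only for this illustration. The first derives the number of values a char can hold from UCHAR_MAX in <limits.h>. The second reports how many char-sized code units UTF-8 needs for a given Unicode code point.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* sizeof(char) is 1 by definition; CHAR_BIT is at least 8. */
static_assert(sizeof(char) == 1, "a char is exactly one byte");
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

/* Hypothetical helper: number of distinct values one char can hold,
   i.e. 2^CHAR_BIT (256 on the common 8-bit-byte platforms). */
unsigned long char_value_count(void)
{
    return (unsigned long)UCHAR_MAX + 1UL;
}

/* Hypothetical helper: number of bytes UTF-8 uses for one code point.
   Returns 0 for surrogates and for values above U+10FFFF, which are
   not valid Unicode scalar values. */
size_t utf8_encoded_length(unsigned long code_point)
{
    if (code_point < 0x80UL)
        return 1;                       /* the ASCII range */
    if (code_point < 0x800UL)
        return 2;
    if (code_point >= 0xD800UL && code_point <= 0xDFFFUL)
        return 0;                       /* UTF-16 surrogate range */
    if (code_point < 0x10000UL)
        return 3;
    if (code_point <= 0x10FFFFUL)
        return 4;
    return 0;                           /* beyond the Unicode range */
}
```

On a platform with 8-bit bytes, char_value_count() returns 256. Only code points below U+0080 fit in a single char under UTF-8, so utf8_encoded_length(0x20AC), the euro sign, returns 3.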

Unfortunately, the fact that a character was stored in a byte led to the two terms being used interchangeably in most documentation. This often makes the documentation confusing or misleading when multibyte encodings such as UTF-8 are used, and has led to inefficient and incorrect implementations of string manipulation functions.
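A minimal sketch of the byte/character mismatch, assuming well-formed UTF-8 input: strlen counts bytes, while the hypothetical helper utf8_code_point_count below counts code points by skipping UTF-8 continuation bytes (those of the form 10xxxxxx). A code point is still not always what a reader perceives as one character, since combining sequences exist, so this only illustrates the distinction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the code points in a NUL-terminated UTF-8
   string. Every byte that is not a continuation byte (10xxxxxx) starts
   a new code point. The input is assumed to be well-formed UTF-8; no
   validation is performed. */
size_t utf8_code_point_count(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        /* Convert to unsigned char first: plain char may be signed. */
        if (((unsigned char)*s & 0xC0u) != 0x80u)
            ++count;
    }
    return count;
}
```

For the string "na\xC3\xAF" "ve" (the word "naïve" encoded in UTF-8), strlen reports 6 bytes, while utf8_code_point_count reports 5 code points. Code that treats the strlen result as a character count would be off by one here.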

POSIX defines "character" as a sequence of one or more bytes representing a single graphic symbol or control code, and attempts to use "byte" when referring to char data. However, it defines "character array" as an array of elements of type char.
