A character encoding consists of a code that pairs a set of characters (representations of graphemes or grapheme-like units, such as might occur in an alphabet or syllabary used to communicate a natural language) with a set of something else, such as numbers or electrical pulses, in order to facilitate the storage of text in computers and the transmission of text through telecommunication networks. Common examples include Morse code, which encodes letters of the Latin alphabet as series of long and short depressions of a telegraph key; and ASCII, which encodes letters, numerals, and other symbols, both as integers and as 7-bit binary versions of those integers.
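As a small illustration of the ASCII case, the following Python sketch maps each character of a string to its integer code and to the 7-bit binary form of that integer. The helper name ascii_codes is made up for this example and is not part of any standard.

```python
def ascii_codes(text):
    """Return (character, integer code, 7-bit binary string) for each character."""
    result = []
    for ch in text:
        code = ord(ch)  # integer code of the character
        if code > 127:
            raise ValueError(f"{ch!r} is outside the ASCII repertoire")
        # "07b" pads the binary form to exactly seven digits
        result.append((ch, code, format(code, "07b")))
    return result


for ch, code, bits in ascii_codes("Hi!"):
    print(ch, code, bits)
# H 72 1001000
# i 105 1101001
# ! 33 0100001
```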
Character repertoire
In some contexts, especially storage and communication, it makes sense to distinguish a character repertoire (the full set of abstract characters that a system supports) from a coded character set or character encoding (which specifies how to represent characters from that set using a number of integer codes).
In the early days of computing, the introduction of character repertoires such as ASCII (1963) and EBCDIC (1964) began the process of standardisation. The limitations of such sets soon became apparent, and a number of ad hoc methods were developed to extend them. The need to support multiple writing systems, including the CJK family of East Asian scripts, required support for a far larger number of characters and demanded a systematic approach to character encoding rather than the previous ad hoc approaches.
For example, the full repertoire of Unicode encompasses over 100,000 characters. Each character has a unique integer code in the range 0 to hexadecimal 10FFFF (a little over 1.1 million, so not all integers in that range represent coded characters). Other common repertoires include ASCII and ISO 8859-1, which are identical to the first 128 and 256 coded characters of Unicode respectively.
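The figures in this paragraph can be checked with a short Python sketch. It assumes Python's built-in "ascii" and "latin-1" codecs as stand-ins for ASCII and ISO 8859-1; the variable names are illustrative only.

```python
MAX_CODE_POINT = 0x10FFFF

# Count of integers in the range 0..10FFFF (a little over 1.1 million).
print(MAX_CODE_POINT + 1)  # 1114112

# Decoding every possible ASCII byte should give Unicode codes 0..127.
ascii_text = bytes(range(128)).decode("ascii")
ascii_matches = [ord(ch) for ch in ascii_text] == list(range(128))

# Decoding every possible ISO 8859-1 byte should give Unicode codes 0..255.
latin1_text = bytes(range(256)).decode("latin-1")
latin1_matches = [ord(ch) for ch in latin1_text] == list(range(256))

print(ascii_matches, latin1_matches)  # True True
```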
Encoding forms and encoding schemes
Computer scientists sometimes overload the term character encoding to mean also how a specific sequence of bits represents characters. This involves an encoding form, which specifies the conversion of the integer code into a series of integer code values that facilitate storage in a system that uses fixed bit widths. For example, integers greater than 65535 (hex FFFF) will not fit in 16 bits, so the UTF-16 encoding form mandates representation of these integers as a surrogate pair of integers, each less than 65536 and not assigned to characters (e.g., hex 10000 becomes the pair D800 DC00). An encoding scheme then converts code values to bit sequences, with attention given to things like platform-dependent byte order issues (for example, D800 DC00 might become 00 D8 00 DC on an Intel x86 architecture). A character set or character map or code page shortcuts this process by directly mapping abstract characters to specific bit patterns. [http://www.unicode.org/reports/tr17/ Unicode Technical Report #17] explains this terminology in depth and provides more examples.
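The two steps described above, encoding form and then encoding scheme, can be sketched in Python for UTF-16. This is a minimal illustration rather than a full implementation; the function names utf16_code_units and utf16_bytes are hypothetical, and only the standard struct module is used.

```python
import struct


def utf16_code_units(code_point):
    """Encoding form: turn one integer code into one or two 16-bit code values."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("integer code out of range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate values are not assigned to characters")
    if code_point <= 0xFFFF:
        return [code_point]            # fits in 16 bits as-is
    offset = code_point - 0x10000      # 20-bit value to split in two
    high = 0xD800 + (offset >> 10)     # upper 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # lower 10 bits
    return [high, low]


def utf16_bytes(code_units, little_endian=True):
    """Encoding scheme: serialise 16-bit code values in a chosen byte order."""
    fmt = ("<" if little_endian else ">") + "H" * len(code_units)
    return struct.pack(fmt, *code_units)


units = utf16_code_units(0x10000)
print([format(u, "04X") for u in units])              # ['D800', 'DC00']
print(utf16_bytes(units, little_endian=True).hex())   # 00d800dc
print(utf16_bytes(units, little_endian=False).hex())  # d800dc00
```

The little-endian output matches the x86 byte order given in the text; the big-endian call shows the same code values serialised the other way round.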
Since most applications use only a small subset of Unicode, encoding schemes (like UTF-8 and UTF-16) and character maps (such as ASCII) provide efficient ways to represent Unicode characters in computer storage or communications using short binary words. Some of these simple encodings use data compression techniques to represent a large repertoire with a small number of codes.
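To give a rough sense of how short these binary words are in practice, the following Python sketch counts the bytes needed for a few sample characters in UTF-8 and UTF-16. The sample characters and the helper name encoded_lengths are chosen for illustration only.

```python
def encoded_lengths(ch):
    """Return (bytes in UTF-8, bytes in UTF-16) for a single character."""
    # "utf-16-be" is used so that no byte order mark is added to the count.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-be"))


# Latin letter, accented Latin letter, CJK ideograph, character beyond FFFF
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]
for ch in samples:
    utf8_len, utf16_len = encoded_lengths(ch)
    print(f"U+{ord(ch):04X}: {utf8_len} byte(s) in UTF-8, {utf16_len} in UTF-16")
# U+0041: 1 byte(s) in UTF-8, 2 in UTF-16
# U+00E9: 2 byte(s) in UTF-8, 2 in UTF-16
# U+4E2D: 3 byte(s) in UTF-8, 2 in UTF-16
# U+1F600: 4 byte(s) in UTF-8, 4 in UTF-16
```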
Popular character encodings
ISO 646
ASCII
EBCDIC
ISO 8859:
ISO 8859-1, ISO 8859-2, ISO 8859-3, ISO 8859-4, ISO 8859-5, ISO 8859-6, ISO 8859-7, ISO 8859-8, ISO 8859-9, ISO 8859-10, ISO 8859-11, ISO 8859-13, ISO 8859-14, ISO 8859-15, ISO 8859-16
DOS character sets:
CP437, CP737, CP850, CP852, CP855, CP857, CP858, CP860, CP861, CP863, CP865, CP866, CP869
Windows character sets:
Windows-1250
Windows-1251 for Cyrillic alphabets
Windows-1252
Windows-1253
Windows-1254
Windows-1255 for Hebrew
Windows-1256 for Arabic
Windows-1257
Windows-1258 for Vietnamese
KOI8-R, KOI8-U, KOI7
ISCII
VISCII
Big5
HKSCS
Guobiao
GB2312
GB18030
ISO 2022, Shift-JIS, EUC
Unicode (and subsets thereof, such as the 16-bit 'Basic Multilingual Plane'). See UTF-8