Script (Unicode)

In Unicode, a script is a collection of letters and other written signs used to represent textual information in one or more writing systems.[1] Some scripts support only one writing system and language, for example, Armenian. Other scripts support many different writing systems; for example, the Latin script supports English, French, German, Italian, Vietnamese, Latin itself, and several other languages. Some languages have used more than one writing system, and thus more than one script; for example, Turkish was written in the Arabic script until the early 20th century, when it transitioned to the Latin script. For a list of languages supported by each script, see the list of languages by writing system. More or less complementary to scripts are symbols and Unicode control characters.


The unified diacritical characters and unified punctuation characters frequently have the "common" or "inherited" script property. However, the individual scripts often have their own punctuation and diacritics, so that many scripts include not only letters but also diacritic and other marks, punctuation, numerals and even their own idiosyncratic symbols and space characters.

Unicode 14.0 defines 159 separate scripts, including 93 modern scripts and 66 ancient or historic scripts.[2][3] More scripts are in the process of being encoded or have been tentatively allocated for encoding in roadmaps.[4]

Definition and classification

When multiple languages make use of the same script, there are frequently some differences, particularly in diacritics and other marks. For example, Swedish and English both use the Latin script. However, Swedish includes the character å (sometimes called a Swedish O), while English has no such character. Nor does English make use of the diacritic combining ring above for any character. In general, the languages sharing the same scripts share many of the same characters. Despite these peripheral differences in the Swedish and English writing systems, they are said to use the same Latin script. Thus, the Unicode abstraction of scripts is a basic organizing technique. The differences among different alphabets or writing systems remain and are supported through Unicode’s flexible scripts, combining marks and collation algorithms.

Script versus writing system

The term "writing system" is sometimes treated as a synonym for "script". However, it can also refer to a specific concrete writing system supported by a script. For example, the Vietnamese writing system is supported by the Latin script. A writing system may also cover more than one script; for example, the Japanese writing system makes use of the Han, Hiragana and Katakana scripts.
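The idea that one writing system can draw on several scripts can be sketched in a few lines of Python. The range table below is a hand-copied, heavily abbreviated excerpt (an assumption made for illustration; a real program would load the full script data from the Unicode Character Database), and the names `SCRIPT_RANGES`, `script_of` and `scripts_used` are hypothetical.

```python
# Hand-copied, heavily abbreviated excerpt of script ranges (an assumption for
# illustration; a real program would load the full Scripts.txt data file).
SCRIPT_RANGES = [
    (0x0041, 0x005A, "Latin"),     # A-Z
    (0x0061, 0x007A, "Latin"),     # a-z
    (0x3041, 0x3096, "Hiragana"),
    (0x30A1, 0x30FA, "Katakana"),
    (0x4E00, 0x9FFF, "Han"),       # main CJK Unified Ideographs block only
]


def script_of(ch):
    """Return the script name for ch, or "Unlisted" if outside the excerpt."""
    cp = ord(ch)
    for first, last, name in SCRIPT_RANGES:
        if first <= cp <= last:
            return name
    return "Unlisted"


def scripts_used(text):
    """Collect the distinct scripts of all characters in text."""
    return {script_of(ch) for ch in text}


# A short Japanese phrase mixes three scripts.
print(sorted(scripts_used("日本語のテキスト")))  # ['Han', 'Hiragana', 'Katakana']
```

A linear scan is enough for five ranges; with the complete data a binary search over sorted ranges would be the more usual choice.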

Most writing systems can be broadly divided into several categories: logographic, syllabic, alphabetic (or segmental), abugida, abjad and featural; however, all features of any of these may be found in any given writing system in varying proportions, often making it difficult to purely categorize a system. The term complex system is sometimes used to describe those where the admixture makes classification problematic.

Unicode supports all of these types of writing systems through its numerous scripts. Unicode also adds further properties to characters to help differentiate the various characters and the ways they behave within Unicode text-processing algorithms.

Special script property values

In addition to explicit or specific script properties, Unicode uses three special values:[5]

Common: Unicode can assign a character in the UCS to a single script only. However, many characters—those that are not part of a formal natural-language writing system or are unified across many writing systems—may be used in more than one script (for example, currency signs, symbols, numerals and punctuation marks). In these cases Unicode defines them as belonging to the "common" script (ISO 15924 code "Zyyy").
Inherited: Many diacritics and non-spacing combining characters may be applied to characters from more than one script. In these cases Unicode assigns them to the "inherited" script (ISO 15924 code Zinh), which means that they have the same script class as the base character with which they combine, and so in different contexts they may be treated as belonging to different scripts. For example, U+0308 ̈ COMBINING DIAERESIS may combine either with U+0065 e LATIN SMALL LETTER E to create a Latin ë or with U+0435 е CYRILLIC SMALL LETTER IE for the Cyrillic ё. In the former case, it inherits the Latin script of the base character, whereas in the latter case, it inherits the Cyrillic script of the base character.
Unknown: The value of "unknown" script (ISO 15924 code Zzzz) is given to unassigned, private-use, noncharacter, and surrogate code points.
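The inheritance rule for combining marks can be illustrated with a minimal sketch. The lookup table is hand-filled with just the characters from the example above plus a space (an assumption for illustration, not real Unicode data access), and `resolved_scripts` is a hypothetical helper name.

```python
# Tiny hand-filled table (an assumption for illustration); a real program
# would read these values from the Unicode Character Database.
SCRIPT = {
    "\u0065": "Latin",      # e, LATIN SMALL LETTER E
    "\u0435": "Cyrillic",   # CYRILLIC SMALL LETTER IE
    "\u0308": "Inherited",  # COMBINING DIAERESIS
    "\u0020": "Common",     # SPACE
}


def resolved_scripts(text):
    """Return one script name per character, resolving Inherited marks
    to the script of the character they follow."""
    result = []
    base_script = None
    for ch in text:
        script = SCRIPT[ch]  # KeyError for characters outside the tiny table
        if script == "Inherited" and base_script is not None:
            script = base_script      # take over the base character's script
        else:
            base_script = script      # this character becomes the new base
        result.append(script)
    return result


print(resolved_scripts("e\u0308"))       # ['Latin', 'Latin']
print(resolved_scripts("\u0435\u0308"))  # ['Cyrillic', 'Cyrillic']
```

The same code point U+0308 is reported as Latin in the first call and as Cyrillic in the second, which is the context dependence described above.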

Character categories within scripts

Unicode provides a general category property for each character. So in addition to belonging to a script, every character also has a general category. Scripts typically include letter characters: uppercase letters, lowercase letters and modifier letters. A few precomposed ligatures, such as ǲ (U+01F2), are categorized as titlecase letters. Such titlecase ligatures are all in the Latin and Greek scripts and are all compatibility characters, and therefore Unicode discourages their use by authors. It is unlikely that new titlecase letters will be added in the future.

Most writing systems do not differentiate between uppercase and lowercase letters. For those scripts all letters are categorized as "other letter" or "modifier letter". Ideographs such as Unihan ideographs are also categorized as "other letters". A few scripts do differentiate between uppercase and lowercase, however, including Latin, Cyrillic, Greek, Armenian, Georgian, and Deseret. Even for these scripts there are some letters that are neither uppercase nor lowercase.
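The letter categories can be inspected with Python's standard `unicodedata` module, whose `category` function returns the two-letter general category abbreviation (Lu, Ll, Lt, Lm, Lo). The sample characters are chosen for illustration only, and the values returned depend on the Unicode version bundled with the Python build.

```python
import unicodedata

# One sample per letter category: uppercase, lowercase, titlecase,
# modifier letter, and two "other letters" (a Han ideograph and a kana).
samples = ["A", "a", "\u01F2", "\u02B0", "\u6587", "\u3042"]

letter_categories = {ch: unicodedata.category(ch) for ch in samples}

for ch, cat in letter_categories.items():
    print(f"U+{ord(ch):04X}  {cat}  {unicodedata.name(ch)}")
```

Here U+01F2 reports Lt (titlecase letter), while the ideograph U+6587 and the Hiragana letter U+3042 both report Lo (other letter), since their scripts have no case distinction.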

Scripts can also contain characters of any other general category, such as marks (diacritic and otherwise), numbers (numerals), punctuation, separators (word separators such as spaces), symbols and non-graphical format characters. These are included in a particular script when they are unique to that script. Other such characters are generally unified and included in the punctuation or diacritic blocks. However, the bulk of characters in any script (other than the common and inherited scripts) are letters.
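As a rough illustration of non-letter characters that carry script-specific names, the sketch below maps the first letter of the general category to its major class. The `MAJOR_CLASS` table and `major_class` helper are hypothetical names, and the sample characters (four Thai, one Ogham) are chosen only as examples.

```python
import unicodedata

# First letter of the two-letter general category -> major class.
MAJOR_CLASS = {
    "L": "letter",
    "M": "mark",
    "N": "number",
    "P": "punctuation",
    "S": "symbol",
    "Z": "separator",
    "C": "other (control, format, unassigned, ...)",
}


def major_class(ch):
    """Return the major general-category class of a single character."""
    return MAJOR_CLASS[unicodedata.category(ch)[0]]


for ch in ["\u0E01", "\u0E31", "\u0E51", "\u0E4F", "\u1680"]:
    print(f"U+{ord(ch):04X}  {major_class(ch):12}  {unicodedata.name(ch)}")
```

The loop covers a Thai consonant, a Thai vowel sign (mark), a Thai digit, a Thai punctuation sign and the Ogham space mark (separator).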

List of scripts in Unicode

Unicode defines over a hundred script names (called "Alias" or "Property value alias"), based on the ISO 15924 list. Unicode uses the "Common" script name for ISO 15924's Zyyy (code for undetermined script), "Inherited" for ISO 15924's Zinh (code for inherited script), and "Unknown" for ISO 15924's Zzzz (code for uncoded script). Some ISO 15924 script codes are not used, among them Zsym (Symbols) and Zmth (Mathematical notation), because these are not considered scripts in the Unicode sense.
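The relationship between ISO 15924 codes and Unicode aliases amounts to a lookup table. The following sketch uses a deliberately incomplete excerpt of the pairs listed in this section; the names `ISO_TO_ALIAS`, `NOT_SCRIPTS_IN_UNICODE` and `unicode_alias` are hypothetical and not part of any library.

```python
# Excerpt of ISO 15924 code -> Unicode alias pairs, copied from this section.
# The dictionary is deliberately incomplete (an illustration only).
ISO_TO_ALIAS = {
    "Latn": "Latin",
    "Cyrl": "Cyrillic",
    "Cans": "Canadian Aboriginal",
    "Zyyy": "Common",
    "Zinh": "Inherited",
    "Zzzz": "Unknown",
}

# ISO 15924 codes that Unicode does not treat as scripts.
NOT_SCRIPTS_IN_UNICODE = {"Zsym", "Zmth"}


def unicode_alias(iso_code):
    """Return the Unicode alias for an ISO 15924 code, or None if the code
    is not a script in the Unicode sense. Raises KeyError for codes that
    are simply missing from this excerpt."""
    if iso_code in NOT_SCRIPTS_IN_UNICODE:
        return None
    return ISO_TO_ALIAS[iso_code]


for code in ("Latn", "Zyyy", "Zinh", "Zzzz", "Zsym"):
    print(code, "->", unicode_alias(code))
```

Returning `None` for Zsym and Zmth keeps "not a script in Unicode" distinct from "not in this excerpt", which surfaces as a `KeyError`.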

Scripts in ISO 15924^[a]^[b] and in Unicode^[c]^[d]
ISO 15924			Script in Unicode^[e]
Code	ISO formal name	Directionality	Unicode Alias^[f]	Version	Characters	Notes	Description

Adlm	Adlam	right-to-left	Adlam	9.0	88		Ch 19.9
Afak	Afaka	varies	Not in Unicode, proposal is explored[lower-roman 1]
Aghb	Caucasian Albanian	left-to-right	Caucasian Albanian	7.0	53	Ancient/historic	Ch 8.11
Ahom	Ahom, Tai Ahom	left-to-right	Ahom	8.0	65	Ancient/historic	Ch 15.15
Arab	Arabic	right-to-left	Arabic	1.0	1,365		Ch 9.2
Aran	Arabic (Nastaliq variant)	mixed	Typographic variant of Arabic (§ Arab)
Armi	Imperial Aramaic	right-to-left	Imperial Aramaic	5.2	31	Ancient/historic	Ch 10.4
Armn	Armenian	left-to-right	Armenian	1.0	96		Ch 7.6
Avst	Avestan	right-to-left	Avestan	5.2	61	Ancient/historic	Ch 10.7
Bali	Balinese	left-to-right	Balinese	5.0	124		Ch 17.3
Bamu	Bamum	left-to-right	Bamum	5.2	657		Ch 19.6
Bass	Bassa Vah	left-to-right	Bassa Vah	7.0	36	Ancient/historic	Ch 19.7
Batk	Batak	left-to-right	Batak	6.0	56		Ch 17.6
Beng	Bengali (Bangla)	left-to-right	Bengali	1.0	96		Ch 12.2
Bhks	Bhaiksuki	left-to-right	Bhaiksuki	9.0	97	Ancient/historic	Ch 14.3
Blis	Blissymbols	varies	Not in Unicode, proposal is explored[lower-roman 1]
Bopo	Bopomofo	left-to-right	Bopomofo	1.0	77		Ch 18.3
Brah	Brahmi	left-to-right	Brahmi	6.0	115	Ancient/historic	Ch 14.1
Brai	Braille	left-to-right	Braille	3.0	256		Ch 21.1
Bugi	Buginese	left-to-right	Buginese	4.1	30		Ch 17.2
Buhd	Buhid	left-to-right	Buhid	3.2	20		Ch 17.1
Cakm	Chakma	left-to-right	Chakma	6.1	71		Ch 13.11
Cans	Unified Canadian Aboriginal Syllabics	left-to-right	Canadian Aboriginal	3.0	726		Ch 20.2
Cari	Carian	left-to-right, right-to-left	Carian	5.1	49	Ancient/historic	Ch 8.5
Cham	Cham	left-to-right	Cham	5.1	83		Ch 16.10
Cher	Cherokee	left-to-right	Cherokee	3.0	172		Ch 20.1
Chrs	Chorasmian	right-to-left, top-to-bottom	Chorasmian	13.0	28	Ancient/historic	Ch 10.8
Cirt	Cirth	varies	Not in Unicode
Copt	Coptic	left-to-right	Coptic	1.0	137	Ancient/historic, Disunified from Greek in 4.1	Ch 7.3
Cpmn	Cypro-Minoan	left-to-right	Cypro Minoan	14.0	99	Ancient/historic	Ch 8.4
Cprt	Cypriot syllabary	right-to-left	Cypriot	4.0	55	Ancient/historic	Ch 8.3
Cyrl	Cyrillic	left-to-right	Cyrillic	1.0	443	Includes typographic variant Old Church Slavonic (§ Cyrs)	Ch 7.4
Cyrs	Cyrillic (Old Church Slavonic variant)	varies	Typographic variant of Cyrillic (§ Cyrl); Ancient/historic
Deva	Devanagari (Nagari)	left-to-right	Devanagari	1.0	154		Ch 12.1
Diak	Dives Akuru	left-to-right	Dives Akuru	13.0	72	Ancient/historic	Ch 15.14
Dogr	Dogra	left-to-right	Dogra	11.0	60	Ancient/historic	Ch 15.17
Dsrt	Deseret (Mormon)	left-to-right	Deseret	3.1	80		Ch 20.4
Dupl	Duployan shorthand, Duployan stenography	left-to-right	Duployan	7.0	143		Ch 21.6
Egyd	Egyptian demotic	mixed	Not in Unicode
Egyh	Egyptian hieratic	mixed	Not in Unicode
Egyp	Egyptian hieroglyphs	right-to-left	Egyptian Hieroglyphs	5.2	1,080	Ancient/historic	Ch 11.4
Elba	Elbasan	left-to-right	Elbasan	7.0	40	Ancient/historic	Ch 8.10
Elym	Elymaic	right-to-left	Elymaic	12.0	23	Ancient/historic	Ch 10.9
Ethi	Ethiopic (Geʻez)	left-to-right	Ethiopic	3.0	523		Ch 19.1
Geok	Khutsuri (Asomtavruli and Nuskhuri)	left-to-right	Georgian			Unicode groups "Khutsuri", "Asomtavruli" and "Nuskhuri" into 'Georgian' (§ Geok). Also "Mkhedruli" and "Mtavruli" are 'Georgian' (§ Geor)	Ch 7.7
Geor	Georgian (Mkhedruli and Mtavruli)	left-to-right	Georgian	1.0	173	In Unicode, also includes Geok (Nuskhuri)	Ch 7.7
Glag	Glagolitic	left-to-right	Glagolitic	4.1	134	Ancient/historic	Ch 7.5
Gong	Gunjala Gondi	left-to-right	Gunjala Gondi	11.0	63		Ch 13.15
Gonm	Masaram Gondi	left-to-right	Masaram Gondi	10.0	75		Ch 13.14
Goth	Gothic	left-to-right	Gothic	3.1	27	Ancient/historic	Ch 8.9
Gran	Grantha	left-to-right	Grantha	7.0	85	Ancient/historic	Ch 15.13
Grek	Greek	left-to-right	Greek	1.0	518	Directionality sometimes as boustrophedon	Ch 7.2
Gujr	Gujarati	left-to-right	Gujarati	1.0	91		Ch 12.4
Guru	Gurmukhi	left-to-right	Gurmukhi	1.0	80		Ch 12.3
Hanb	Han with Bopomofo (alias for Han + Bopomofo)	mixed	See § Hani, § Bopo
Hang	Hangul (Hangŭl, Hangeul)	left-to-right, top-to-bottom	Hangul	1.0	11,739	Hangul syllables relocated in 2.0	Ch 18.6
Hani	Han (Hanzi, Kanji, Hanja)	top-to-bottom, columns right-to-left (historically)	Han	1.0	94,215		Ch 18.1
Hano	Hanunoo (Hanunóo)	left-to-right, bottom-to-top	Hanunoo	3.2	21		Ch 17.1
Hans	Han (Simplified variant)	varies	Subset of Han (Hanzi, Kanji, Hanja) (§ Hani)
Hant	Han (Traditional variant)	varies	Subset of § Hani
Hatr	Hatran	right-to-left	Hatran	8.0	26	Ancient/historic	Ch 10.12
Hebr	Hebrew	right-to-left	Hebrew	1.0	134		Ch 9.1
Hira	Hiragana	top-to-bottom, left-to-right	Hiragana	1.0	380		Ch 18.4
Hluw	Anatolian Hieroglyphs (Luwian Hieroglyphs, Hittite Hieroglyphs)	left-to-right	Anatolian Hieroglyphs	8.0	583	Ancient/historic	Ch 11.6
Hmng	Pahawh Hmong	left-to-right	Pahawh Hmong	7.0	127		Ch 16.11
Hmnp	Nyiakeng Puachue Hmong	left-to-right	Nyiakeng Puachue Hmong	12.0	71		Ch 16.12
Hrkt	Japanese syllabaries (alias for Hiragana + Katakana)	top-to-bottom, left-to-right	Katakana or Hiragana			See § Hira, § Kana	Ch 18.4
Hung	Old Hungarian (Hungarian Runic)	right-to-left	Old Hungarian	8.0	108	Ancient/historic	Ch 8.8
Inds	Indus (Harappan)	mixed	Not in Unicode, proposal is explored[lower-roman 1]
Ital	Old Italic (Etruscan, Oscan, etc.)	right-to-left, left-to-right	Old Italic	3.1	39	Ancient/historic	Ch 8.6
Jamo	Jamo (alias for Jamo subset of Hangul)	varies	Subset of § Hang
Java	Javanese	left-to-right	Javanese	5.2	90		Ch 17.4
Jpan	Japanese (alias for Han + Hiragana + Katakana)	varies	See § Hani, § Hira and § Kana
Jurc	Jurchen	left-to-right	Not in Unicode
Kali	Kayah Li	left-to-right	Kayah Li	5.1	47		Ch 16.9
Kana	Katakana	top-to-bottom, left-to-right	Katakana	1.0	320		Ch 18.4
Kawi	Kawi	left-to-right	Not in Unicode
Khar	Kharoshthi	right-to-left	Kharoshthi	4.1	68	Ancient/historic	Ch 14.2
Khmr	Khmer	left-to-right	Khmer	3.0	146		Ch 16.4
Khoj	Khojki	left-to-right	Khojki	7.0	62	Ancient/historic	Ch 15.7
Kitl	Khitan large script	left-to-right	Not in Unicode
Kits	Khitan small script	top-to-bottom	Khitan Small Script	13.0	471	Ancient/historic	Ch 18.12
Knda	Kannada	left-to-right	Kannada	1.0	90		Ch 12.8
Kore	Korean (alias for Hangul + Han)	left-to-right	See § Hani, § Hang
Kpel	Kpelle	left-to-right	Not in Unicode, proposal is explored[lower-roman 1]
Kthi	Kaithi	left-to-right	Kaithi	5.2	68	Ancient/historic	Ch 15.2
Lana	Tai Tham (Lanna)	left-to-right	Tai Tham	5.2	127		Ch 16.7
Laoo	Lao	left-to-right	Lao	1.0	82		Ch 16.2
Latf	Latin (Fraktur variant)	varies	Typographic variant of Latin (§ Latn)
Latg	Latin (Gaelic variant)	left-to-right	Typographic variant of Latin (§ Latn)
Latn	Latin	left-to-right	Latin	1.0	1,475	See also: Latin script in Unicode	Ch 7.1
Leke	Leke	left-to-right	Not in Unicode
Lepc	Lepcha (Róng)	left-to-right	Lepcha	5.1	74		Ch 13.12
Limb	Limbu	left-to-right	Limbu	4.0	68		Ch 13.6
Lina	Linear A	left-to-right	Linear A	7.0	341	Ancient/historic	Ch 8.1
Linb	Linear B	left-to-right	Linear B	4.0	211	Ancient/historic	Ch 8.2
Lisu	Lisu (Fraser)	left-to-right	Lisu	5.2	49		Ch 18.9
Loma	Loma	left-to-right	Not in Unicode, proposal is explored[lower-roman 1]
Lyci	Lycian	left-to-right	Lycian	5.1	29	Ancient/historic	Ch 8.5
Lydi	Lydian	right-to-left	Lydian	5.1	27	Ancient/historic	Ch 8.5
Mahj	Mahajani	left-to-right	Mahajani	7.0	39	Ancient/historic	Ch 15.6
Maka	Makasar	left-to-right	Makasar	11.0	25	Ancient/historic	Ch 17.8
Mand	Mandaic, Mandaean	right-to-left	Mandaic	6.0	29		Ch 9.5
Mani	Manichaean	right-to-left	Manichaean	7.0	51	Ancient/historic	Ch 10.5
Marc	Marchen	left-to-right	Marchen	9.0	68	Ancient/historic	Ch 14.5
Maya	Mayan hieroglyphs	mixed	Not in Unicode
Medf	Medefaidrin (Oberi Okaime, Oberi Ɔkaimɛ)	left-to-right	Medefaidrin	11.0	91		Ch 19.10
Mend	Mende Kikakui	right-to-left	Mende Kikakui	7.0	213		Ch 19.8
Merc	Meroitic Cursive	right-to-left	Meroitic Cursive	6.1	90	Ancient/historic	Ch 11.5
Mero	Meroitic Hieroglyphs	right-to-left	Meroitic Hieroglyphs	6.1	32	Ancient/historic	Ch 11.5
Mlym	Malayalam	left-to-right	Malayalam	1.0	118		Ch 12.9
Modi	Modi, Moḍī	left-to-right	Modi	7.0	79	Ancient/historic	Ch 15.11
Mong	Mongolian	top-to-bottom, left-to-right	Mongolian	3.0	168	Mong includes Clear and Manchu scripts	Ch 13.5
Moon	Moon (Moon code, Moon script, Moon type)	mixed	Not in Unicode, proposal is explored[lower-roman 1]
Mroo	Mro, Mru	left-to-right	Mro	7.0	43		Ch 13.8
Mtei	Meitei Mayek (Meithei, Meetei)	left-to-right	Meetei Mayek	5.2	79		Ch 13.7
Mult	Multani	left-to-right	Multani	8.0	38	Ancient/historic	Ch 15.9
Mymr	Myanmar (Burmese)	left-to-right	Myanmar	3.0	223		Ch 16.3
Nagm	Nag Mundari	left-to-right	Not in Unicode
Nand	Nandinagari	left-to-right	Nandinagari	12.0	65	Ancient/historic	Ch 15.12
Narb	Old North Arabian (Ancient North Arabian)	right-to-left	Old North Arabian	7.0	32	Ancient/historic	Ch 10.1
Nbat	Nabataean	right-to-left	Nabataean	7.0	40	Ancient/historic	Ch 10.10
Newa	Newa, Newar, Newari, Nepāla lipi	left-to-right	Newa	9.0	97		Ch 13.3
Nkdb	Naxi Dongba (na²¹ɕi³³ to³³ba²¹, Nakhi Tomba)	left-to-right	Not in Unicode
Nkgb	Nakhi Geba (na²¹ɕi³³ gʌ²¹ba²¹, 'Na-'Khi ²Ggŏ-¹baw, Nakhi Geba)	left-to-right	Not in Unicode, proposal is explored[lower-roman 1]
Nkoo	N’Ko	right-to-left	NKo	5.0	62		Ch 19.4
Nshu	Nüshu	top-to-bottom	Nushu	10.0	397		Ch 18.8
Ogam	Ogham	bottom-to-top, left-to-right	Ogham	3.0	29	Ancient/historic	Ch 8.14
Olck	Ol Chiki (Ol Cemet’, Ol, Santali)	left-to-right	Ol Chiki	5.1	48		Ch 13.10
Orkh	Old Turkic, Orkhon Runic	right-to-left	Old Turkic	5.2	73	Ancient/historic	Ch 14.8
Orya	Oriya (Odia)	left-to-right	Oriya	1.0	91		Ch 12.5
Osge	Osage	left-to-right	Osage	9.0	72		Ch 20.3
Osma	Osmanya	left-to-right	Osmanya	4.0	40		Ch 19.2
Ougr	Old Uyghur	mixed	Old Uyghur	14.0	26	Ancient/historic	Ch 14.11
Palm	Palmyrene	right-to-left	Palmyrene	7.0	32	Ancient/historic	Ch 10.11
Pauc	Pau Cin Hau	left-to-right	Pau Cin Hau	7.0	57		Ch 16.13
Pcun	Proto-Cuneiform	left-to-right	Not in Unicode
Pelm	Proto-Elamite	left-to-right	Not in Unicode
Perm	Old Permic	left-to-right	Old Permic	7.0	43	Ancient/historic	Ch 8.13
Phag	Phags-pa	top-to-bottom	Phags-pa	5.0	56	Ancient/historic	Ch 14.4
Phli	Inscriptional Pahlavi	right-to-left	Inscriptional Pahlavi	5.2	27	Ancient/historic	Ch 10.6
Phlp	Psalter Pahlavi	right-to-left	Psalter Pahlavi	7.0	29	Ancient/historic	Ch 10.6
Phlv	Book Pahlavi	mixed	Not in Unicode
Phnx	Phoenician	right-to-left	Phoenician	5.0	29	Ancient/historic^[g]	Ch 10.3
Piqd	Klingon (KLI pIqaD)	left-to-right	Rejected for inclusion in Unicode[lower-roman 2][lower-roman 3]
Plrd	Miao (Pollard)	left-to-right	Miao	6.1	149		Ch 18.10
Prti	Inscriptional Parthian	right-to-left	Inscriptional Parthian	5.2	30	Ancient/historic	Ch 10.6
Psin	Proto-Sinaitic	mixed	Not in Unicode
Qaaa-Qabx	Reserved for private use (range)		Not in Unicode
Ranj	Ranjana	left-to-right	Not in Unicode
Rjng	Rejang (Redjang, Kaganga)	left-to-right	Rejang	5.1	37		Ch 17.5
Rohg	Hanifi Rohingya	right-to-left	Hanifi Rohingya	11.0	50		Ch 16.14
Roro	Rongorongo	mixed	Not in Unicode, proposal is explored[lower-roman 1]
Runr	Runic	left-to-right, boustrophedon	Runic	3.0	86	Ancient/historic	Ch 8.7
Samr	Samaritan	right-to-left, top-to-bottom	Samaritan	5.2	61		Ch 9.4
Sara	Sarati	mixed	Not in Unicode
Sarb	Old South Arabian	right-to-left	Old South Arabian	5.2	32	Ancient/historic	Ch 10.2
Saur	Saurashtra	left-to-right	Saurashtra	5.1	82		Ch 13.13
Sgnw	SignWriting	top-to-bottom	SignWriting	8.0	672		Ch 21.7
Shaw	Shavian (Shaw)	left-to-right	Shavian	4.0	48		Ch 8.15
Shrd	Sharada, Śāradā	left-to-right	Sharada	6.1	96		Ch 15.3
Shui	Shuishu	left-to-right	Not in Unicode
Sidd	Siddham, Siddhaṃ, Siddhamātṛkā	left-to-right	Siddham	7.0	92	Ancient/historic	Ch 15.5
Sind	Khudawadi, Sindhi	left-to-right	Khudawadi	7.0	69		Ch 15.8
Sinh	Sinhala	left-to-right	Sinhala	3.0	111		Ch 13.2
Sogd	Sogdian	horizontal and vertical writing in East Asian scripts, top-to-bottom	Sogdian	11.0	42	Ancient/historic	Ch 14.10
Sogo	Old Sogdian	right-to-left	Old Sogdian	11.0	40	Ancient/historic	Ch 14.9
Sora	Sora Sompeng	left-to-right	Sora Sompeng	6.1	35		Ch 15.16
Soyo	Soyombo	left-to-right	Soyombo	10.0	83	Ancient/historic	Ch 14.7
Sund	Sundanese	left-to-right	Sundanese	5.1	72		Ch 17.7
Sunu	Sunuwar	left-to-right	Not in Unicode
Sylo	Syloti Nagri	left-to-right	Syloti Nagri	4.1	45	Ancient/historic	Ch 15.1
Syrc	Syriac	right-to-left	Syriac	3.0	88	Includes typographic variants Estrangelo (§ Syre), Western (§ Syrj), and Eastern (§ Syrn)	Ch 9.3
Syre	Syriac (Estrangelo variant)	mixed	Typographic variant of Syriac (§ Syrc)
Syrj	Syriac (Western variant)	mixed	Typographic variant of Syriac (§ Syrc)
Syrn	Syriac (Eastern variant)	mixed	Typographic variant of Syriac (§ Syrc)
Tagb	Tagbanwa	left-to-right	Tagbanwa	3.2	18		Ch 17.1
Takr	Takri, Ṭākrī, Ṭāṅkrī	left-to-right	Takri	6.1	68		Ch 15.4
Tale	Tai Le	left-to-right	Tai Le	4.0	35		Ch 16.5
Talu	New Tai Lue	left-to-right	New Tai Lue	4.1	83		Ch 16.6
Taml	Tamil	left-to-right	Tamil	1.0	123		Ch 12.6
Tang	Tangut	top-to-bottom, columns right-to-left, left-to-right	Tangut	9.0	6,914	Ancient/historic	Ch 18.11
Tavt	Tai Viet	left-to-right	Tai Viet	5.2	72		Ch 16.8
Telu	Telugu	left-to-right	Telugu	1.0	100		Ch 12.7
Teng	Tengwar	left-to-right	Not in Unicode
Tfng	Tifinagh (Berber)	left-to-right	Tifinagh	4.1	59		Ch 19.3
Tglg	Tagalog (Baybayin, Alibata)	left-to-right	Tagalog	3.2	23		Ch 17.1
Thaa	Thaana	right-to-left	Thaana	3.0	50		Ch 13.1
Thai	Thai	left-to-right	Thai	1.0	86		Ch 16.1
Tibt	Tibetan	left-to-right	Tibetan	2.0	207	Added in 1.0, removed in 1.1 and reintroduced in 2.0	Ch 13.4
Tirh	Tirhuta	left-to-right	Tirhuta	7.0	82		Ch 15.10
Tnsa	Tangsa	left-to-right	Tangsa	14.0	89		Ch 13.18
Toto	Toto	left-to-right	Toto	14.0	31		Ch 13.17
Ugar	Ugaritic	left-to-right	Ugaritic	4.0	31	Ancient/historic	Ch 11.2
Vaii	Vai	left-to-right	Vai	5.1	300		Ch 19.5
Visp	Visible Speech	left-to-right	Not in Unicode
Vith	Vithkuqi	left-to-right	Vithkuqi	14.0	70	Ancient/historic	Ch 8.12
Wara	Warang Citi (Varang Kshiti)	left-to-right	Warang Citi	7.0	84		Ch 13.9
Wcho	Wancho	left-to-right	Wancho	12.0	59		Ch 13.16
Wole	Woleai	mixed	Not in Unicode, proposal is explored[lower-roman 1]
Xpeo	Old Persian	left-to-right	Old Persian	4.1	50	Ancient/historic	Ch 11.3
Xsux	Cuneiform, Sumero-Akkadian	left-to-right	Cuneiform	5.0	1,234	Ancient/historic	Ch 11.1
Yezi	Yezidi	right-to-left	Yezidi	13.0	47	Ancient/historic	Ch 9.6
Yiii	Yi	left-to-right	Yi	3.0	1,220		Ch 18.7
Zanb	Zanabazar Square (Zanabazarin Dörböljin Useg, Xewtee Dörböljin Bicig, Horizontal Square Script)	left-to-right	Zanabazar Square	10.0	72	Ancient/historic	Ch 14.6
Zinh	Code for inherited script		Inherited		657
Zmth	Mathematical notation		Not a 'script' in Unicode
Zsym	Symbols		Not a 'script' in Unicode
Zsye	Symbols (emoji variant)		Not a 'script' in Unicode
Zxxx	Code for unwritten documents		Not a 'script' in Unicode
Zyyy	Code for undetermined script		Common		8,252
Zzzz	Code for uncoded script		Unknown		969,350	In Unicode: All other code points
Notes:
[a] ISO 15924 publications, as of 3 December 2021.
[b] ISO 15924 Normative text file, as of 3 December 2021.
[c] ISO 15924 Changes (including Aliases for Unicode), as of 3 December 2021.
[d] Unicode version 14.0.
[e] Unicode charts.
[f] Unicode uses the "Property Value Alias" (Alias) as the script name. These Alias names are part of Unicode and are published informatively next to ISO 15924. An alias script name may be used in a character name: Palm, Palmyrene → U+10860 𐡠 PALMYRENE LETTER ALEPH.
[g] In Unicode, the Phoenician script is intended for the representation of text in Paleo-Hebrew, Archaic Phoenician, Phoenician, Early Aramaic, Late Phoenician cursive, Phoenician papyri, Siloam Hebrew, Hebrew seals, Ammonite, Moabite, and Punic.[lower-roman 4]
References:
[lower-roman 1] "Proposed New Scripts". Unicode Consortium. 2018-05-25. Retrieved 2019-09-12.
[lower-roman 2] Michael Everson (1997-09-18). "Proposal to encode Klingon in Plane 1 of ISO/IEC 10646-2".
[lower-roman 3] The Unicode Consortium (2001-08-14). "Approved Minutes of the UTC 87 / L2 184 Joint Meeting".
[lower-roman 4] "Middle East-II, Ancient Scripts" (PDF). 14.0.0. The Unicode Consortium. Retrieved 2021-09-15.

Missing scripts in Unicode

With each new version of Unicode, new writing systems are added to the international character code. According to a statement by linguist Dr Deborah Anderson of UC Berkeley, there are over 100 writing systems that have not yet been included in Unicode.

According to a list compiled by the Missing Scripts project, a joint effort of the University of Applied Sciences Mainz (Germany), the ANRT Nancy (France) and UC Berkeley (USA), there are 294 known writing systems according to the current state of research (January 2022). Of these, 131 have not yet been encoded in Unicode, i.e. cannot yet be used on a computer or mobile phone.

References

[1] "Glossary". unicode.org.
[2] "Unicode Character Database: Scripts". unicode.org.
[3] "Chapter 14: Additional Ancient and Historic Scripts". The Unicode Standard, Version 14.0 (PDF). Mountain View, CA: Unicode, Inc. September 2021. p. 581. ISBN 978-1-936213-29-0.
[4] "Roadmaps to Unicode". https://www.unicode.org/roadmaps/
[5] "UAX #24: Unicode Script Property". www.unicode.org.

External links

Script Encoding Initiative: a project at UC Berkeley, USA, working to get more scripts included in the Unicode standard.
The World’s Writing Systems: an overview of all 294 known writing systems, each with a typographic reference glyph and its Unicode status.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.

