MIK (character set)

MIK (МИК) is an 8-bit Cyrillic code page used with DOS. It is based on the character set used in the Bulgarian Pravetz 16[1] IBM PC compatible system. Kermit calls this character set "BULGARIA-PC" / "bulgaria-pc".[2][3][4] In Bulgaria, it was sometimes incorrectly referred to as code page 856 (which clashes with IBM's definition for a Hebrew code page). This code page is known by FreeDOS as Code page 3021.

MIK, rather than CP 808, CP 855, CP 866 or CP 872, is the most widespread DOS/OEM code page used in Bulgaria.

Almost every DOS program created in Bulgaria that contains Bulgarian strings used MIK as its encoding, and many such programs are still in use.

Character set

Each character is shown with its equivalent Unicode code point and its decimal code point. Only the second half of the table (code points 128–255) is shown, the first half (code points 0–127) being the same as ASCII.

MIK[5][6][4]
   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F
8x | А U+0410 (128) | Б U+0411 (129) | В U+0412 (130) | Г U+0413 (131) | Д U+0414 (132) | Е U+0415 (133) | Ж U+0416 (134) | З U+0417 (135) | И U+0418 (136) | Й U+0419 (137) | К U+041A (138) | Л U+041B (139) | М U+041C (140) | Н U+041D (141) | О U+041E (142) | П U+041F (143)
9x | Р U+0420 (144) | С U+0421 (145) | Т U+0422 (146) | У U+0423 (147) | Ф U+0424 (148) | Х U+0425 (149) | Ц U+0426 (150) | Ч U+0427 (151) | Ш U+0428 (152) | Щ U+0429 (153) | Ъ U+042A (154) | Ы U+042B (155) | Ь U+042C (156) | Э U+042D (157) | Ю U+042E (158) | Я U+042F (159)
Ax | а U+0430 (160) | б U+0431 (161) | в U+0432 (162) | г U+0433 (163) | д U+0434 (164) | е U+0435 (165) | ж U+0436 (166) | з U+0437 (167) | и U+0438 (168) | й U+0439 (169) | к U+043A (170) | л U+043B (171) | м U+043C (172) | н U+043D (173) | о U+043E (174) | п U+043F (175)
Bx | р U+0440 (176) | с U+0441 (177) | т U+0442 (178) | у U+0443 (179) | ф U+0444 (180) | х U+0445 (181) | ц U+0446 (182) | ч U+0447 (183) | ш U+0448 (184) | щ U+0449 (185) | ъ U+044A (186) | ы U+044B (187) | ь U+044C (188) | э U+044D (189) | ю U+044E (190) | я U+044F (191)
Cx | └ U+2514 (192) | ┴ U+2534 (193) | ┬ U+252C (194) | ├ U+251C (195) | ─ U+2500 (196) | ┼ U+253C (197) | ╣ U+2563 (198) | ║ U+2551 (199) | ╚ U+255A (200) | ╔ U+2554 (201) | ╩ U+2569 (202) | ╦ U+2566 (203) | ╠ U+2560 (204) | ═ U+2550 (205) | ╬ U+256C (206) | ┐ U+2510 (207)
Dx | ░ U+2591 (208) | ▒ U+2592 (209) | ▓ U+2593 (210) | │ U+2502 (211) | ┤ U+2524 (212) | № U+2116 (213) | § U+00A7 (214) | ╗ U+2557 (215) | ╝ U+255D (216) | ┘ U+2518 (217) | ┌ U+250C (218) | █ U+2588 (219) | ▄ U+2584 (220) | ▌ U+258C (221) | ▐ U+2590 (222) | ▀ U+2580 (223)
Ex | α U+03B1 (224) | ß[nb 1] U+00DF (225) | Γ U+0393 (226) | π U+03C0 (227) | Σ[nb 2] U+03A3 (228) | σ U+03C3 (229) | µ[nb 3] U+00B5 (230) | τ U+03C4 (231) | Φ U+03A6 (232) | Θ U+0398 (233) | Ω[nb 4] U+03A9 (234) | δ U+03B4 (235) | ∞ U+221E (236) | φ U+03C6 (237) | ε[nb 5] U+03B5 (238) | ∩ U+2229 (239)
Fx | ≡ U+2261 (240) | ± U+00B1 (241) | ≥ U+2265 (242) | ≤ U+2264 (243) | ⌠ U+2320 (244) | ⌡ U+2321 (245) | ÷ U+00F7 (246) | ≈ U+2248 (247) | ° U+00B0 (248) | ∙ U+2219 (249) | · U+00B7 (250) | √ U+221A (251) | ⁿ U+207F (252) | ² U+00B2 (253) | ■ U+25A0 (254) | NBSP U+00A0 (255)
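
As an illustration of how the table above can be turned into a conversion routine, the following Python sketch transcribes the code points into a lookup string. The names MIK_HIGH, mik_decode and mik_encode are invented for this example and do not belong to any library; the sketch assumes that bytes below 128 are plain ASCII, as stated above.

```python
# Bytes 0x80-0xBF: the 64 Cyrillic letters U+0410..U+044F, in order.
_CYRILLIC = list(range(0x0410, 0x0450))

# Bytes 0xC0-0xFF, transcribed row by row from the table.
_SYMBOLS = [
    # Cx
    0x2514, 0x2534, 0x252C, 0x251C, 0x2500, 0x253C, 0x2563, 0x2551,
    0x255A, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256C, 0x2510,
    # Dx
    0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0x2116, 0x00A7, 0x2557,
    0x255D, 0x2518, 0x250C, 0x2588, 0x2584, 0x258C, 0x2590, 0x2580,
    # Ex
    0x03B1, 0x00DF, 0x0393, 0x03C0, 0x03A3, 0x03C3, 0x00B5, 0x03C4,
    0x03A6, 0x0398, 0x03A9, 0x03B4, 0x221E, 0x03C6, 0x03B5, 0x2229,
    # Fx
    0x2261, 0x00B1, 0x2265, 0x2264, 0x2320, 0x2321, 0x00F7, 0x2248,
    0x00B0, 0x2219, 0x00B7, 0x221A, 0x207F, 0x00B2, 0x25A0, 0x00A0,
]

# Index i of this string holds the character for byte 0x80 + i.
MIK_HIGH = "".join(chr(cp) for cp in _CYRILLIC + _SYMBOLS)
assert len(MIK_HIGH) == 128

# Reverse mapping for the encoding direction (hypothetical helper).
_ENCODE = {ch: 0x80 + i for i, ch in enumerate(MIK_HIGH)}


def mik_decode(data: bytes) -> str:
    """Decode MIK bytes; bytes below 0x80 are taken as ASCII."""
    return "".join(chr(b) if b < 0x80 else MIK_HIGH[b - 0x80] for b in data)


def mik_encode(text: str) -> bytes:
    """Encode text as MIK; raise ValueError for unmapped characters."""
    out = bytearray()
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ord(ch))
        elif ch in _ENCODE:
            out.append(_ENCODE[ch])
        else:
            raise ValueError(f"{ch!r} has no MIK equivalent")
    return bytes(out)


print(mik_decode(bytes([0x8C, 0x88, 0x8A])))  # prints МИК
```

The Cyrillic block is generated from a range rather than typed out, because the table maps bytes 128 to 191 onto consecutive code points starting at U+0410.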

Notes for implementors of mapping tables to Unicode

Implementors of mapping tables to Unicode should note that the MIK code page unifies some characters:

  1. 0xE1 is both the German sharp S (U+00DF, ß) and the Greek lowercase beta (U+03B2, β);
  2. 0xE4 is both the n-ary summation sign (U+2211, ∑) and the Greek uppercase sigma (U+03A3, Σ);
  3. 0xE6 is both the micro sign (U+00B5, µ) and the Greek lowercase mu (U+03BC, μ);
  4. 0xEA is both the Ohm sign (U+2126, Ω) and the Greek uppercase omega (U+03A9, Ω);
  5. 0xEE is both the element-of sign (U+2208, ∈) and the Greek lowercase epsilon (U+03B5, ε).
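
One way a mapping table might handle these unified characters is sketched below in Python: decoding picks the code point listed in the character table, while encoding accepts either reading. The names UNIFIED, ENCODE_ALIASES and encode_unified are hypothetical, and treating the table's code point as the preferred one is an assumption of this sketch, not a rule stated by the sources.

```python
# Byte value -> (code point used in the character table, alternative reading),
# taken from the five notes above.
UNIFIED = {
    0xE1: ("\u00DF", "\u03B2"),  # sharp s / Greek small beta
    0xE4: ("\u03A3", "\u2211"),  # Greek capital sigma / n-ary summation
    0xE6: ("\u00B5", "\u03BC"),  # micro sign / Greek small mu
    0xEA: ("\u03A9", "\u2126"),  # Greek capital omega / ohm sign
    0xEE: ("\u03B5", "\u2208"),  # Greek small epsilon / element of
}

# Encoding direction: both readings of a pair map to the same byte.
ENCODE_ALIASES = {ch: byte for byte, pair in UNIFIED.items() for ch in pair}


def encode_unified(ch):
    """Return the MIK byte for a unified character, or None if not listed."""
    return ENCODE_ALIASES.get(ch)


def decode_unified(byte):
    """Return the table's preferred character for a unified byte, or None."""
    pair = UNIFIED.get(byte)
    return pair[0] if pair else None
```

With this arrangement a round trip through MIK is lossy for the alternative readings: U+03B2 encodes to 0xE1 but decodes back as U+00DF.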

Binary character manipulations

The MIK code page keeps all Cyrillic letters in alphabetical order, which makes character manipulation in binary form very easy:

10xx xxxx - is a Cyrillic Letter

100x xxxx - is an Upper-case Cyrillic Letter

101x xxxx - is a Lower-case Cyrillic Letter

With this layout, character-testing and case-conversion functions such as IsAlpha(), IsUpper(), IsLower(), ToUpper() and ToLower() become bit operations, and sorting is done by simple comparison of character values.
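
The bit patterns above can be sketched in Python as follows. The function names are chosen for this example only, the functions work on single byte values (integers 0 to 255), and they cover Cyrillic letters only; ASCII letters would need their own checks.

```python
CASE_BIT = 0x20  # the one bit that differs between 100x xxxx and 101x xxxx


def is_cyrillic(b: int) -> bool:
    """10xx xxxx: any Cyrillic letter (0x80-0xBF)."""
    return (b & 0xC0) == 0x80


def is_upper(b: int) -> bool:
    """100x xxxx: upper-case Cyrillic letter (0x80-0x9F)."""
    return (b & 0xE0) == 0x80


def is_lower(b: int) -> bool:
    """101x xxxx: lower-case Cyrillic letter (0xA0-0xBF)."""
    return (b & 0xE0) == 0xA0


def to_upper(b: int) -> int:
    """Clear the case bit of a lower-case Cyrillic letter."""
    return b & ~CASE_BIT & 0xFF if is_lower(b) else b


def to_lower(b: int) -> int:
    """Set the case bit of an upper-case Cyrillic letter."""
    return b | CASE_BIT if is_upper(b) else b


def upper_bytes(data: bytes) -> bytes:
    """Upper-case every Cyrillic letter in a MIK byte string."""
    return bytes(to_upper(b) for b in data)


# Python compares bytes objects by byte value, so sorting MIK-encoded
# strings of one case needs no collation table.
words = [bytes([0x81, 0x80]), bytes([0x80, 0x81])]
assert sorted(words) == [bytes([0x80, 0x81]), bytes([0x81, 0x80])]
```

Note that plain byte comparison places every upper-case letter before every lower-case one, so a case-insensitive sort would first pass both strings through upper_bytes.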

References

  1. "Pravetz 16". Archived from the original on 2016-12-06. Retrieved 2016-12-06.
  2. da Cruz, Frank (2010-04-02). "Kermit and MIME Character-Set Names". The Kermit Project. Columbia University, New York, USA. Archived from the original on 2016-12-03. Retrieved 2016-12-02.
  3. "Kermit 95 - Cyrillic Character Sets".
  4. http://www.columbia.edu/kermit/ftp/charsets/cp856.txt
  5. Czyborra, Roman (1998-11-30) [1998-05-25]. "The Cyrillic Charset Soup". Archived from the original on 2016-12-03. Retrieved 2016-12-03.
  6. Hohlov, Yu. E. "Cyrillic Information Representation in Electronic Form - Character Set (Code Page) Tables". Archived from the original on 2016-12-05. Retrieved 2016-12-05.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.