TACTIS, tactis - A character encoding system (codeset) for
Thai.
The TACTIS (Thai API Consortium/Thai Industrial Standard)
codeset consists of the following two character sets:
    ASCII (ISO 646-1983)
    TIS 620-2533
Each character in either set is coded in a single 8-bit
byte, with code values ranging from 00 to FF (hexadecimal).
ASCII Characters
In the TACTIS codeset, all ASCII characters are implemented
as single-byte, 7-bit characters; that is, the most
significant bit (MSB) of an ASCII character is always
clear (0). For more information, refer to ascii(5).
TIS 620-2533 Characters
The TIS 620-2533 character set includes 88 characters that
are categorized as follows:
    Consonants: 44
    Vowels: 18 total (5 leading vowels, 6 following vowels,
    2 below vowels, and 5 above vowels)
    Tone marks: 4
    Diacritics: 5 (4 above diacritics and 1 below diacritic)
    Noncomposables: 17 (1 nobreak space, 10 Thai digits, and
    6 Thai special characters)
Note
Thai digits are not recognized by the isdigit(),
iswdigit(), isxdigit(), iswxdigit(), isalnum(), and
iswalnum() functions. Many applications assume that a
digit character can be converted to its numeric
equivalent in a fixed way (for example, by subtracting
the code for the character 0). Changing these functions
to recognize Thai digits would break such applications.
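
Because the classification functions do not treat Thai digits as digits, an application that wants their numeric values has to map them explicitly. The following is a minimal sketch in Python (chosen only for illustration; it is not part of the operating system), assuming the Thai digits occupy codes F0 to F9 as listed in the code range table on this page. The function names are hypothetical.

```python
THAI_ZERO = 0xF0   # code of THAI ZERO in the TACTIS codeset
THAI_NINE = 0xF9   # code of THAI NINE in the TACTIS codeset


def thai_digit_value(code: int) -> int:
    """Return the numeric value (0-9) of a single Thai digit code."""
    if THAI_ZERO <= code <= THAI_NINE:
        return code - THAI_ZERO
    raise ValueError("not a Thai digit code: 0x%02X" % code)


def thai_digits_to_ascii(data: bytes) -> bytes:
    """Replace every Thai digit byte with the matching ASCII digit.

    All other bytes are copied through unchanged.
    """
    return bytes(
        b - THAI_ZERO + ord("0") if THAI_ZERO <= b <= THAI_NINE else b
        for b in data
    )
```

After this mapping, code that expects ASCII digits (for example, a call such as int(thai_digits_to_ascii(data))) can process the result without any change to the digit classification functions themselves.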
Code Ranges in the TACTIS Codeset
In the TACTIS codeset, the most significant bit (MSB) of a
byte is set (1) in the codes for TIS 620-2533 characters.
This distinguishes TIS 620-2533 character codes from ASCII
character codes.
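
The MSB rule can be sketched as a one-line test. The Python fragment below is illustrative only; the function names are hypothetical. Note that the test looks at the MSB alone, so it also reports bytes in the range 80 to 9F, which this page does not assign to any character, as non-ASCII.

```python
from itertools import groupby
from typing import List, Tuple


def has_msb_set(code: int) -> bool:
    """True when the most significant bit of an 8-bit code is set."""
    return (code & 0x80) != 0


def split_by_charset(data: bytes) -> List[Tuple[str, bytes]]:
    """Split a byte string into consecutive ASCII and TIS 620-2533 runs."""
    return [
        ("TIS620" if msb else "ASCII", bytes(run))
        for msb, run in groupby(data, key=has_msb_set)
    ]
```

For instance, a string that mixes Latin letters with Thai consonants is returned as alternating ASCII and TIS620 runs, each of which can then be handled by the appropriate code path.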
Following are the code ranges for each of the five categories
of Thai characters in the codeset:
--------------------------------------------------
Category                    Code Range (hex)
--------------------------------------------------
Consonants                  A1 to CE (except C4, C6)
Leading vowels              E0 to E4
Normal following vowels     D0, D2, D3, E5
Special following vowels    C4, C6
Below vowels                D8, D9
Above vowels                D1, D4 to D7
Tone marks                  E8 to EB
Above diacritics            E7, EC to EE
Below diacritics            DA
Nobreak space               A0
Thai digits                 F0 to F9
Thai special characters     CF, DF, E6, EF, FA, FB
--------------------------------------------------
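
The table above can be turned into a lookup that classifies any byte. The following Python sketch is illustrative only; the category strings and the classify() helper are hypothetical names. It assumes that C4 and C6 belong to the special following vowels rather than the consonants, which is what makes the consonant count come out to 44.

```python
# Rows of the code range table. Later rows override earlier ones, so the
# special following vowels (C4, C6) replace the consonant entry for those codes.
_TABLE = [
    ("consonant", range(0xA1, 0xCF)),                    # A1 to CE
    ("leading vowel", range(0xE0, 0xE5)),                # E0 to E4
    ("normal following vowel", (0xD0, 0xD2, 0xD3, 0xE5)),
    ("special following vowel", (0xC4, 0xC6)),
    ("below vowel", (0xD8, 0xD9)),
    ("above vowel", (0xD1, 0xD4, 0xD5, 0xD6, 0xD7)),
    ("tone mark", range(0xE8, 0xEC)),                    # E8 to EB
    ("above diacritic", (0xE7, 0xEC, 0xED, 0xEE)),
    ("below diacritic", (0xDA,)),
    ("nobreak space", (0xA0,)),
    ("Thai digit", range(0xF0, 0xFA)),                   # F0 to F9
    ("Thai special character", (0xCF, 0xDF, 0xE6, 0xEF, 0xFA, 0xFB)),
]

CATEGORY = {}
for _name, _codes in _TABLE:
    for _code in _codes:
        CATEGORY[_code] = _name


def classify(code: int) -> str:
    """Return the category of one 8-bit code in the TACTIS codeset."""
    if not 0 <= code <= 0xFF:
        raise ValueError("not an 8-bit code: %r" % (code,))
    if code < 0x80:
        return "ASCII"
    return CATEGORY.get(code, "unassigned")
```

A dictionary built once from the table keeps each lookup constant-time and makes the table itself the single place to check against this page.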
In TACTIS, the hexadecimal code points of TIS 620-2533
characters are as follows:
A0  NO-BREAK SPACE    C0  PO SAMPOW       E0  SARA E
A1  KO KAI            C1  MO MA           E1  SARA AE
A2  KHO KHAI          C2  YO YAK          E2  SARA O
A3  KHO KHUAT         C3  RO RUA          E3  SARA AI MAIMUAN
A4  KHO KHWAI         C4  RU              E4  SARA AI MAIMALAI
A5  KHO KHON          C5  LO LING         E5  LAKKHANGYAO
A6  KHO RAKHANG       C6  LU              E6  MAIYAMOK
A7  NGO NGU           C7  WO WAEN         E7  MAITAIKHU
A8  CHO CHAN          C8  SO SALA         E8  MAI EK
A9  CHO CHING         C9  SO RUSI         E9  MAI THO
AA  CHO CHANG         CA  SO SUA          EA  MAI TRIE
AB  SO SO             CB  HO HEEP         EB  MAI CHATTAWA
AC  CHO CHOE          CC  LO CHULA        EC  THANTHAKHAT
AD  YO YING           CD  O ANG           ED  NIKHANHIT
AE  DO CHADA          CE  HO NOKHUK       EE  YAMAKKAN
AF  TO PATAK          CF  PAIYANNOI       EF  FONGMAN
B0  THO THO THAN      D0  SARA A          F0  THAI ZERO
B1  THO NANGMONTHO    D1  MAI HAN-AKAT    F1  THAI ONE
B2  THO PHOO THAO     D2  SARA AA         F2  THAI TWO
B3  NOR NANE          D3  SARA AM         F3  THAI THREE
B4  DOR DEK           D4  SARA E          F4  THAI FOUR
B5  TO TAO            D5  SARA EE         F5  THAI FIVE
B6  THO THUNG         D6  SARA UR         F6  THAI SIX
B7  THO THAHAN        D7  SARA UUR        F7  THAI SEVEN
B8  THO THONG         D8  SARA U          F8  THAI EIGHT
B9  NO NU             D9  SARA UU         F9  THAI NINE
BA  BO BAIMAI         DA  PHINTHU         FA  ANGKHANKHU
BB  PO PLA            DB                  FB  KHOMUT
BC  PHO PERNG         DC                  FC
BD  FO FA             DD                  FD
BE  PO PAN            DE                  FE
BF  FO FAN            DF  BAHT            FF
Codes shown without a name are not assigned to a character.
For more information on Thai characters, refer to
Wototo(5).
Fonts for TIS 620-2533
The operating system provides both screen and printer
fonts for TIS 620-2533 characters.
The following bitmap fonts reflect various sizes and typefaces
for 75dpi and 100dpi display devices:
    -adecw-screen-medium-r-normal--14-140-75-75-p-70-tis620.2533-1
    -adecw-screen-medium-r-normal--18-180-75-75-p-80-tis620.2533-1
    -adecw-screen-medium-r-normal--24-240-75-75-p-120-tis620.2533-1
    -adecw-screen-medium-r-normal--14-140-100-100-p-70-tis620.2533-1
    -adecw-screen-medium-r-normal--18-180-100-100-p-80-tis620.2533-1
    -adecw-screen-medium-r-normal--24-240-100-100-p-120-tis620.2533-1
The operating system provides the following Thai fonts for
PostScript printers:
    AngsanaUPC-Bold AngsanaUPC-BoldItalic AngsanaUPC-Italic AngsanaUPC-Light
    CordiaUPC-Bold CordiaUPC-BoldItalic CordiaUPC-Italic CordiaUPC-Light
    EucrosiaUPC-Bold EucrosiaUPC-BoldItalic EucrosiaUPC-Italic EucrosiaUPC-Light
    FreesiaUPC-Bold FreesiaUPC-BoldItalic FreesiaUPC-Italic FreesiaUPC-Light
    IrisUPC-Bold IrisUPC-BoldItalic IrisUPC-Italic IrisUPC-Light
    JasmineUPC-Bold JasmineUPC-BoldItalic JasmineUPC-Italic JasmineUPC-Light
    KodchiangUPC-Bold KodchiangUPC-BoldItalic KodchiangUPC-Italic KodchiangUPC-Light
    LilyUPC-Bold LilyUPC-BoldItalic LilyUPC-Italic LilyUPC-Light
    WaterlilyUPC-Bold WaterlilyUPC-BoldItalic WaterlilyUPC-Italic WaterlilyUPC-Light
    YuccaUPC-Bold YuccaUPC-BoldItalic YuccaUPC-Italic YuccaUPC-Light
For general information on printing Asian language text,
refer to i18n_printing(5).
Codeset Conversion
The following converter pairs are available for converting
data between TACTIS and other encoding formats. Refer to
iconv_intro(5) for an introduction to codeset conversion.
For more information about the other codeset for which
TACTIS is the input or output, see the reference page
specified in the list item.
    cp874_TACTIS, TACTIS_cp874
        Converting from and to PC code page 874: code_page(5)
    UTF-16_TACTIS, TACTIS_UTF-16
        Converting from and to UTF-16: Unicode(5)
    UCS-4_TACTIS, TACTIS_UCS-4
        Converting from and to UCS-4: Unicode(5)
    UTF-8_TACTIS, TACTIS_UTF-8
        Converting from and to UTF-8: Unicode(5)
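
To show what such a conversion does to the bytes, here is a minimal sketch that uses Python's built-in codecs in place of the converters listed above. It does not call those converters. It assumes that Python's "tis_620" codec agrees with the TACTIS codeset for the assigned Thai codes; whether the two agree for every code (for example, A0) is not verified here. The convert() helper is a hypothetical name.

```python
def convert(data: bytes, from_code: str, to_code: str) -> bytes:
    """Re-encode data from one codeset to another by way of Unicode.

    from_code and to_code are Python codec names such as "tis_620",
    "cp874", "utf-8", or "utf-16".
    """
    return data.decode(from_code).encode(to_code)


# SARA AI MAIMALAI, THO THAHAN, YO YAK (codes E4, B7, C2)
thai_word = b"\xe4\xb7\xc2"

as_utf8 = convert(thai_word, "tis_620", "utf-8")
round_trip = convert(as_utf8, "utf-8", "tis_620")
```

Each Thai character occupies one byte in the TACTIS form and three bytes in UTF-8, so as_utf8 is nine bytes long and round_trip equals the original three bytes.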
Commands: locale(1)
Others: code_page(5), ascii(5), i18n_intro(5), i18n_printing(5), iconv_intro(5), Thai(5), Unicode(5), Wototo(5)
TACTIS(5)