Wototo(5)

NAME

       Wototo,  wototo  - Introduction to the Thai language standard

DESCRIPTION

       Wototo is the Thai language software standard. It
       describes Thai characters and their classifications. This
       standard also describes the methods used to input and
       output Thai characters.

   Thai Character Sets
       The following two character sets are defined for the Thai
       language:

       o  Basic character set

       o  Auxiliary character set

       In the basic character set, characters are 8-bit coded and
       have values from 0 to 255 (00 to FF hexadecimal). Character
       values, given here in hexadecimal, correspond to the
       characters defined in standards as follows:

       o  Values 00 to 7F correspond to characters from the ISO
          646-1983 standard.

       o  Values A1 to FB (except for DB, DD, and DE) correspond
          to characters from the TIS 620-2533 standard.

       o  Remaining values are reserved for future use.
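
       The value ranges above can be expressed as a small lookup.
       The following Python sketch is illustrative only: the
       function name basic_set_origin and the returned labels are
       assumptions made for this example and are not part of the
       Wototo standard.

```python
# Hexadecimal values excluded from the TIS 620-2533 range, as quoted above.
TIS_EXCLUDED = {0xDB, 0xDD, 0xDE}


def basic_set_origin(value: int) -> str:
    """Name the standard that defines an 8-bit basic character set value.

    Hypothetical helper: it only mirrors the ranges given in the text.
    """
    if not 0x00 <= value <= 0xFF:
        raise ValueError("basic character set values are 8-bit (0 to 255)")
    if value <= 0x7F:
        return "ISO 646-1983"
    if 0xA1 <= value <= 0xFB and value not in TIS_EXCLUDED:
        return "TIS 620-2533"
    # Everything else is described as reserved for future use.
    return "reserved"
```

       For example, basic_set_origin(0x41) names the ISO 646-1983
       standard, while basic_set_origin(0xDB) reports a reserved
       value.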

       The encoded form of the basic character set is called the
       TACTIS codeset, which is discussed in the TACTIS(5)
       reference page.

       Characters in the auxiliary character set use the code
       values 32 to 126 and 161 to 254 only. The Wototo standard
       specifies that implementations provide at least one
       auxiliary character set.

   Character Classification
       In the TACTIS codeset, characters are organized into
       different classes. This classification is done only to
       facilitate processing and is not related to Thai linguistic
       or grammatical rules. The codeset contains the following
       character classes:

       CTRL
              Nondisplayable characters that are used for
              controlling output or data communication. The
              sixty-six control character values are: 00 to 1F,
              7F, 80 to 9F, and FF.

       CONS
              The Thai consonants as defined in TIS 620-2533.

       LV
              The five leading vowels as defined in TIS 620-2533.

       FV
              The six following vowels as defined in TIS 620-2533.

       BV
              The two below vowels as defined in TIS 620-2533.

       AV
              The five above vowels as defined in TIS 620-2533.

       TONE
              The four tone marks as defined in TIS 620-2533.

       AD
              The four above diacritics as defined in TIS
              620-2533.

       BD
              The below diacritic as defined in TIS 620-2533.

       NON
              Those characters that do not fit into the preceding
              five character classes. This group includes 119
              characters that users cannot compose with above
              vowels, below vowels, tone marks, and above and
              below diacritics. Non-composable characters are
              divided into the following seven groups:

              Graphic characters
                     The 94 graphic characters defined in ISO
                     646-1983. These include:

                     o  52 English alphabetic characters

                     o  10 digits

                     o  32 special characters whose values are 21
                        to 2F, 3A to 40, 5B to 60, and 7B to 7E

              Space
                     Character code value is 20.

              Nobreak space
                     Character code value is A0.

              Thai digits
                     The 10 Thai digits as defined in TIS
                     620-2533.

              Thai special characters
                     The 6 Thai special characters as defined in
                     TIS 620-2533.

              Reserved code points
                     6 code points reserved for future use.

       To better describe Thai input and output methods,
       characters in the classes FV, BV, AV, and AD are further
       divided into subclasses. The following list describes
       character classes and subclasses by the number of
       characters in the class and their encoded values:

       CTRL
              Number: 66
              Values: 00 to 1F, 7F, 80 to 9F, and FF

       NON
              Number: 119
              Values:
              20 to 7E (ISO 646-1983 character codes)
              A0, CF, DC, DF, E6, EF, F0 to F9, FA, and FB (TIS
              620-2533 character codes)
              DB, DD, DE, FC, FD, and FE (Reserved code points)

       CONS
              Number: 44
              Values: A1 to C3, C5, and C7 to CE

       LV
              Number: 5
              Values: E0, E1, E2, E3, and E4

       FV1
              Number: 3
              Values: D0, D2, and D3

       FV2
              Number: 1
              Value: E5

       FV3
              Number: 2
              Values: C4 and C6
              These two characters also behave as leading vowels
              (LV) in the character sequence LV+CONS.

       BV1
              Number: 1
              Value: D8

       BV2
              Number: 1
              Value: D9

       BD
              Number: 1
              Value: DA

       TONE
              Number: 4
              Values: E8, E9, EA, and EB

       AD1
              Number: 2
              Values: EC and ED

       AD2
              Number: 1
              Value: E7

       AD3
              Number: 1
              Value: EE

       AV1
              Number: 1
              Value: D4

       AV2
              Number: 2
              Values: D1 and D6

       AV3
              Number: 2
              Values: D5 and D7
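
       The counts and values in this list can be checked
       mechanically. The following Python sketch builds a lookup
       from each 8-bit value to its class or subclass. It assumes
       that the entries pair with the class names CTRL, NON, CONS,
       LV, FV1 to FV3, BV1, BV2, BD, TONE, AD1 to AD3, and AV1 to
       AV3 in that order; the names span, WTT_CLASSES, CLASS_OF,
       and wtt_class are hypothetical and exist only for this
       example.

```python
def span(first: int, last: int) -> list:
    """Inclusive range of code values."""
    return list(range(first, last + 1))


# Class or subclass name -> encoded values (hexadecimal), as listed above.
# The pairing of names with entries is an assumption based on list order.
WTT_CLASSES = {
    "CTRL": span(0x00, 0x1F) + [0x7F] + span(0x80, 0x9F) + [0xFF],
    "NON": (span(0x20, 0x7E)
            + [0xA0, 0xCF, 0xDC, 0xDF, 0xE6, 0xEF]
            + span(0xF0, 0xF9) + [0xFA, 0xFB]
            + [0xDB, 0xDD, 0xDE, 0xFC, 0xFD, 0xFE]),
    "CONS": span(0xA1, 0xC3) + [0xC5] + span(0xC7, 0xCE),
    "LV": span(0xE0, 0xE4),
    "FV1": [0xD0, 0xD2, 0xD3],
    "FV2": [0xE5],
    "FV3": [0xC4, 0xC6],
    "BV1": [0xD8],
    "BV2": [0xD9],
    "BD": [0xDA],
    "TONE": span(0xE8, 0xEB),
    "AD1": [0xEC, 0xED],
    "AD2": [0xE7],
    "AD3": [0xEE],
    "AV1": [0xD4],
    "AV2": [0xD1, 0xD6],
    "AV3": [0xD5, 0xD7],
}

# Invert the table, refusing any value that is listed twice.
CLASS_OF = {}
for name, values in WTT_CLASSES.items():
    for value in values:
        if value in CLASS_OF:
            raise ValueError(f"{value:02X} is listed in two classes")
        CLASS_OF[value] = name


def wtt_class(value: int) -> str:
    """Return the class or subclass name for an 8-bit code value."""
    return CLASS_OF[value]
```

       Under that pairing, the seventeen entries cover all 256
       code values exactly once, which agrees with the counts
       given in the list.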

   Character Levels
       Thai characters are classified according to different
       display levels (relative to the baseline, and
       nondisplayable). Classification by display levels
       facilitates the character input procedures. There are five
       character classification levels. Four levels include
       displayable characters and one level includes
       nondisplayable characters, as follows:

       Nondisplayable level
              Includes all control characters in the CTRL class.

       Base level
              Includes all characters in the NON, CONS, FV, and
              LV classes. Characters at this level are drawn on
              the baseline.

       Above level
              Includes all characters in the AD3, AV1, AV2, and
              AV3 classes. Characters at this level are drawn
              immediately above final consonants.

       Below level
              Includes all characters in the BV1, BV2, and BD
              classes. Characters at this level are drawn
              immediately below final consonants.

       Top level
              Includes all characters in the TONE, AD1, and AD2
              classes. Characters at this level are drawn on top
              of the characters at the above level. If above
              level characters do not exist, top level characters
              are drawn at the above level. Characters at this
              level also indicate the end of character cells.
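
       The level assignments above amount to a mapping from class
       names to levels, plus one rule for where top level
       characters are drawn. The following Python sketch shows
       both. The names LEVEL_OF_CLASS and drawn_level are
       hypothetical, the level labels reuse the constant names
       NONDISP, BASE, ABOVE, BELOW, and TOP, and the FV class is
       expanded into its subclasses FV1, FV2, and FV3.

```python
# Class or subclass name -> display level, following the list above.
LEVEL_OF_CLASS = {
    "CTRL": "NONDISP",
    "NON": "BASE", "CONS": "BASE", "LV": "BASE",
    "FV1": "BASE", "FV2": "BASE", "FV3": "BASE",
    "AD3": "ABOVE", "AV1": "ABOVE", "AV2": "ABOVE", "AV3": "ABOVE",
    "BV1": "BELOW", "BV2": "BELOW", "BD": "BELOW",
    "TONE": "TOP", "AD1": "TOP", "AD2": "TOP",
}


def drawn_level(char_class: str, cell_has_above: bool) -> str:
    """Return the level at which a character is actually drawn.

    A top level character drops to the above level when its cell
    contains no above level character.
    """
    level = LEVEL_OF_CLASS[char_class]
    if level == "TOP" and not cell_has_above:
        return "ABOVE"
    return level
```

       For example, a tone mark that follows a bare consonant is
       drawn at the above level, while the same tone mark after an
       above vowel stays at the top level.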

       The standard specifies that the properties of Thai
       characters can be tested by using the following functions.

                                   Note

       These functions are not implemented in Tru64 UNIX.

       o  Determines the character level class that the character
          belongs to and returns the numeric value 0, 1, 2, 3, or
          4. These return values can be represented by the
          constants NONDISP, TOP, ABOVE, BASE, or BELOW,
          respectively.

       o  Returns TRUE if a character is alphabetic.

       o  Returns TRUE if a character is either alphabetic or a
          digit.

       o  Returns TRUE if a character belongs to the CTRL class.

       o  Returns TRUE if the character is a digit.

       o  Returns TRUE if the character is not in the NONDISP
          level class.

       o  Returns TRUE if the character is an English lowercase
          letter (a to z).

       o  Returns TRUE if the character is an English uppercase
          letter (A to Z).

       o  Returns TRUE if a character is not in the NONDISP level
          class.

       o  Returns TRUE if the character is a space, formfeed,
          newline, return, tab, or vertical tab.

       o  Returns TRUE if the character is a hexadecimal digit 0
          to 9, A to F, or a to f. (Thai digits are excluded.)

   Thai Input Methods
       The input method for Thai characters directly maps
       characters to keys, as for English. Thai character
       sequences are entered character by character and are
       displayed from left to right, regardless of whether the
       sequence includes forward characters (characters in the
       NON, CONS, LV, FV1, FV2, and FV3 classes) or dead
       characters (characters in all other classes). However, the
       following basic rules apply to the character input
       sequence:

       o  Every display cell must begin with a character on the
          baseline (in the BASE class).

       o  A character in the BASE class that is also in the CONS
          class may be followed by an above vowel, a below vowel,
          a tone mark, a below diacritic, or an above diacritic.
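
       The two basic rules can be illustrated with a short Python
       sketch that splits a TACTIS byte sequence into display
       cells. It is a simplification under stated assumptions, not
       the input method defined by the standard: the name
       split_cells is hypothetical, control characters are simply
       treated as starting their own cell, a dead character after
       a base character that is not a consonant is rejected, and
       the ordering of dead characters within a cell is not
       checked.

```python
# Dead characters: all BV, BD, TONE, AD, and AV values from the class list.
DEAD = frozenset([0xD1] + list(range(0xD4, 0xDB)) + list(range(0xE7, 0xEF)))

# Consonants (CONS class): A1 to C3, C5, and C7 to CE.
CONS = frozenset(list(range(0xA1, 0xC4)) + [0xC5] + list(range(0xC7, 0xCF)))


def split_cells(data: bytes) -> list:
    """Split a byte sequence into display cells, enforcing the basic rules.

    Raises ValueError when a cell would start with a dead character or
    when a dead character follows a base character that is not a consonant.
    """
    cells = []
    for value in data:
        if value not in DEAD:
            # Any other character opens a new display cell.
            cells.append([value])
        elif not cells:
            raise ValueError("a display cell must begin on the baseline")
        elif cells[-1][0] not in CONS:
            raise ValueError("only a consonant may carry a dead character")
        else:
            cells[-1].append(value)
    return [bytes(cell) for cell in cells]
```

       For example, the sequence A1 D4 E8 A2 yields two cells: the
       first holds a consonant with an above vowel and a tone
       mark, and the second holds a single consonant.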

       For more detailed input sequence rules, refer to the Draft
       Industrial Standard - Thai Language Software Standard
       WTT2.0 (Part 2: Thai Input and Output Methods).

SEE ALSO

       Commands: locale(1)

       Others:  i18n_intro(5),  i18n_printing(5),  l10n_intro(5),
       TACTIS(5), Thai(5)


