utf2 - FreeBSD

UTF2(5)

NAME

     utf2 -- Universal character set Transformation Format encoding of wide
     characters

SYNOPSIS

     ENCODING "UTF2"

DESCRIPTION

     The UTF2 encoding has been deprecated in favour of UTF-8.  New
     applications should not use UTF2.

     The UTF2 encoding is based on a proposed X-Open multibyte FSS-UCS-TF
     (File System Safe Universal Character Set Transformation Format) encoding
     as used in Plan 9 from Bell Labs.  Although it is capable of representing
     more than 16 bits, the current implementation is limited to 16 bits as
     defined by the Unicode Standard.

     The UTF2 representation is backwards compatible with ASCII, so 0x00-0x7f
     refer to the ASCII character set.  The multibyte encodings of wide
     characters between 0x0080 and 0xffff consist entirely of bytes whose
     high order bit is set.  The actual encoding is represented by the
     following table:

     [0x0000 - 0x007f] [00000000.0bbbbbbb] -> 0bbbbbbb
     [0x0080 - 0x07ff] [00000bbb.bbbbbbbb] -> 110bbbbb, 10bbbbbb
     [0x0800 - 0xffff] [bbbbbbbb.bbbbbbbb] -> 1110bbbb, 10bbbbbb, 10bbbbbb
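
     The table above can be read as a small encoding routine.  The following
     is a minimal sketch only, not the FreeBSD implementation: the name
     utf2_encode and its signature are hypothetical, and it assumes the
     16-bit limit described above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libc): encode one 16-bit wide
 * character into 1 to 3 bytes, following the three rows of the table.
 * The caller supplies a buffer of at least 3 bytes.
 * Returns the number of bytes written.
 */
size_t
utf2_encode(uint16_t wc, unsigned char out[3])
{
	if (wc <= 0x007f) {
		/* [0x0000 - 0x007f] -> 0bbbbbbb */
		out[0] = (unsigned char)wc;
		return 1;
	}
	if (wc <= 0x07ff) {
		/* [0x0080 - 0x07ff] -> 110bbbbb, 10bbbbbb */
		out[0] = (unsigned char)(0xc0 | (wc >> 6));
		out[1] = (unsigned char)(0x80 | (wc & 0x3f));
		return 2;
	}
	/* [0x0800 - 0xffff] -> 1110bbbb, 10bbbbbb, 10bbbbbb */
	out[0] = (unsigned char)(0xe0 | (wc >> 12));
	out[1] = (unsigned char)(0x80 | ((wc >> 6) & 0x3f));
	out[2] = (unsigned char)(0x80 | (wc & 0x3f));
	return 3;
}
```

     Because the ranges are tested in ascending order, this sketch always
     emits the shortest form for a given value.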

     If more than a single representation of a value exists (for example,
     0x0000 can be written as 0x00, as 0xC0 0x80, or as 0xE0 0x80 0x80), the
     shortest representation is always used when encoding, but the longer
     ones will still be decoded correctly.
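
     The decoding side of this rule can be sketched as follows.  This is an
     illustrative example only, not the FreeBSD implementation: the name
     utf2_decode, its signature, and the choice to return 0 on malformed
     input are assumptions.  Longer-than-necessary sequences are accepted
     here, as described above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libc): decode one wide character
 * from the n bytes at s.  Returns the number of bytes consumed, or 0
 * if the input is empty, truncated, or malformed.  Over-long forms
 * such as 0xC0 0x80 are decoded rather than rejected.
 */
size_t
utf2_decode(const unsigned char *s, size_t n, uint16_t *wc)
{
	if (n == 0)
		return 0;
	if (s[0] < 0x80) {			/* 0bbbbbbb */
		*wc = s[0];
		return 1;
	}
	if ((s[0] & 0xe0) == 0xc0) {		/* 110bbbbb, 10bbbbbb */
		if (n < 2 || (s[1] & 0xc0) != 0x80)
			return 0;
		*wc = (uint16_t)(((s[0] & 0x1f) << 6) | (s[1] & 0x3f));
		return 2;
	}
	if ((s[0] & 0xf0) == 0xe0) {		/* 1110bbbb, 10bbbbbb, 10bbbbbb */
		if (n < 3 || (s[1] & 0xc0) != 0x80 || (s[2] & 0xc0) != 0x80)
			return 0;
		*wc = (uint16_t)(((s[0] & 0x0f) << 12) |
		    ((s[1] & 0x3f) << 6) | (s[2] & 0x3f));
		return 3;
	}
	return 0;	/* stray continuation byte or unsupported lead byte */
}
```

     With this sketch, the byte sequences 0x00, 0xC0 0x80 and 0xE0 0x80 0x80
     all decode to the value 0x0000, consuming 1, 2 and 3 bytes respectively.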

     The final three encodings provided by X-Open:

     [00000000.000bbbbb.bbbbbbbb.bbbbbbbb] ->
	     11110bbb, 10bbbbbb, 10bbbbbb, 10bbbbbb

     [000000bb.bbbbbbbb.bbbbbbbb.bbbbbbbb] ->
	     111110bb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb

     [0bbbbbbb.bbbbbbbb.bbbbbbbb.bbbbbbbb] ->
	     1111110b, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb

     which provide for the entire proposed 31-bit ISO-10646 standard, are
     currently not implemented.
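
     In every form listed in this page, the number of leading 1 bits in the
     first byte announces the total length of the sequence.  The following
     is a minimal sketch of that classification across all six forms; the
     name fss_utf_seq_len is hypothetical, and the routine does not reflect
     what the 16-bit UTF2 implementation actually accepts.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of libc): return the total sequence
 * length (1 to 6) announced by lead byte b, or 0 if b cannot start a
 * sequence (a 10bbbbbb continuation byte, or 0xfe / 0xff).
 */
size_t
fss_utf_seq_len(unsigned char b)
{
	if (b < 0x80)
		return 1;	/* 0bbbbbbb */
	if ((b & 0xe0) == 0xc0)
		return 2;	/* 110bbbbb */
	if ((b & 0xf0) == 0xe0)
		return 3;	/* 1110bbbb */
	if ((b & 0xf8) == 0xf0)
		return 4;	/* 11110bbb */
	if ((b & 0xfc) == 0xf8)
		return 5;	/* 111110bb */
	if ((b & 0xfe) == 0xfc)
		return 6;	/* 1111110b */
	return 0;
}
```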

SEE ALSO

     mklocale(1), setlocale(3), utf8(5)


FreeBSD 5.2.1		       October 11, 2002 		 FreeBSD 5.2.1

