mbrlen - get number of bytes in a multibyte character
(restartable)
Standard C Library (libc, -lc)
#include <wchar.h>
size_t
mbrlen(const char * restrict s, size_t n, mbstate_t * restrict ps);
The mbrlen() function usually determines the number of bytes in the
multibyte character pointed to by s and returns it. This function
examines at most n bytes of the array beginning at s.
mbrlen() is equivalent to the following call (except ps is evaluated only
once):
mbrtowc(NULL, s, n, (ps != NULL) ? ps : &internal);
Here, internal is an internal state object.
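
As an illustration of the call form described above, the following is a
minimal sketch that walks a buffer with a caller-supplied state object.
The helper name count_mb_chars() is hypothetical and not part of libc, and
treating an incomplete trailing sequence as an error is an assumption made
for this example only.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: count the multibyte characters in the first n
 * bytes of s, stopping at a null byte.  Returns (size_t)-1 if an invalid
 * or incomplete sequence is found.
 */
size_t
count_mb_chars(const char *s, size_t n)
{
	mbstate_t st;
	size_t count = 0;

	memset(&st, 0, sizeof(st));	/* a zeroed object is the initial state */
	while (n > 0) {
		size_t len = mbrlen(s, n, &st);

		if (len == 0)		/* reached the null byte */
			break;
		if (len == (size_t)-1 || len == (size_t)-2)
			return (size_t)-1;	/* invalid or incomplete */
		s += len;		/* skip the character just measured */
		n -= len;
		count++;
	}
	return count;
}
```

The result depends on the LC_CTYPE category in effect, so a program would
normally call setlocale(3) before using such a helper.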
In state-dependent encodings, s may point to a special byte sequence that
changes the shift state. Although such a sequence corresponds to no
individual wide-character code, it affects the conversion state object
pointed to by ps, and mbrlen() treats the special sequence as if
it were part of the subsequent multibyte character.
Unlike mblen(3), mbrlen() may accept a byte sequence that is not a complete
character but could be the beginning of a valid character. In
this case, the function accepts all such bytes and saves them
in the conversion state object pointed to by ps. They are used by the
subsequent call of this function to restart the suspended conversion.
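
The restart behaviour can be sketched by supplying input one byte at a
time, as might happen when reading from a stream. The helper name
feed_one_by_one() is hypothetical, and the choice to return the total
number of bytes consumed is an assumption of this example.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: feed buf to mbrlen() one byte at a time and
 * return the total number of bytes consumed once the first character is
 * complete.  Returns 0 for a null byte, (size_t)-1 on an invalid
 * sequence, and (size_t)-2 if len bytes were not enough.
 */
size_t
feed_one_by_one(const char *buf, size_t len)
{
	mbstate_t st;
	size_t i;

	memset(&st, 0, sizeof(st));	/* start from the initial state */
	for (i = 0; i < len; i++) {
		size_t r = mbrlen(buf + i, 1, &st);

		if (r == (size_t)-2)
			continue;	/* byte saved in st; supply the next one */
		if (r == (size_t)-1 || r == 0)
			return r;	/* invalid sequence or null byte */
		return i + r;		/* r is 1 here: this byte completed it */
	}
	return (size_t)-2;		/* ran out of input mid-character */
}
```

In a locale with a multi-byte encoding, the earlier calls for a character
would be expected to return (size_t)-2 and only the call that supplies its
final byte a positive value; for single-byte characters the first call
already completes the character.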
The behaviour of mbrlen() is affected by the LC_CTYPE category of the
current locale.
There are the following special cases:
s == NULL mbrlen() sets the conversion state object pointed to by ps
to the initial state and always returns 0. Unlike mblen(3),
the value returned does not indicate whether the current
encoding of the locale is state-dependent.
In this case, mbrlen() ignores n.
n == 0 In this case, the first n bytes of the array pointed to by s
never form a complete character. Thus, mbrlen() always
returns (size_t)-2.
ps == NULL mbrlen() uses its own internal state object to keep the
conversion state, instead of the ps mentioned in this manual
page.
Calling any other function in the Standard C Library (libc,
-lc) never changes the internal state of mbrlen(), except
for a call to setlocale(3) that changes the LC_CTYPE category of
the current locale. Such a setlocale(3) call causes the internal
state of this function to become indeterminate. This internal
state is initialized at program startup.
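
The first two special cases can be exercised with a short sketch. Both
helper names below are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: put *ps back into the initial conversion state
 * by using the s == NULL special case.  n is ignored for this call.
 */
void
reset_conversion_state(mbstate_t *ps)
{
	(void)mbrlen(NULL, 0, ps);
}

/*
 * Hypothetical helper: nonzero if mbrlen() reports (size_t)-2 when
 * asked to examine zero bytes of s.
 */
int
zero_length_is_incomplete(const char *s)
{
	mbstate_t st;

	memset(&st, 0, sizeof(st));	/* start from the initial state */
	return mbrlen(s, 0, &st) == (size_t)-2;
}
```

After reset_conversion_state(&st), mbsinit(&st) should report that st
describes the initial state. Passing an explicit state object, rather than
NULL, also avoids sharing the internal state between unrelated callers.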
mbrlen() returns:
0 s points to a null byte ('\0').
positive The value returned is the number of bytes in the valid multibyte
character pointed to by s. This value is never greater than n
or the value of the MB_CUR_MAX macro.
(size_t)-2 s points to a byte sequence that could be part of a valid
multibyte character but is incomplete. When n is
at least MB_CUR_MAX, this case can only occur if the array
pointed to by s contains a redundant shift sequence.
(size_t)-1 s points to an illegal byte sequence that does not form a valid
multibyte character. In this case, mbrlen() sets errno
to indicate the error.
mbrlen() may fail in the following cases:
[EILSEQ] s points to an invalid multibyte character.
[EINVAL] ps points an invalid or uninitialized mbstate_t
object.
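
The return values and the error case listed above might be handled along
the following lines. The enum and the helper classify_next() are
hypothetical names introduced for this sketch. Resetting the state after a
failure is a design choice of the example: ISO C leaves the conversion
state unspecified after an encoding error.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical classification of the four kinds of mbrlen() result. */
enum ch_result { CH_NUL, CH_COMPLETE, CH_INCOMPLETE, CH_INVALID };

/*
 * Hypothetical helper: examine at most n bytes at s and classify the
 * result.  On CH_COMPLETE, *lenp holds the length of the character;
 * otherwise *lenp is 0.
 */
enum ch_result
classify_next(const char *s, size_t n, mbstate_t *ps, size_t *lenp)
{
	size_t r;

	*lenp = 0;
	errno = 0;
	r = mbrlen(s, n, ps);
	if (r == (size_t)-1) {
		/* errno is expected to be EILSEQ here; start over cleanly */
		memset(ps, 0, sizeof(*ps));
		return CH_INVALID;
	}
	if (r == (size_t)-2)
		return CH_INCOMPLETE;	/* more bytes are needed */
	if (r == 0)
		return CH_NUL;		/* s points to a null byte */
	*lenp = r;
	return CH_COMPLETE;
}
```

Comparing against (size_t)-1 and (size_t)-2 before treating the result as
a length matters because both values are very large when read as unsigned
byte counts.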
mblen(3), mbrtowc(3), setlocale(3)
The mbrlen() function conforms to . The restrict qualifier is added at .
BSD February 3, 2002 BSD