regexp(3)

NAME

       advance,  advance_r,  compile,  compile_r,  step, step_r -
       Regular expression compile and match routines

SYNOPSIS

       #define INIT declarations
       #define GETC getc code
       #define PEEKC peek code
       #define UNGETC(c) ungetc code
       #define RETURN(ptr) return code
       #define ERROR(val) error code

       #include <regexp.h>

       char *compile(
               char *instring,
               char *expbuf,
               const char *endbuf,
               int eof );

       int step(
               const char *string,
               const char *expbuf );

       int advance(
               const char *string,
               const char *expbuf );

       extern char *loc1, *loc2, *locs;

       The following functions do not conform to current standards
       and are supported only for backward compatibility:

       char *compile_r(
               char *instring,
               char *expbuf,
               char *endbuf,
               int eof,
               struct regexp_data *regexp_data );

       int advance_r(
               char *string,
               char *expbuf,
               struct regexp_data *regexp_data );

       int step_r(
               char *string,
               char *expbuf,
               struct regexp_data *regexp_data );

STANDARDS

       Interfaces documented on this reference  page  conform  to
       industry standards as follows:

       advance(), compile(), step(): XSH4.2

       Refer to the standards(5) reference page for more information
       about industry standards and associated tags.

PARAMETERS

       c
              The value of the next character (byte) in the regular
              expression pattern. Returned by the next call to the
              GETC() and PEEKC() macros.

       ptr
              Specifies a pointer to the character following the
              last character of the compiled regular expression.

       val
              Specifies an error value.

       instring
              Specifies a string to be passed to the compile()
              function.

              The instring parameter is never used explicitly by
              the compile() function, but you can use it in your
              macros. For example, you may want to pass the
              string containing a pattern as the instring parameter
              to the compile() function and use the INIT()
              macro to set a pointer to the beginning of this
              string. When your macros do not use instring, call
              the compile() function with a value of ((char *) 0)
              for this parameter.

       expbuf
              Points to a character array where the compiled
              regular expression is stored.

       endbuf
              Points to the location that immediately follows the
              character array where the compiled regular expression
              is stored. When the compiled expression cannot
              be contained in (endbuf-expbuf) number of bytes, a
              call to the ERROR(_BIGREGEXP) macro is made (see
              the ERRORS section).

       eof
              Specifies the character that marks the end of the
              regular expression. For example, in ed this character
              is usually a / (slash).

       string
              Points to a null-terminated string of characters,
              in the step() function, to be searched for a match.

       regexp_data
              Is data for the compile_r(), step_r(), and
              advance_r() functions.

DESCRIPTION

       The compile(), advance(), and step()  functions  are  used
       for general-purpose expression matching.

       The  compile()  function takes a simple regular expression
       as input and produces a compiled expression  that  can  be
       used with the step() and advance() functions.

       The following six macros, used in the compile() function,
       must be defined before the #include <regexp.h> statement
       in programs. The GETC(), PEEKC(), and UNGETC() macros
       operate on the regular expression provided as input for
       the compile() function.

       INIT
              The INIT() macro is used for dependent declarations
              and initializations. In the regexp.h header file
              this macro is located right after the compile()
              function declarations and opening { (left brace).
              Your INIT() declarations must end with a ;
              (semicolon).

              The INIT() macro is frequently used to set a register
              variable to point to the beginning of the regular
              expression, so that this pointer can be used in
              declarations for GETC(), PEEKC(), and UNGETC().
              Alternatively, you can use INIT() to declare external
              variables that GETC(), PEEKC(), and UNGETC()
              need.

       GETC
              The GETC() macro returns the value of the next
              character (byte) in the regular-expression pattern.
              Successive calls to GETC() return successive
              characters of the regular expression.

       PEEKC
              The PEEKC() macro returns the next character (byte)
              in the regular expression. Immediate subsequent calls
              to this macro return the same byte, which is also
              the next character returned by the GETC() macro.

       UNGETC(c)
              The UNGETC() macro causes the c parameter to be
              returned by the next call to the GETC() and PEEKC()
              macros. No more than one character of pushback is
              ever needed because this character is guaranteed to
              be the last character read by the GETC() macro. The
              value of the UNGETC() macro is always ignored.

       RETURN(ptr)
              The RETURN() macro is used for normal exit of the
              compile() function. The value of the ptr parameter is
              a pointer to the character following the last
              character of the compiled regular expression. This is
              useful in programs that manage memory allocation.

       ERROR(val)
              The ERROR() macro is the abnormal return from the
              compile() function. A call to this macro should
              never return a value. In this macro, val is an
              error number, which is described in the ERRORS
              section of this reference page.
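
       The contract among the input macros can be illustrated
       without regexp.h itself. The following is only a sketch, not
       the Tru64 UNIX implementation: it assumes function-like,
       string-pointer versions of GETC(), PEEKC(), and UNGETC()
       modeled on the EXAMPLES section, and a hypothetical helper,
       pattern_length(), that consumes a pattern up to the eof
       delimiter in the way a compile() body reads its input.

```c
#include <stddef.h>

/* String-pointer macros in the style of the EXAMPLES section.
 * They assume a local variable named sp, which INIT declares.
 * The function-like form is an assumption of this sketch. */
#define INIT        const char *sp = instring;
#define GETC()      (*sp++)
#define PEEKC()     (*sp)
#define UNGETC(c)   (--sp)

/* Hypothetical helper (not part of regexp.h): counts the pattern
 * characters that precede the eof delimiter, reading input only
 * through the macros above. Returns (size_t)-1 if PEEKC() and
 * GETC() ever disagree. */
size_t pattern_length(const char *instring, int eof)
{
    INIT
    size_t n = 0;
    int c;

    for (;;) {
        /* PEEKC() must report the character the next GETC() returns. */
        int next = PEEKC();

        c = GETC();
        if (c != next)
            return (size_t)-1;
        if (c == eof || c == '\0') {
            (void)UNGETC(c);    /* push the delimiter back; value ignored */
            break;
        }
        n++;
    }
    return n;
}
```

       Under these assumptions, pattern_length("ab*c/rest", '/')
       counts the four pattern characters that precede the slash.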

       The step() function finds the first substring of the
       string parameter that matches the compiled expression
       pointed to by the expbuf parameter. When there is no
       match, the step() function returns a value of 0 (zero).
       When there is a match, the step() function returns a
       nonzero value and sets two global character pointers:
       loc1, which points to the first character of the substring
       that matches the pattern, and loc2, which points to the
       character immediately following the substring that matches
       the pattern. When the regular expression matches the
       entire string, loc1 points to the first character of the
       string parameter and loc2 points to the null character
       at the end of that string.

       The step() function uses the integer variable circf, which
       is set by the compile() function when the regular expression
       begins with a ^ (circumflex). When this variable is
       set, the step() function only tries to match the regular
       expression to the beginning of the string. When you compile
       more than one regular expression before executing the
       first one, save the value of circf for each compiled
       expression and set circf to the saved value before each
       call to step().

       The advance() function tests whether an initial substring
       of the string parameter matches the expression pointed to
       by the expbuf parameter. The step() function calls the
       advance() function with the same parameters that were
       passed to step(). The step() function increments a pointer
       through the characters of the string parameter and calls
       advance() until a nonzero value, which indicates a match,
       is returned, or until the end of the string is reached.
       To constrain the match to the beginning of the string
       unconditionally, call the advance() function directly
       instead of calling step().
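
       The relationship between step() and advance() can be
       sketched as follows. This is an illustration only, not the
       regexp.h source: toy_advance() is a hypothetical stand-in
       that matches literal text instead of a compiled expression,
       and toy_loc1, toy_loc2, and toy_circf are stand-ins for
       loc1, loc2, and circf.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for loc1, loc2, and circf; the real ones come from regexp.h. */
const char *toy_loc1, *toy_loc2;
int toy_circf;

/* Stand-in for advance(): succeeds only when the literal text in
 * pat is an initial substring of string. */
int toy_advance(const char *string, const char *pat)
{
    size_t n = strlen(pat);

    if (strncmp(string, pat, n) != 0)
        return 0;
    toy_loc2 = string + n;      /* first character after the match */
    return 1;
}

/* Stand-in for step(): try each start position in turn. */
int toy_step(const char *string, const char *pat)
{
    const char *p = string;

    if (toy_circf) {            /* anchored: only the first position counts */
        toy_loc1 = p;
        return toy_advance(p, pat);
    }
    do {
        if (toy_advance(p, pat)) {
            toy_loc1 = p;       /* first character of the match */
            return 1;
        }
    } while (*p++ != '\0');
    return 0;
}
```

       With toy_circf set, the sketch behaves like a direct call to
       the stand-in for advance(): only a match at the beginning of
       the string succeeds.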

       When the advance() function encounters an * (asterisk) or
       a \{\} sequence in the regular expression, it advances its
       pointer to the string to be matched as far as possible and
       recursively calls itself, trying to match the remainder of
       the regular expression. As long as there is no match, the
       advance() function backs up along the string until the
       function finds a match or reaches the point in the string
       where the initial match with the * or \{\} sequence
       occurred.

       It is sometimes desirable to stop this backing up before
       the initial pointer position in the string is reached.
       When the locs global character pointer is equal to the
       pointer position in the string at any time during the
       backing-up process, the advance() function breaks out of
       the recursive loop that backs up and returns the value 0
       (zero).
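
       The backing-up behavior and the effect of locs can be
       sketched with a small matcher. The code below is an
       assumption-laden illustration, not the regexp.h source:
       bt_advance() is a hypothetical stand-in that works on an
       uncompiled pattern made only of literal characters and c*
       items, and bt_locs is a stand-in for locs.

```c
#include <stddef.h>

/* Stand-in for the locs global; NULL means "no stop position". */
const char *bt_locs = NULL;

/* Stand-in for advance(), limited to literal characters and c*
 * (zero or more of the character c). Returns nonzero when an
 * initial substring of lp matches the pattern ep. */
int bt_advance(const char *lp, const char *ep)
{
    while (*ep != '\0') {
        if (ep[1] == '*') {
            const char *start = lp;     /* where the * began matching */

            while (*lp == ep[0])        /* advance as far as possible */
                lp++;
            for (;;) {                  /* then back up one at a time */
                if (lp == bt_locs)      /* stop position reached: give up */
                    return 0;
                if (bt_advance(lp, ep + 2))
                    return 1;
                if (lp == start)        /* back at the initial position */
                    return 0;
                lp--;
            }
        }
        if (*lp != *ep)                 /* literal character must match */
            return 0;
        lp++;
        ep++;
    }
    return 1;
}
```

       In this sketch, matching "a*ab" against "aaab" succeeds only
       after the matcher backs up by one character. Pointing
       bt_locs at that position makes the same call return 0.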

       The compile_r(), step_r(), and advance_r() functions are
       the reentrant versions of the compile(), step(), and
       advance() functions. They are supported in order to maintain
       backward compatibility with operating system versions
       prior to Tru64 UNIX Version 4.0.

       The regexp.h header file defines the regexp_data structure.

NOTES

       This interface has been deprecated in favor of the regcomp()
       interface specified by the POSIX and X/Open standards
       and may be retired. If possible, you should migrate
       regexp() regular expression routines to the routines
       offered under the regcomp() and regexec() interfaces (see
       regcomp(3)).
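
       As a rough guide to such a migration, the following sketch
       performs the equivalent of a compile() call followed by a
       step() call with the POSIX regcomp(), regexec(), and
       regfree() functions. The helper name find_match() and its
       return convention are assumptions made for this
       illustration; the start and end pointers play the roles of
       loc1 and loc2.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of any library): searches line for
 * the basic regular expression in pattern. On a match it returns 1
 * and stores the start and end of the matching substring, the
 * analogues of loc1 and loc2. Returns 0 when there is no match and
 * -1 when the pattern does not compile. */
int find_match(const char *pattern, const char *line,
               const char **start, const char **end)
{
    regex_t re;
    regmatch_t m[1];
    int rc;

    if (regcomp(&re, pattern, 0) != 0)  /* cflags 0 selects basic REs */
        return -1;
    rc = regexec(&re, line, 1, m, 0);   /* m[0] receives the whole match */
    regfree(&re);
    if (rc != 0)
        return 0;
    *start = line + m[0].rm_so;         /* first character of the match */
    *end = line + m[0].rm_eo;           /* character after the match */
    return 1;
}
```

       No macros need to be defined before <regex.h> is included,
       and a compile failure is reported through the regcomp()
       return value instead of through an ERROR() macro.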

       The regexp interface  is  provided  to  support  System  V
       applications.  Traditional  BSD applications use different
       functions  for  regular  expression  handling.   See   the
       re_comp(3) and re_exec(3) reference pages.

       The advance(), compile(), and step() functions are scheduled
       to be withdrawn from a future version of the X/Open
       CAE Specification.

RETURN VALUES

       Upon  successful  completion, the compile() function calls
       the RETURN() macro. Upon failure, this function calls  the
       ERROR() macro.

       Whenever   a  successful  match  occurs,  the  step()  and
       advance() functions return a nonzero value. Upon  failure,
       these functions return a value of 0 (zero).

       [Tru64  UNIX]  The  compile_r(), step_r(), and advance_r()
       functions return the same values  as  their  non-reentrant
       counterparts.

ERRORS

       If any of the following conditions occurs, the compile()
       or compile_r() functions call the ERROR() macro with an
       error value as its argument:

              The range endpoint is too large.

              A bad number was received.

              The number in \digit is out of range.

              There is an illegal or missing delimiter.

              There is no remembered search string.

              The use of a pair of \( and \) is unbalanced.

              There are too many \( and \) pairs (exceeds the
              maximum value set for _NBRA in regexp.h, usually 9).

              More than two numbers are given in the \{ and \}
              pair.

              A } character was expected after a \.

              The first number exceeds the second in the \{ and
              \} pair.

              There is a [ ] pair imbalance.

              There is a regular expression overflow.

              [Tru64 UNIX] There was an unknown error.

EXAMPLES

       The following is an example of the regular expression
       macros and calls from the grep command:

       #define INIT         register char *sp=instring;
       #define GETC         (*sp++)
       #define PEEKC        (*sp)
       #define UNGETC(c)    (--sp)
       #define RETURN(c)    return;
       #define ERROR(c)     regerr

       #include <regexp.h>
               . . .

       compile (patstr, expbuf, &expbuf[ESIZE], '\0');
               . . .

       if (step (linebuf, expbuf))
               succeed( );
               . . .

SEE ALSO

       Functions: ctype(3), fnmatch(3), glob(3), regcomp(3),
       re_comp(3)

       Commands: ed(1), sed(1), grep(1)

       Standards: standards(5)
