pcrs man page on DragonFly

PCRS(3)								       PCRS(3)

NAME
       pcrs - Perl-compatible regular substitution.

SYNOPSIS
       #include <pcrs.h>

       pcrs_job *pcrs_compile(const char *pattern,
	    const char *substitute, const char *options,
	    int *errptr);

       pcrs_job *pcrs_compile_command(const char *command,
	    int *errptr);

       int pcrs_execute(pcrs_job *job, char *subject,
	    int subject_length, char **result,
	    int *result_length);

       int pcrs_execute_list (pcrs_job *joblist, char *subject,
	    int subject_length, char **result,
	    int *result_length);

       pcrs_job *pcrs_free_job(pcrs_job *job);

       void pcrs_free_joblist(pcrs_job *joblist);

       char *pcrs_strerror(int err);

DESCRIPTION
       The PCRS library is a supplement to the PCRE(3) library that implements
       regular expression based substitution, as provided by the 's' operator
       of perl(1).  It uses the same syntax and semantics as Perl 5, with just
       a few differences (see below).

       In a first step, the information on a substitution, i.e.	 the  pattern,
       the  substitute	and  the  options  are compiled from Perl syntax to an
       internal form called pcrs_job by using  either  the  pcrs_compile()  or
       pcrs_compile_command() functions.

       Once  the  job is compiled, it can be used on subjects, which are arbi‐
       trary memory  areas  containing	string	or  binary  data,  by  calling
       pcrs_execute().	Jobs can be chained to joblists and whole joblists can
       be applied to a subject using pcrs_execute_list().

       There are also convenience functions for freeing the jobs and for
       converting error codes to strings, namely pcrs_free_job(),
       pcrs_free_joblist() and pcrs_strerror().

COMPILING JOBS
       The function pcrs_compile() is called to compile a pcrs_job from a pat‐
       tern,  substitute and options string.  The resulting pcrs_job structure
       is dynamically allocated and it is the caller's responsibility to  call
       pcrs_free_job()	when it's no longer needed.

       pcrs_compile_command()  is a convenience wrapper function that parses a
       Perl command of the form s/pattern/substitute/[options] into its compo‐
       nents  and  then calls pcrs_compile(). As in Perl, you are not bound to
       the '/' character: Whatever follows the 's' will be used as the	delim‐
       iter.  Patterns or substitutes that contain the delimiter need to quote
       it: s/th\/is/th\/at/ will replace th/is by th/at	 and  can  be  written
       more simply as s|th/is|th/at|.
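
       The tokenizing step described above can be sketched as follows.  This
       is only an illustration of the delimiter and quoting rules, not the
       PCRS source: split_command() and its buffer-based interface are
       hypothetical names invented for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PCRS): split a command of the form
 * s<d>pattern<d>substitute<d>[options] into three caller-provided buffers
 * of bufsize bytes each.  A backslash in front of the delimiter is dropped
 * so that the delimiter becomes literal; every other backslash sequence is
 * copied unchanged.  Returns 0 on success and -1 on a syntax error or when
 * a buffer is too small.
 */
int split_command(const char *command, char *pattern, char *substitute,
                  char *options, size_t bufsize)
{
    char *fields[3];
    size_t used = 0;
    int field = 0;
    char delim;
    const char *p;

    fields[0] = pattern;
    fields[1] = substitute;
    fields[2] = options;

    if (command == NULL || bufsize == 0 ||
        command[0] != 's' || command[1] == '\0')
        return -1;

    delim = command[1];              /* whatever follows the 's' */

    for (p = command + 2; *p != '\0'; p++)
    {
        if (*p == delim)
        {
            if (field == 2)
                return -1;           /* more than three delimiters */
            fields[field][used] = '\0';
            field++;
            used = 0;
            continue;
        }
        if (*p == '\\' && p[1] != '\0')
        {
            if (p[1] != delim)
            {
                if (used + 1 >= bufsize)
                    return -1;
                fields[field][used++] = '\\';  /* keep other escapes */
            }
            p++;                     /* copy the quoted byte below */
        }
        if (used + 1 >= bufsize)
            return -1;
        fields[field][used++] = *p;
    }

    if (field != 2)
        return -1;                   /* fewer than three delimiters */
    fields[2][used] = '\0';
    return 0;
}
```

       With this sketch, both s/th\/is/th\/at/ and s|th/is|th/at| yield the
       pattern th/is and the substitute th/at.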

       pattern,	 substitute,  options  and  command  must be zero-terminated C
       strings. substitute and options may be NULL, in	which  case  they  are
       treated like the empty string.

   Return value and diagnostics
       On success, both functions return a pointer to the compiled job.  On
       failure, NULL is returned.  In that case, the pcrs error code is written
       to *errptr.

   Patterns
       For the syntax of the pattern, see the PCRE(3) manual page.

   Substitutes
       The  substitute	uses Perl syntax as documented in the perlre(1) manual
       page, with some exceptions:

       Most notably and evidently, since PCRS is not Perl, variable
       interpolation or Perl command substitution won't work.  Special
       variables that do get interpolated are:

       $1, $2, ..., $n
              As in Perl, these variables refer to what the nth capturing
              subpattern in the pattern matched.

       $& and $0
	      refer  to	 the whole match. Note that $0 is deprecated in recent
	      Perl versions and now refers to the program name.

       $+     refers to what the last capturing subpattern matched.

       $` and $' (backtick and tick)
	      refer to the areas of the subject before and  after  the	match,
	      respectively.   Note  that, like in Perl, the unmodified subject
	      is used, even if a global substitution previously matched.

       Perl 4 style references to subpattern matches of the form \1, \2, ...,
       which only exist in Perl 5 for backwards compatibility, are not
       supported.

       Also, since the substitute is a double-quoted string in Perl, you might
       expect  all  Perl syntax for special characters to apply. In fact, only
       the following are supported:

       \n     newline (0x0a)

       \r     carriage return (0x0d)

       \t     horizontal tab (0x09)

       \f     form feed (0x0c)

       \b     backspace (0x08)

       \a     alarm, bell (0x07)

       \e     escape (0x1b)

       \0     binary zero (0x00)
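
       The table above amounts to a small lookup.  A minimal sketch, assuming
       a hypothetical helper named decode_escape() that is not part of the
       PCRS API:

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of PCRS): translate the letter that follows
 * a backslash in the substitute into the byte it stands for.  Returns 1 and
 * stores the byte in *byte if the escape is one of the supported ones, and
 * returns 0 otherwise.
 */
int decode_escape(char letter, unsigned char *byte)
{
    static const char letters[] = { 'n', 'r', 't', 'f', 'b', 'a', 'e', '0' };
    static const unsigned char bytes[] =
        { 0x0a, 0x0d, 0x09, 0x0c, 0x08, 0x07, 0x1b, 0x00 };
    size_t i;

    for (i = 0; i < sizeof letters; i++)
    {
        if (letters[i] == letter)
        {
            *byte = bytes[i];
            return 1;
        }
    }
    return 0;   /* e.g. \x or \c, which are listed as unsupported */
}
```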

   Options
       The options gmisx are supported.  e is not, since it would require a
       Perl interpreter, and neither is o, because the pattern is explicitly
       compiled anyway.  Additionally, PCRS honors the options U and T.  Where
       PCRE options are mentioned below, refer to PCRE(3) for the subtle
       differences from Perl behaviour.

       g      Replace all instances of pattern in subject, not just the	 first
	      one.

       i      Match  the  pattern  without respect to case. This translates to
	      PCRE_CASELESS.

       m      Treat the subject as consisting of  multiple  lines,  i.e.   '^'
	      matches  immediately after, and '$' immediately before each new‐
	      line.  Translates to PCRE_MULTILINE.

       s      Treat the subject as consisting of one single  line,  i.e.   let
	      the scope of the '.' metacharacter include newlines.  Translates
	      to PCRE_DOTALL.

       x      Allow  extended  regular	expression  syntax  in	the   pattern,
	      enabling	whitespace  and	 comments in complex patterns.	Trans‐
	      lates to PCRE_EXTENDED.

       U      Switch the default behaviour of the '*' and '+'  quantifiers  to
	      ungreedy.	 Note that appending a '?' switches back to greedy(!).
	      The explicit in-pattern switches (?U)  and  (?-U)	 remain	 unaf‐
	      fected.  Translates to PCRE_UNGREEDY.

       T      Consider	the substitute trivial, i.e. do not interpret any ref‐
	      erences or special character escape sequences in the substitute.
	      Handy for large user-supplied substitutes, which would otherwise
	      have to be examined and properly quoted.

       Unsupported options are silently ignored.
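
       How an options string might be folded into a set of flags can be
       sketched like this.  The DEMO_* constants and parse_options() are
       stand-ins invented for this example; they are neither the PCRE_*
       option values nor PCRS internals.

```c
#include <stddef.h>

/* Hypothetical flag bits, one per option letter described above. */
enum demo_flag
{
    DEMO_GLOBAL    = 1 << 0,   /* g */
    DEMO_CASELESS  = 1 << 1,   /* i */
    DEMO_MULTILINE = 1 << 2,   /* m */
    DEMO_DOTALL    = 1 << 3,   /* s */
    DEMO_EXTENDED  = 1 << 4,   /* x */
    DEMO_UNGREEDY  = 1 << 5,   /* U */
    DEMO_TRIVIAL   = 1 << 6    /* T */
};

/*
 * Hypothetical helper: collect the flags for an options string.  A NULL
 * string is treated like the empty string, and letters that are not
 * supported (such as e and o) are silently ignored.
 */
unsigned int parse_options(const char *options)
{
    unsigned int flags = 0;

    if (options == NULL)
        return 0;

    for (; *options != '\0'; options++)
    {
        switch (*options)
        {
            case 'g': flags |= DEMO_GLOBAL;    break;
            case 'i': flags |= DEMO_CASELESS;  break;
            case 'm': flags |= DEMO_MULTILINE; break;
            case 's': flags |= DEMO_DOTALL;    break;
            case 'x': flags |= DEMO_EXTENDED;  break;
            case 'U': flags |= DEMO_UNGREEDY;  break;
            case 'T': flags |= DEMO_TRIVIAL;   break;
            default:                           break;
        }
    }
    return flags;
}
```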

EXECUTING JOBS
       Calling pcrs_execute() produces a modified copy of the subject, in
       which the first (or all, if the 'g' option was given when compiling the
       job) occurrence(s) of the job's pattern in the subject are replaced by
       the job's substitute.

       The  first  subject_length  bytes following subject are processed, so a
       subject_length that exceeds the actual subject is dangerous.  Note that
       for  zero-terminated  C	strings,  you  should  set  subject_length  to
       strlen(subject), so that the dollar metacharacter matches at the end of
       the  string,  not  after	 the  string-terminating null byte. For conve‐
       nience, an extra null byte is appended to the result so it can again be
       used as a string.

       The  subject  itself  is left untouched, and the *result is dynamically
       allocated, so it is the caller's responsibility to free() it when  it's
       no longer needed.

       The  result's  length  (excluding  the  extra  null byte) is written to
       *result_length.

       If the job matched, the PCRS_SUCCESS flag in job->flags is set.

   String subjects
       If your

   Return value and diagnostics
       On success, pcrs_execute() returns the number of substitutions that
       were made, which is limited to 0 or 1 for non-global searches.  On
       failure, a negative error code is returned and *result is set to NULL.

FREEING JOBS
       It is not sufficient to call free() on a pcrs_job, because it contains
       pointers to other dynamically allocated structures.  Use
       pcrs_free_job() instead.  It is safe to pass NULL pointers (or pointers
       to invalid pcrs_jobs that contain NULL pointers to dependent
       structures) to pcrs_free_job().

   Return value
       The value of the job's next pointer.

CHAINING JOBS
       PCRS supports to some extent the chaining of multiple  pcrs_job	struc‐
       tures by means of their next member.

       Chaining	 the  jobs is up to you, but once you have built a linked list
       of jobs, you can execute a whole joblist on a given subject by a single
       call  to	 pcrs_execute_list(),  which  will  sequentially  traverse the
       linked list until it reaches a NULL pointer,  and  call	pcrs_execute()
       for  each  job  it  encounters, feeding the result and result_length of
       each call into the next as the subject and subject_length.  As  in  the
       single  job  case,  the	original  subject  remains  untouched, but all
       interim results are of course free()d.  The return value is the
       accumulated number of matches for all jobs in the joblist.  Note that
       while this is handy, it reduces the diagnostic value of a negative
       return code, since you won't know which job failed.

       In  analogy,  you  can  free  all  jobs	in  a given joblist by calling
       pcrs_free_joblist().
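
       The traversal described above can be illustrated without the library.
       In the sketch below, demo_job, demo_execute() and demo_execute_list()
       are hypothetical stand-ins that merely replace single bytes; only the
       control flow (feeding each result into the next job, freeing interim
       results, summing the match counts) mirrors the description.

```c
#include <stdlib.h>

/* Stand-in for pcrs_job: replaces one byte value with another. */
typedef struct demo_job
{
    char from;
    char to;
    struct demo_job *next;
} demo_job;

/* Stand-in for pcrs_execute(): stores a modified, null-terminated copy of
 * the subject in *result and returns the number of replacements, or -1. */
static int demo_execute(const demo_job *job, const char *subject,
                        size_t subject_length, char **result,
                        size_t *result_length)
{
    size_t i;
    int hits = 0;
    char *copy = malloc(subject_length + 1);

    *result = NULL;
    if (copy == NULL)
        return -1;

    for (i = 0; i < subject_length; i++)
    {
        copy[i] = subject[i];
        if (subject[i] == job->from)
        {
            copy[i] = job->to;
            hits++;
        }
    }
    copy[subject_length] = '\0';
    *result = copy;
    *result_length = subject_length;
    return hits;
}

/* Walk the list until the NULL pointer, chaining results.  In this sketch
 * an empty list leaves *result at NULL. */
int demo_execute_list(const demo_job *joblist, const char *subject,
                      size_t subject_length, char **result,
                      size_t *result_length)
{
    const demo_job *job;
    char *interim = NULL;
    int total = 0;

    *result = NULL;
    *result_length = 0;

    for (job = joblist; job != NULL; job = job->next)
    {
        char *output;
        size_t output_length;
        int hits = demo_execute(job, subject, subject_length,
                                &output, &output_length);

        free(interim);           /* previous interim result is done with */
        interim = NULL;
        if (hits < 0)
            return hits;         /* which job failed is not reported */

        interim = output;
        subject = output;        /* feed the result into the next job */
        subject_length = output_length;
        total += hits;
    }

    if (interim != NULL)
    {
        *result = interim;       /* caller must free() this */
        *result_length = subject_length;
    }
    return total;
}
```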

QUOTING
       The quote character is (surprise!) '\'. It quotes the  delimiter	 in  a
       command,	 the  '$'  in  a substitute, and, of course, itself. Note that
       the '$' doesn't need to be quoted if it isn't followed by [0-9+'`&].
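
       As an illustration of these rules, the following sketch quotes an
       arbitrary string so that it is taken literally as a substitute.
       quote_substitute() is a hypothetical helper, not part of PCRS; where
       the T option is acceptable, it avoids the need for such quoting
       altogether.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PCRS): copy in to out, putting a
 * backslash in front of every backslash and in front of every '$' that is
 * followed by one of [0-9+'`&].  Returns 0 on success and -1 if out (of
 * outsize bytes) is too small.
 */
int quote_substitute(const char *in, char *out, size_t outsize)
{
    size_t used = 0;

    if (outsize == 0)
        return -1;

    for (; *in != '\0'; in++)
    {
        int special = (*in == '\\') ||
                      (*in == '$' && in[1] != '\0' &&
                       strchr("0123456789+'`&", in[1]) != NULL);
        size_t needed = special ? 2 : 1;

        if (used + needed >= outsize)    /* keep room for the final '\0' */
            return -1;
        if (special)
            out[used++] = '\\';
        out[used++] = *in;
    }
    out[used] = '\0';
    return 0;
}
```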

       For quoting in the pattern, please refer to PCRE(3).

DIAGNOSTICS
       When compiling a job via either the pcrs_compile() or the
       pcrs_compile_command() function, you know that something went wrong
       when a NULL pointer is returned.  In that case, or in the event of
       non-fatal warnings, the integer pointed to by errptr contains a nonzero
       error code, which is either a passed-through PCRE error code or one
       generated by PCRS.  Under normal circumstances, it can take the
       following values:

       PCRE_ERROR_NOMEMORY
	      While compiling the pattern, PCRE ran out of memory.

       PCRS_ERR_NOMEM
	      While compiling the job, PCRS ran out of memory.

       PCRS_ERR_CMDSYNTAX
	      pcrs_compile_command() didn't find four tokens while parsing the
	      command.

       PCRS_ERR_STUDY
              A PCRE error occurred while studying the compiled pattern.  Since
              pcre_study() only provides textual diagnostic information, the
              details are lost.

       PCRS_WARN_BADREF
	      The substitute contains a reference to  a	 capturing  subpattern
	      that has a higher index than the number of capturing subpatterns
	      in the pattern or that exceeds the current hard limit of 33 (See
	      LIMITATIONS below). As in Perl, this is non-fatal and results in
	      substitutions with the empty string.

       When executing jobs via pcrs_execute() or pcrs_execute_list(), a	 nega‐
       tive  return  code  indicates  an error. In that case, *result is NULL.
       Possible error codes are:

       PCRE_ERROR_NOMEMORY
              While matching the pattern, PCRE ran out of memory.  This can
              only happen if there are more than 33 backreferences in the
              pattern(!) and memory is too tight to extend storage for more.

       PCRS_ERR_NOMEM
	      While executing the job, PCRS ran out of memory.

       PCRS_ERR_BADJOB
              The pcrs_job * passed to pcrs_execute() was NULL, or the job is
              bogus (it contains NULL pointers to the compiled pattern, extra,
              or substitute).

       If you see any other PCRE error	code  passed  through,	you've	either
       messed with the compiled job or found a bug in PCRS.  Please send me an
       email.

       Ah, and don't look for PCRE_ERROR_NOMATCH, since this is not  an	 error
       in the context of PCRS.	Should there be no match, an exact copy of the
       subject is found at *result and the return code is 0 (matches).

       All error codes can be translated into human readable text by means  of
       the pcrs_strerror() function.

EXAMPLE
       A trivial command-line test program for PCRS might look like:

       #include <pcrs.h>
       #include <stdio.h>
       #include <stdlib.h>   /* free() */
       #include <string.h>   /* strlen() */

       int main(int Argc, char **Argv)
       {
          pcrs_job *job;
          char *result;
          size_t newsize;
          int err;

          if (Argc != 3)
          {
             fprintf(stderr, "Usage: %s s/pattern/substitute/[options] subject\n", Argv[0]);
             return 1;
          }

          /* Compile the command; on failure the error code is in err. */
          if (NULL == (job = pcrs_compile_command(Argv[1], &err)))
          {
             fprintf(stderr, "%s: compile error:  %s (%d).\n", Argv[0], pcrs_strerror(err), err);
             return 1;
          }

          /* Apply the job; a negative return value is an error code. */
          if (0 > (err = pcrs_execute(job, Argv[2], strlen(Argv[2]), &result, &newsize)))
          {
             fprintf(stderr, "%s: exec error:  %s (%d).\n", Argv[0], pcrs_strerror(err), err);
          }
          else
          {
             printf("Result: *%s*\n", result);
             free(result);
          }

          pcrs_free_job(job);
          return(err < 0);
       }

LIMITATIONS
       The number of matches that a global job can have is only limited by the
       available memory. An initial storage for 40 matches is reserved,	 which
       is dynamically resized by the factor 1.6 whenever it is exhausted.
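
       The growth policy can be sketched as follows.  demo_match_capacity()
       and DEMO_MATCH_INIT are hypothetical names for this example, and the
       factor 1.6 is expressed as the integer ratio 8/5, which is an
       assumption about rounding rather than a statement about the PCRS
       source.

```c
#include <stddef.h>

#define DEMO_MATCH_INIT 40   /* initial number of match slots */

/*
 * Hypothetical helper: return the storage capacity (in matches) that is
 * reached once room for at least `needed` matches is available, starting
 * from the initial reservation and growing by 8/5 (= 1.6) each time the
 * storage is exhausted.  Results are truncated to whole slots.
 */
size_t demo_match_capacity(size_t needed)
{
    size_t capacity = DEMO_MATCH_INIT;

    while (capacity < needed)
        capacity = capacity * 8 / 5;

    return capacity;
}
```

       Under these assumptions the capacity steps through 40, 64, 102, 163
       and so on.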

       The number of capturing subpatterns is currently limited to 33, which
       is a Bad Thing[tm].  It should be dynamically expanded until it reaches
       the PCRE limit of 99.  This limitation is particularly embarrassing
       since PCRE 3.5 has raised the capturing subpattern limit to 65K.

       All of the above values can be adjusted in the  "Capacity"  section  of
       pcrs.h.

       The  Perl-style escape sequences for special characters \nnn, \xnn, and
       \cX are currently unsupported.

BUGS
       This library has only been tested in the context of one application and
       should be considered high risk.

HISTORY
       PCRS    was    originally    written    for    the    Privoxy   project
       (http://www.privoxy.org/).

SEE ALSO
       PCRE(3), perl(1), perlre(1)

AUTHOR
       PCRS is Copyright 2000 - 2003 by	 Andreas  Oesterhelt  <andreas@oester‐
       helt.org>  and  is  licensed  under the terms of the GNU Lesser General
       Public License (LGPL), version 2.1, which should be  included  in  this
       distribution,  with  the	 exception that the permission to replace that
       license with the GNU General Public License (GPL) given in section 3 is
       restricted to version 2 of the GPL.

       If  it is missing from this distribution, the LGPL can be obtained from
       http://www.gnu.org/licenses/lgpl.html or by mail:  Write	 to  the  Free
       Software	 Foundation,  Inc.,  59	 Temple	 Place - Suite 330, Boston, MA
       02111-1307, USA.

pcrs-0.0.3			2 December 2003			       PCRS(3)