perlsub - IRIX

· Home

+ man pages

-> Linux

-> FreeBSD

-> OpenBSD

-> NetBSD

-> Tru64 Unix

-> HP-UX 11i

-> IRIX

· Linux HOWTOs

· FreeBSD Tips

· *niX Forums

man pages->IRIX man pages -> perlsub (1)


PERLSUB(1)							    PERLSUB(1)

NAME [Toc] [Back]

     perlsub - Perl subroutines

SYNOPSIS [Toc] [Back]

     To	declare	subroutines:

	 sub NAME;	       # A "forward" declaration.
	 sub NAME(PROTO);      #  ditto, but with prototypes

	 sub NAME BLOCK	       # A declaration and a definition.
	 sub NAME(PROTO) BLOCK #  ditto, but with prototypes

     To	define an anonymous subroutine at runtime:

	 $subref = sub BLOCK;

     To	import subroutines:

	 use PACKAGE qw(NAME1 NAME2 NAME3);

     To	call subroutines:

	 NAME(LIST);	# & is optional	with parentheses.
	 NAME LIST;	# Parentheses optional if predeclared/imported.
	 &NAME;		# Passes current @_ to subroutine.

DESCRIPTION [Toc] [Back]

     Like many languages, Perl provides	for user-defined subroutines.  These
     may be located anywhere in	the main program, loaded in from other files
     via the do, require, or use keywords, or even generated on	the fly	using
     eval or anonymous subroutines (closures).	You can	even call a function
     indirectly	using a	variable containing its	name or	a CODE reference to
     it, as in $var = \&function.

     The Perl model for	function call and return values	is simple: all
     functions are passed as parameters	one single flat	list of	scalars, and
     all functions likewise return to their caller one single flat list	of
     scalars.  Any arrays or hashes in these call and return lists will
     collapse, losing their identities--but you	may always use pass-byreference
 instead to avoid	this.  Both call and return lists may contain
     as	many or	as few scalar elements as you'd	like.  (Often a	function
     without an	explicit return	statement is called a subroutine, but there's
     really no difference from the language's perspective.)

     Any arguments passed to the routine come in as the	array @_.  Thus	if you
     called a function with two	arguments, those would be stored in $_[0] and
     $_[1].  The array @_ is a local array, but	its elements are aliases for
     the actual	scalar parameters.  In particular, if an element $_[0] is
     updated, the corresponding	argument is updated (or	an error occurs	if it
     is	not updatable).	 If an argument	is an array or hash element which did
     not exist when the	function was called, that element is created only when



									Page 1






PERLSUB(1)							    PERLSUB(1)



     (and if) it is modified or	if a reference to it is	taken.	(Some earlier
     versions of Perl created the element whether or not it was	assigned to.)
     Note that assigning to the	whole array @_ removes the aliasing, and does
     not update	any arguments.

     The return	value of the subroutine	is the value of	the last expression
     evaluated.	 Alternatively,	a return statement may be used to exit the
     subroutine, optionally specifying the returned value, which will be
     evaluated in the appropriate context (list, scalar, or void) depending on
     the context of the	subroutine call.  If you specify no return value, the
     subroutine	will return an empty list in a list context, an	undefined
     value in a	scalar context,	or nothing in a	void context.  If you return
     one or more arrays	and/or hashes, these will be flattened together	into
     one large indistinguishable list.

     Perl does not have	named formal parameters, but in	practice all you do is
     assign to a my() list of these.  Any variables you	use in the function
     that aren't declared private are global variables.	 For the gory details
     on	creating private variables, see	the section on Private Variables via
     my() and the section on Temporary Values via local().  To create
     protected environments for	a set of functions in a	separate package (and
     probably a	separate file),	see the	section	on Packages in the perlmod
     manpage.

     Example:

	 sub max {
	     my	$max = shift(@_);
	     foreach $foo (@_) {
		 $max =	$foo if	$max < $foo;
	     }
	     return $max;
	 }
	 $bestday = max($mon,$tue,$wed,$thu,$fri);

     Example:

	 # get a line, combining continuation lines
	 #  that start with whitespace

	 sub get_line {
	     $thisline = $lookahead;  #	GLOBAL VARIABLES!!
	     LINE: while (defined($lookahead = <STDIN>)) {
		 if ($lookahead	=~ /^[ \t]/) {
		     $thisline .= $lookahead;
		 }
		 else {
		     last LINE;
		 }
	     }
	     $thisline;
	 }



									Page 2






PERLSUB(1)							    PERLSUB(1)



	 $lookahead = <STDIN>;	     # get first line
	 while ($_ = get_line()) {
	     ...
	 }

     Use array assignment to a local list to name your formal arguments:

	 sub maybeset {
	     my($key, $value) =	@_;
	     $Foo{$key}	= $value unless	$Foo{$key};
	 }

     This also has the effect of turning call-by-reference into	call-by-value,
     because the assignment copies the values.	Otherwise a function is	free
     to	do in-place modifications of @_	and change its caller's	values.

	 upcase_in($v1,	$v2);  # this changes $v1 and $v2
	 sub upcase_in {
	     for (@_) {	tr/a-z/A-Z/ }
	 }

     You aren't	allowed	to modify constants in this way, of course.  If	an
     argument were actually literal and	you tried to change it,	you'd take a
     (presumably fatal)	exception.   For example, this won't work:

	 upcase_in("frederick");

     It	would be much safer if the upcase_in() function	were written to	return
     a copy of its parameters instead of changing them in place:

	 ($v3, $v4) = upcase($v1, $v2);	 # this	doesn't
	 sub upcase {
	     return unless defined wantarray;  # void context, do nothing
	     my	@parms = @_;
	     for (@parms) { tr/a-z/A-Z/	}
	     return wantarray ?	@parms : $parms[0];
	 }

     Notice how	this (unprototyped) function doesn't care whether it was
     passed real scalars or arrays.  Perl will see everything as one big long
     flat @_ parameter list.  This is one of the ways where Perl's simple
     argument-passing style shines.  The upcase() function would work
     perfectly well without changing the upcase() definition even if we	fed it
     things like this:

	 @newlist   = upcase(@list1, @list2);
	 @newlist   = upcase( split /:/, $var );

     Do	not, however, be tempted to do this:






									Page 3






PERLSUB(1)							    PERLSUB(1)



	 (@a, @b)   = upcase(@list1, @list2);

     Because like its flat incoming parameter list, the	return list is also
     flat.  So all you have managed to do here is stored everything in @a and
     made @b an	empty list.  See the section on	/"Pass by Reference for
     alternatives.

     A subroutine may be called	using the "&" prefix.  The "&" is optional in
     modern Perls, and so are the parentheses if the subroutine	has been
     predeclared.  (Note, however, that	the "&"	is NOT optional	when you're
     just naming the subroutine, such as when it's used	as an argument to
     defined() or undef().  Nor	is it optional when you	want to	do an indirect
     subroutine	call with a subroutine name or reference using the &$subref()
     or	&{$subref}() constructs.  See the perlref manpage for more on that.)

     Subroutines may be	called recursively.  If	a subroutine is	called using
     the "&" form, the argument	list is	optional, and if omitted, no @_	array
     is	set up for the subroutine: the @_ array	at the time of the call	is
     visible to	subroutine instead.  This is an	efficiency mechanism that new
     users may wish to avoid.

	 &foo(1,2,3);	     # pass three arguments
	 foo(1,2,3);	     # the same

	 foo();		     # pass a null list
	 &foo();	     # the same

	 &foo;		     # foo() get current args, like foo(@_) !!
	 foo;		     # like foo() IFF sub foo predeclared, else	"foo"

     Not only does the "&" form	make the argument list optional, but it	also
     disables any prototype checking on	the arguments you do provide.  This is
     partly for	historical reasons, and	partly for having a convenient way to
     cheat if you know what you're doing.  See the section on Prototypes
     below.

     Private Variables via my()

     Synopsis:

	 my $foo;	     # declare $foo lexically local
	 my (@wid, %get);    # declare list of variables local
	 my $foo = "flurp";  # declare $foo lexical, and init it
	 my @oof = @bar;     # declare @oof lexical, and init it

     A "my" declares the listed	variables to be	confined (lexically) to	the
     enclosing block, conditional (if/unless/elsif/else), loop
     (for/foreach/while/until/continue), subroutine, eval, or do/require/use'd
     file.  If more than one value is listed, the list must be placed in
     parentheses.  All listed elements must be legal lvalues.  Only
     alphanumeric identifiers may be lexically scoped--magical builtins	like
     $/	must currently be localized with "local" instead.



									Page 4






PERLSUB(1)							    PERLSUB(1)



     Unlike dynamic variables created by the "local" statement,	lexical
     variables declared	with "my" are totally hidden from the outside world,
     including any called subroutines (even if it's the	same subroutine	called
     from itself or elsewhere--every call gets its own copy).

     (An eval(), however, can see the lexical variables	of the scope it	is
     being evaluated in	so long	as the names aren't hidden by declarations
     within the	eval() itself.	See the	perlref	manpage.)

     The parameter list	to my()	may be assigned	to if desired, which allows
     you to initialize your variables.	(If no initializer is given for	a
     particular	variable, it is	created	with the undefined value.)  Commonly
     this is used to name the parameters to a subroutine.  Examples:

	 $arg =	"fred";	       # "global" variable
	 $n = cube_root(27);
	 print "$arg thinks the	root is	$n\n";
      fred thinks the root is 3

	 sub cube_root {
	     my	$arg = shift;  # name doesn't matter
	     $arg **= 1/3;
	     return $arg;
	 }

     The "my" is simply	a modifier on something	you might assign to.  So when
     you do assign to the variables in its argument list, the "my" doesn't
     change whether those variables is viewed as a scalar or an	array.	So

	 my ($foo) = <STDIN>;
	 my @FOO = <STDIN>;

     both supply a list	context	to the right-hand side,	while

	 my $foo = <STDIN>;

     supplies a	scalar context.	 But the following declares only one variable:

	 my $foo, $bar = 1;

     That has the same effect as

	 my $foo;
	 $bar =	1;

     The declared variable is not introduced (is not visible) until after the
     current statement.	 Thus,

	 my $x = $x;

     can be used to initialize the new $x with the value of the	old $x,	and
     the expression



									Page 5






PERLSUB(1)							    PERLSUB(1)



	 my $x = 123 and $x == 123

     is	false unless the old $x	happened to have the value 123.

     Lexical scopes of control structures are not bounded precisely by the
     braces that delimit their controlled blocks; control expressions are part
     of	the scope, too.	 Thus in the loop

	 while (defined(my $line = <>))	{
	     $line = lc	$line;
	 } continue {
	     print $line;
	 }

     the scope of $line	extends	from its declaration throughout	the rest of
     the loop construct	(including the continue	clause), but not beyond	it.
     Similarly,	in the conditional

	 if ((my $answer = <STDIN>) =~ /^yes$/i) {
	     user_agrees();
	 } elsif ($answer =~ /^no$/i) {
	     user_disagrees();
	 } else	{
	     chomp $answer;
	     die "'$answer' is neither 'yes' nor 'no'";
	 }

     the scope of $answer extends from its declaration throughout the rest of
     the conditional (including	elsif and else clauses,	if any), but not
     beyond it.

     (None of the foregoing applies to if/unless or while/until	modifiers
     appended to simple	statements.  Such modifiers are	not control structures
     and have no effect	on scoping.)

     The foreach loop defaults to scoping its index variable dynamically (in
     the manner	of local; see below).  However,	if the index variable is
     prefixed with the keyword "my", then it is	lexically scoped instead.
     Thus in the loop

	 for my	$i (1, 2, 3) {
	     some_function();
	 }

     the scope of $i extends to	the end	of the loop, but not beyond it,	and so
     the value of $i is	unavailable in some_function().

     Some users	may wish to encourage the use of lexically scoped variables.
     As	an aid to catching implicit references to package variables, if	you
     say





									Page 6






PERLSUB(1)							    PERLSUB(1)



	 use strict 'vars';

     then any variable reference from there to the end of the enclosing	block
     must either refer to a lexical variable, or must be fully qualified with
     the package name.	A compilation error results otherwise.	An inner block
     may countermand this with "no strict 'vars'".

     A my() has	both a compile-time and	a run-time effect.  At compile time,
     the compiler takes	notice of it; the principle usefulness of this is to
     quiet use strict 'vars'.  The actual initialization is delayed until run
     time, so it gets executed appropriately; every time through a loop, for
     example.

     Variables declared	with "my" are not part of any package and are
     therefore never fully qualified with the package name.  In	particular,
     you're not	allowed	to try to make a package variable (or other global)
     lexical:

	 my $pack::var;	     # ERROR!  Illegal syntax
	 my $_;		     # also illegal (currently)

     In	fact, a	dynamic	variable (also known as	package	or global variables)
     are still accessible using	the fully qualified :: notation	even while a
     lexical of	the same name is also visible:

	 package main;
	 local $x = 10;
	 my    $x = 20;
	 print "$x and $::x\n";

     That will print out 20 and	10.

     You may declare "my" variables at the outermost scope of a	file to	hide
     any such identifiers totally from the outside world.  This	is similar to
     C's static	variables at the file level.  To do this with a	subroutine
     requires the use of a closure (anonymous function).  If a block (such as
     an	eval(),	function, or package) wants to create a	private	subroutine
     that cannot be called from	outside	that block, it can declare a lexical
     variable containing an anonymous sub reference:

	 my $secret_version = '1.001-beta';
	 my $secret_sub	= sub {	print $secret_version };
	 &$secret_sub();

     As	long as	the reference is never returned	by any function	within the
     module, no	outside	module can see the subroutine, because its name	is not
     in	any package's symbol table.  Remember that it's	not REALLY called
     $some_pack::secret_version	or anything; it's just $secret_version,
     unqualified and unqualifiable.






									Page 7






PERLSUB(1)							    PERLSUB(1)



     This does not work	with object methods, however; all object methods have
     to	be in the symbol table of some package to be found.

     Just because the lexical variable is lexically (also called statically)
     scoped doesn't mean that within a function	it works like a	C static.  It
     normally works more like a	C auto.	 But here's a mechanism	for giving a
     function private variables	with both lexical scoping and a	static
     lifetime.	If you do want to create something like	C's static variables,
     just enclose the whole function in	an extra block,	and put	the static
     variable outside the function but in the block.

	 {
	     my	$secret_val = 0;
	     sub gimme_another {
		 return	++$secret_val;
	     }
	 }
	 # $secret_val now becomes unreachable by the outside
	 # world, but retains its value	between	calls to gimme_another

     If	this function is being sourced in from a separate file via require or
     use, then this is probably	just fine.  If it's all	in the main program,
     you'll need to arrange for	the my() to be executed	early, either by
     putting the whole block above your	main program, or more likely, placing
     merely a BEGIN sub	around it to make sure it gets executed	before your
     program starts to run:

	 sub BEGIN {
	     my	$secret_val = 0;
	     sub gimme_another {
		 return	++$secret_val;
	     }
	 }

     See the perlrun manpage about the BEGIN function.

     Temporary Values via local()

     NOTE: In general, you should be using "my"	instead	of "local", because
     it's faster and safer.  Exceptions	to this	include	the global punctuation
     variables,	filehandles and	formats, and direct manipulation of the	Perl
     symbol table itself.  Format variables often use "local" though, as do
     other variables whose current value must be visible to called
     subroutines.

     Synopsis:

	 local $foo;		     # declare $foo dynamically	local
	 local (@wid, %get);	     # declare list of variables local
	 local $foo = "flurp";	     # declare $foo dynamic, and init it
	 local @oof = @bar;	     # declare @oof dynamic, and init it




									Page 8






PERLSUB(1)							    PERLSUB(1)



	 local *FH;		     # localize	$FH, @FH, %FH, &FH  ...
	 local *merlyn = *randal;    # now $merlyn is really $randal, plus
				     #	   @merlyn is really @randal, etc
	 local *merlyn = 'randal';   # SAME THING: promote 'randal' to *randal
	 local *merlyn = \$randal;   # just alias $merlyn, not @merlyn etc

     A local() modifies	its listed variables to	be local to the	enclosing
     block, (or	subroutine, eval{}, or do) and any called from within that
     block.  A local() just gives temporary values to global (meaning package)
     variables.	 This is known as dynamic scoping.  Lexical scoping is done
     with "my",	which works more like C's auto declarations.

     If	more than one variable is given	to local(), they must be placed	in
     parentheses.  All listed elements must be legal lvalues.  This operator
     works by saving the current values	of those variables in its argument
     list on a hidden stack and	restoring them upon exiting the	block,
     subroutine, or eval.  This	means that called subroutines can also
     reference the local variable, but not the global one.  The	argument list
     may be assigned to	if desired, which allows you to	initialize your	local
     variables.	 (If no	initializer is given for a particular variable,	it is
     created with an undefined value.)	Commonly this is used to name the
     parameters	to a subroutine.  Examples:

	 for $i	( 0 .. 9 ) {
	     $digits{$i} = $i;
	 }
	 # assume this function	uses global %digits hash
	 parse_num();

	 # now temporarily add to %digits hash
	 if ($base12) {
	     # (NOTE: not claiming this	is efficient!)
	     local %digits  = (%digits,	't' => 10, 'e' => 11);
	     parse_num();  # parse_num gets this new %digits!
	 }
	 # old %digits restored	here

     Because local() is	a run-time command, it gets executed every time
     through a loop.  In releases of Perl previous to 5.0, this	used more
     stack storage each	time until the loop was	exited.	 Perl now reclaims the
     space each	time through, but it's still more efficient to declare your
     variables outside the loop.

     A local is	simply a modifier on an	lvalue expression.  When you assign to
     a localized variable, the local doesn't change whether its	list is	viewed
     as	a scalar or an array.  So

	 local($foo) = <STDIN>;
	 local @FOO = <STDIN>;

     both supply a list	context	to the right-hand side,	while




									Page 9






PERLSUB(1)							    PERLSUB(1)



	 local $foo = <STDIN>;

     supplies a	scalar context.

     A note about local() and composite	types is in order.  Something like
     local(%foo) works by temporarily placing a	brand new hash in the symbol
     table.  The old hash is left alone, but is	hidden "behind"	the new	one.

     This means	the old	variable is completely invisible via the symbol	table
     (i.e. the hash entry in the *foo typeglob)	for the	duration of the
     dynamic scope within which	the local() was	seen.  This has	the effect of
     allowing one to temporarily occlude any magic on composite	types.	For
     instance, this will briefly alter a tied hash to some other
     implementation:

	 tie %ahash, 'APackage';
	 [...]
	 {
	    local %ahash;
	    tie	%ahash,	'BPackage';
	    [..called code will	see %ahash tied	to 'BPackage'..]
	    {
	       local %ahash;
	       [..%ahash is a normal (untied) hash here..]
	    }
	 }
	 [..%ahash back	to its initial tied self again..]

     As	another	example, a custom implementation of %ENV might look like this:

	 {
	     local %ENV;
	     tie %ENV, 'MyOwnEnv';
	     [..do your	own fancy %ENV manipulation here..]
	 }
	 [..normal %ENV	behavior here..]


     Passing Symbol Table Entries (typeglobs)    [Toc]    [Back]

     [Note:  The mechanism described in	this section was originally the	only
     way to simulate pass-by-reference in older	versions of Perl.  While it
     still works fine in modern	versions, the new reference mechanism is
     generally easier to work with.  See below.]

     Sometimes you don't want to pass the value	of an array to a subroutine
     but rather	the name of it,	so that	the subroutine can modify the global
     copy of it	rather than working with a local copy.	In perl	you can	refer
     to	all objects of a particular name by prefixing the name with a star:
     *foo.  This is often known	as a "typeglob", because the star on the front
     can be thought of as a wildcard match for all the funny prefix characters
     on	variables and subroutines and such.



								       Page 10






PERLSUB(1)							    PERLSUB(1)



     When evaluated, the typeglob produces a scalar value that represents all
     the objects of that name, including any filehandle, format, or
     subroutine.  When assigned	to, it causes the name mentioned to refer to
     whatever "*" value	was assigned to	it.  Example:

	 sub doubleary {
	     local(*someary) = @_;
	     foreach $elem (@someary) {
		 $elem *= 2;
	     }
	 }
	 doubleary(*foo);
	 doubleary(*bar);

     Note that scalars are already passed by reference,	so you can modify
     scalar arguments without using this mechanism by referring	explicitly to
     $_[0] etc.	 You can modify	all the	elements of an array by	passing	all
     the elements as scalars, but you have to use the *	mechanism (or the
     equivalent	reference mechanism) to	push, pop, or change the size of an
     array.  It	will certainly be faster to pass the typeglob (or reference).

     Even if you don't want to modify an array,	this mechanism is useful for
     passing multiple arrays in	a single LIST, because normally	the LIST
     mechanism will merge all the array	values so that you can't extract out
     the individual arrays.  For more on typeglobs, see	the section on
     Typeglobs and Filehandles in the perldata manpage.

     Pass by Reference    [Toc]    [Back]

     If	you want to pass more than one array or	hash into a function--or
     return them from it--and have them	maintain their integrity, then you're
     going to have to use an explicit pass-by-reference.  Before you do	that,
     you need to understand references as detailed in the perlref manpage.
     This section may not make much sense to you otherwise.

     Here are a	few simple examples.  First, let's pass	in several arrays to a
     function and have it pop all of then, return a new	list of	all their
     former last elements:

	 @tailings = popmany ( \@a, \@b, \@c, \@d );

	 sub popmany {
	     my	$aref;
	     my	@retlist = ();
	     foreach $aref ( @_	) {
		 push @retlist,	pop @$aref;
	     }
	     return @retlist;
	 }

     Here's how	you might write	a function that	returns	a list of keys
     occurring in all the hashes passed	to it:



								       Page 11






PERLSUB(1)							    PERLSUB(1)



	 @common = inter( \%foo, \%bar,	\%joe );
	 sub inter {
	     my	($k, $href, %seen); # locals
	     foreach $href (@_)	{
		 while ( $k = each %$href ) {
		     $seen{$k}++;
		 }
	     }
	     return grep { $seen{$_} ==	@_ } keys %seen;
	 }

     So	far, we're using just the normal list return mechanism.	 What happens
     if	you want to pass or return a hash?  Well, if you're using only one of
     them, or you don't	mind them concatenating, then the normal calling
     convention	is ok, although	a little expensive.

     Where people get into trouble is here:

	 (@a, @b) = func(@c, @d);
     or
	 (%a, %b) = func(%c, %d);

     That syntax simply	won't work.  It	sets just @a or	%a and clears the @b
     or	%b.  Plus the function didn't get passed into two separate arrays or
     hashes: it	got one	long list in @_, as always.

     If	you can	arrange	for everyone to	deal with this through references,
     it's cleaner code,	although not so	nice to	look at.  Here's a function
     that takes	two array references as	arguments, returning the two array
     elements in order of how many elements they have in them:

	 ($aref, $bref)	= func(\@c, \@d);
	 print "@$aref has more	than @$bref\n";
	 sub func {
	     my	($cref,	$dref) = @_;
	     if	(@$cref	> @$dref) {
		 return	($cref,	$dref);
	     } else {
		 return	($dref,	$cref);
	     }
	 }

     It	turns out that you can actually	do this	also:












								       Page 12






PERLSUB(1)							    PERLSUB(1)



	 (*a, *b) = func(\@c, \@d);
	 print "@a has more than @b\n";
	 sub func {
	     local (*c,	*d) = @_;
	     if	(@c > @d) {
		 return	(\@c, \@d);
	     } else {
		 return	(\@d, \@c);
	     }
	 }

     Here we're	using the typeglobs to do symbol table aliasing.  It's a tad
     subtle, though, and also won't work if you're using my() variables,
     because only globals (well, and local()s) are in the symbol table.

     If	you're passing around filehandles, you could usually just use the bare
     typeglob, like *STDOUT, but typeglobs references would be better because
     they'll still work	properly under use strict 'refs'.  For example:

	 splutter(\*STDOUT);
	 sub splutter {
	     my	$fh = shift;
	     print $fh "her um well a hmmm\n";
	 }

	 $rec =	get_rec(\*STDIN);
	 sub get_rec {
	     my	$fh = shift;
	     return scalar <$fh>;
	 }

     Another way to do this is using *HANDLE{IO}, see the perlref manpage for
     usage and caveats.

     If	you're planning	on generating new filehandles, you could do this:

	 sub openit {
	     my	$name =	shift;
	     local *FH;
	     return open (FH, $path) ? *FH : undef;
	 }

     Although that will	actually produce a small memory	leak.  See the bottom
     of	the open() entry in the	perlfunc manpage for a somewhat	cleaner	way
     using the IO::Handle package.

     Prototypes    [Toc]    [Back]

     As	of the 5.002 release of	perl, if you declare






								       Page 13






PERLSUB(1)							    PERLSUB(1)



	 sub mypush (\@@)

     then mypush() takes arguments exactly like	push() does.  The declaration
     of	the function to	be called must be visible at compile time.  The
     prototype affects only the	interpretation of new-style calls to the
     function, where new-style is defined as not using the & character.	 In
     other words, if you call it like a	builtin	function, then it behaves like
     a builtin function.  If you call it like an old-fashioned subroutine,
     then it behaves like an old-fashioned subroutine.	It naturally falls out
     from this rule that prototypes have no influence on subroutine references
     like \&foo	or on indirect subroutine calls	like &{$subref}.

     Method calls are not influenced by	prototypes either, because the
     function to be called is indeterminate at compile time, because it
     depends on	inheritance.

     Because the intent	is primarily to	let you	define subroutines that	work
     like builtin commands, here are the prototypes for	some other functions
     that parse	almost exactly like the	corresponding builtins.

	 Declared as		     Called as

	 sub mylink ($$)	     mylink $old, $new
	 sub myvec ($$$)	     myvec $var, $offset, 1
	 sub myindex ($$;$)	     myindex &getstring, "substr"
	 sub mysyswrite	($$$;$)	     mysyswrite	$buf, 0, length($buf) -	$off, $off
	 sub myreverse (@)	     myreverse $a,$b,$c
	 sub myjoin ($@)	     myjoin ":",$a,$b,$c
	 sub mypop (\@)		     mypop @array
	 sub mysplice (\@$$@)	     mysplice @array,@array,0,@pushme
	 sub mykeys (\%)	     mykeys %{$hashref}
	 sub myopen (*;$)	     myopen HANDLE, $name
	 sub mypipe (**)	     mypipe READHANDLE,	WRITEHANDLE
	 sub mygrep (&@)	     mygrep { /foo/ } $a,$b,$c
	 sub myrand ($)		     myrand 42
	 sub mytime ()		     mytime

     Any backslashed prototype character represents an actual argument that
     absolutely	must start with	that character.	 The value passed to the
     subroutine	(as part of @_)	will be	a reference to the actual argument
     given in the subroutine call, obtained by applying	\ to that argument.

     Unbackslashed prototype characters	have special meanings.	Any
     unbackslashed @ or	% eats all the rest of the arguments, and forces list
     context.  An argument represented by $ forces scalar context.  An &
     requires an anonymous subroutine, which, if passed	as the first argument,
     does not require the "sub"	keyword	or a subsequent	comma.	A * does
     whatever it has to	do to turn the argument	into a reference to a symbol
     table entry.






								       Page 14






PERLSUB(1)							    PERLSUB(1)



     A semicolon separates mandatory arguments from optional arguments.	 (It
     is	redundant before @ or %.)

     Note how the last three examples above are	treated	specially by the
     parser.  mygrep() is parsed as a true list	operator, myrand() is parsed
     as	a true unary operator with unary precedence the	same as	rand(),	and
     mytime() is truly without arguments, just like time().  That is, if you
     say

	 mytime	+2;

     you'll get	mytime() + 2, not mytime(2), which is how it would be parsed
     without the prototype.

     The interesting thing about & is that you can generate new	syntax with
     it:

	 sub try (&@) {
	     my($try,$catch) = @_;
	     eval { &$try };
	     if	($@) {
		 local $_ = $@;
		 &$catch;
	     }
	 }
	 sub catch (&) { $_[0] }

	 try {
	     die "phooey";
	 } catch {
	     /phooey/ and print	"unphooey\n";
	 };

     That prints "unphooey".  (Yes, there are still unresolved issues having
     to	do with	the visibility of @_.  I'm ignoring that question for the
     moment.  (But note	that if	we make	@_ lexically scoped, those anonymous
     subroutines can act like closures... (Gee,	is this	sounding a little
     Lispish?  (Never mind.))))

     And here's	a reimplementation of grep:

	 sub mygrep (&@) {
	     my	$code =	shift;
	     my	@result;
	     foreach $_	(@_) {
		 push(@result, $_) if &$code;
	     }
	     @result;
	 }

     Some folks	would prefer full alphanumeric prototypes.  Alphanumerics have
     been intentionally	left out of prototypes for the express purpose of



								       Page 15






PERLSUB(1)							    PERLSUB(1)



     someday in	the future adding named, formal	parameters.  The current
     mechanism's main goal is to let module writers provide better diagnostics
     for module	users.	Larry feels the	notation quite understandable to Perl
     programmers, and that it will not intrude greatly upon the	meat of	the
     module, nor make it harder	to read.  The line noise is visually
     encapsulated into a small pill that's easy	to swallow.

     It's probably best	to prototype new functions, not	retrofit prototyping
     into older	ones.  That's because you must be especially careful about
     silent impositions	of differing list versus scalar	contexts.  For
     example, if you decide that a function should take	just one parameter,
     like this:

	 sub func ($) {
	     my	$n = shift;
	     print "you	gave me	$n\n";
	 }

     and someone has been calling it with an array or expression returning a
     list:

	 func(@foo);
	 func( split /:/ );

     Then you've just supplied an automatic scalar() in	front of their
     argument, which can be more than a	bit surprising.	 The old @foo which
     used to hold one thing doesn't get	passed in.  Instead, the func()	now
     gets passed in 1, that is,	the number of elements in @foo.	 And the
     split() gets called in a scalar context and starts	scribbling on your @_
     parameter list.

     This is all very powerful,	of course, and should be used only in
     moderation	to make	the world a better place.

     Constant Functions    [Toc]    [Back]

     Functions with a prototype	of () are potential candidates for inlining.
     If	the result after optimization and constant folding is either a
     constant or a lexically-scoped scalar which has no	other references, then
     it	will be	used in	place of function calls	made without & or do. Calls
     made using	& or do	are never inlined.  (See constant.pm for an easy way
     to	declare	most constants.)

     All of the	following functions would be inlined.

	 sub pi	()	     { 3.14159 }	     # Not exact, but close.
	 sub PI	()	     { 4 * atan2 1, 1 }	     # As good as it gets,
						     # and it's	inlined, too!
	 sub ST_DEV ()	     { 0 }
	 sub ST_INO ()	     { 1 }





								       Page 16






PERLSUB(1)							    PERLSUB(1)



	 sub FLAG_FOO ()     { 1 << 8 }
	 sub FLAG_BAR ()     { 1 << 9 }
	 sub FLAG_MASK ()    { FLAG_FOO	| FLAG_BAR }

	 sub OPT_BAZ ()	     { not (0x1B58 & FLAG_MASK)	}
	 sub BAZ_VAL ()	{
	     if	(OPT_BAZ) {
		 return	23;
	     }
	     else {
		 return	42;
	     }
	 }

	 sub N () { int(BAZ_VAL) / 3 }
	 BEGIN {
	     my	$prod =	1;
	     for (1..N)	{ $prod	*= $_ }
	     sub N_FACTORIAL ()	{ $prod	}
	 }

     If	you redefine a subroutine which	was eligible for inlining you'll get a
     mandatory warning.	 (You can use this warning to tell whether or not a
     particular	subroutine is considered constant.)  The warning is considered
     severe enough not to be optional because previously compiled invocations
     of	the function will still	be using the old value of the function.	 If
     you need to be able to redefine the subroutine you	need to	ensure that it
     isn't inlined, either by dropping the () prototype	(which changes the
     calling semantics,	so beware) or by thwarting the inlining	mechanism in
     some other	way, such as

	 sub not_inlined () {
	     23	if $];
	 }


     Overriding	Builtin	Functions

     Many builtin functions may	be overridden, though this should be tried
     only occasionally and for good reason.  Typically this might be done by a
     package attempting	to emulate missing builtin functionality on a non-Unix
     system.

     Overriding	may be done only by importing the name from a module--ordinary
     predeclaration isn't good enough.	However, the subs pragma (compiler
     directive)	lets you, in effect, predeclare	subs via the import syntax,
     and these names may then override the builtin ones:

	 use subs 'chdir', 'chroot', 'chmod', 'chown';
	 chdir $somewhere;
	 sub chdir { ... }




								       Page 17






PERLSUB(1)							    PERLSUB(1)



     To	unambiguously refer to the builtin form, one may precede the builtin
     name with the special package qualifier CORE::.  For example, saying
     CORE::open() will always refer to the builtin open(), even	if the current
     package has imported some other subroutine	called &open() from elsewhere.

     Library modules should not	in general export builtin names	like "open" or
     "chdir" as	part of	their default @EXPORT list, because these may sneak
     into someone else's namespace and change the semantics unexpectedly.
     Instead, if the module adds the name to the @EXPORT_OK list, then it's
     possible for a user to import the name explicitly,	but not	implicitly.
     That is, they could say

	 use Module 'open';

     and it would import the open override, but	if they	said

	 use Module;

     they would	get the	default	imports	without	the overrides.

     Note that such overriding is restricted to	the package that requests the
     import.  Some means of "globally" overriding builtins may become
     available in future.

     Autoloading    [Toc]    [Back]

     If	you call a subroutine that is undefined, you would ordinarily get an
     immediate fatal error complaining that the	subroutine doesn't exist.
     (Likewise for subroutines being used as methods, when the method doesn't
     exist in any of the base classes of the class package.) If, however,
     there is an AUTOLOAD subroutine defined in	the package or packages	that
     were searched for the original subroutine,	then that AUTOLOAD subroutine
     is	called with the	arguments that would have been passed to the original
     subroutine.  The fully qualified name of the original subroutine
     magically appears in the $AUTOLOAD	variable in the	same package as	the
     AUTOLOAD routine.	The name is not	passed as an ordinary argument
     because, er, well,	just because, that's why...

     Most AUTOLOAD routines will load in a definition for the subroutine in
     question using eval, and then execute that	subroutine using a special
     form of "goto" that erases	the stack frame	of the AUTOLOAD	routine
     without a trace.  (See the	standard AutoLoader module, for	example.)  But
     an	AUTOLOAD routine can also just emulate the routine and never define
     it.   For example,	let's pretend that a function that wasn't defined
     should just call system() with those arguments.  All you'd	do is this:










								       Page 18






PERLSUB(1)							    PERLSUB(1)



	 sub AUTOLOAD {
	     my	$program = $AUTOLOAD;
	     $program =~ s/.*:://;
	     system($program, @_);
	 }
	 date();
	 who('am', 'i');
	 ls('-l');

     In	fact, if you predeclare	the functions you want to call that way, you
     don't even	need the parentheses:

	 use subs qw(date who ls);
	 date;
	 who "am", "i";
	 ls -l;

     A more complete example of	this is	the standard Shell module, which can
     treat undefined subroutine	calls as calls to Unix programs.

     Mechanisms	are available for modules writers to help split	the modules up
     into autoloadable files.  See the standard	AutoLoader module described in
     the AutoLoader manpage and	in the AutoSplit manpage, the standard
     SelfLoader	modules	in the SelfLoader manpage, and the document on adding
     C functions to perl code in the perlxs manpage.

Contents

NAME [Toc] [Back]

SYNOPSIS [Toc] [Back]

DESCRIPTION [Toc] [Back]

SEE ALSO [Toc] [Back]