User-Defined Subroutines or Functions

Functions or subroutines are program units that are defined somewhere in your source code, supplied by another module, or built into Perl. They are written as self-contained blocks of code so that they can be used repeatedly from anywhere in the program whenever they are needed. Since all subroutines return something in Perl, there is no distinction between "subroutines" and "functions," as there may be in other languages. There is only this one subroutine construct (though you are welcome to call them functions, if you like).

print() and chomp() are built-in library functions that form part of the Perl language. They provide functionality that is frequently needed. Rather than making programmers write their own 'print' or 'chomp' routines, library functions have been packaged with the language to help with the most common programming tasks.

Nothing prevents programmers from writing their own versions of already existing functions, but it is usually preferable to use library functions. First, doing so saves not only development but also debugging time: library functions have been used by a wide audience in many circumstances, so any bugs are more likely to have been found and fixed. Furthermore, library functions are usually highly optimized and perform better than a home-grown function written on the fly.

Any block of code that is frequently repeated within a program may be isolated from the main code block and placed into a separate subroutine. Such subroutines are also called user-defined functions (the programmer is the user of the language). Since they are isolated from the main code, they can be reused within the same program whenever needed.

Some languages require that subroutines or functions appear in a specific location. Perl allows them to be defined anywhere. For easier code maintenance, you should group all of them together and place them consistently in one particular place. Some people put them at the top before the main code block, some put them at the end of the program.
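As a minimal sketch of this freedom of placement, the following script calls a subroutine before its definition appears in the file. The name greeting is hypothetical and used only for illustration; the keyword my, which makes a variable private, is explained in the Scope section.

```perl
use strict;
use warnings;

# Main code block: calls a subroutine that is defined further down.
my $message = greeting("world");
print "$message\n";

# Subroutine definitions are grouped at the end of the program.
# greeting() is a hypothetical helper used only for illustration.
sub greeting {
    my ($name) = @_;            # copy the first argument out of @_
    return "Hello, $name!";
}
```

This works because Perl compiles the whole file, including all subroutine definitions, before it starts executing the main code.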

Declaring Subroutines

Perl subroutines are declared with the keyword sub, followed by a subroutine name, an optional list in parentheses, and finally a statement block. The keyword associates the subroutine name with the block of code that follows. This block of code is executed only when the subroutine is called by name.

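A subroutine may or may not accept data as input. Each data item passed into a subroutine is called a parameter. Many programming languages require subroutine declarations to include a list of parameters in parentheses that declares the type of data that the subroutine is going to receive. Perl does not require this: the values always arrive in the array @_, whether or not anything is declared. The examples in this section nevertheless write a parameter list after the subroutine name for clarity, even when the subroutine will not receive any values. Be aware that Perl itself parses this parenthesized list as a prototype, which normally contains only type characters such as $ or @, so named entries draw an "Illegal character in prototype" warning when warnings are enabled. Perl 5.20 and later can instead treat the list as a true signature with named parameters if the signatures feature is turned on.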

If a parameter list is included, the declaration documents the general structure of the call. Names of variables included in the parameter list are only placeholders, however. The values passed into the subroutine must still be copied from @_ into new variables in the subroutine body if you do not want to change the values in the original variables. (See the section on passing parameters below.)

Example:

# subroutine with no parameters
sub subroutine_name() 
{
    [ statement block ]
}

# subroutine with parameter list
sub subroutine_name($parameter1, $parameter2, $parameter3, ...) 
{
    [ statement block ]
}

Calling Subroutines

A subroutine can be invoked or called from anywhere in a program or script. In this section, it is called by prefixing its name with an ampersand, &, and following it with a parenthesized list of the parameters the subroutine needs. A subroutine call passes control of the program to the subroutine, and the statement block associated with the subroutine is executed. When the last statement in the subroutine has been executed, the subroutine exits and program control returns to the point of the call. A subroutine also exits as soon as the interpreter encounters the keyword return. A subroutine that is called for the value it returns is often referred to as a function, although Perl itself makes no such distinction.

#subroutine definitions precede the main program block

sub print_subroutine() # subroutine with no parameters
{
    print "Hello, subroutine!\n";
}

sub print_parameter($var) # subroutine with one parameter
{
    $var = shift(@_); # assign parameter to variable
    
    print "Printing subroutine call no: $var\n";
}

# ------------ start main program block ------------ #

&print_subroutine();  # subroutine call

for ($i = 1; $i <= 10; ++$i)
{
   &print_parameter($i); # subroutine call passing a parameter
} 

&print_subroutine();  # subroutine call

If a function returns a value, the function call can be part of an assignment statement.

#function definitions precede the main program block

sub return_max_value($num_1, $num_2) # function with two parameters
{
    ($num_1, $num_2) = @_;
    
    if ($num_1 > $num_2) { return $num_1; }
    else                 { return $num_2; }
}

# --------- start main program --------- #

$returned = &return_max_value(25, 8);  # function call
print "$returned\n";                   # print returned value

Note: There are more ways to call subroutines or functions in Perl (see also Functions Calls, Details). The method described above is the most explicit and will always work, whereas other methods are 'shorthand' versions, so to speak.
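The following sketch compares the explicit ampersand form with two common shorthand forms. The function add is hypothetical and exists only for this comparison.

```perl
use strict;
use warnings;

# add() is a hypothetical function used only for illustration.
sub add {
    my ($x, $y) = @_;
    return $x + $y;
}

my $with_ampersand = &add(2, 3);   # explicit form used in this section
my $with_parens    = add(2, 3);    # ampersand omitted
my $without_parens = add 2, 3;     # parentheses omitted; add() must already be defined

print "$with_ampersand $with_parens $without_parens\n";   # prints "5 5 5"
```

The form without parentheses only parses correctly when the subroutine has been defined or declared earlier in the file, which is one reason the explicit form is the safest choice for beginners.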

Passing Parameters into a Subroutine

Subroutines may or may not accept data as input. A data item passed into the subroutine is called a parameter.

All subroutine parameters are passed to a subroutine as a single flat list of scalars. They come into the subroutine as the default parameter array, @_.

&mysubroutine("Hello", "world") will become 
@_ = ("Hello", "world"), an array of two elements.


@myArray = (1, 2, 3);
&mysubroutine("Hello", @myArray, "world") will become 
@_ = ("Hello", 1, 2, 3, "world"), a flattened array of five elements.

Hashes, too, become flat arrays:

%myHash = (California => "Sacramento", Wisconsin => "Madison");

&mysubroutine(%myHash) will become 
@_ = ("California", "Sacramento", "Wisconsin", "Madison"), a flattened array of four elements with no 
 distinction between keys and values except that each key is immediately followed by its value.
 Because a hash has no fixed order, the Wisconsin pair may just as well arrive first.
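The flattening described above can be checked with a short sketch. The helper count_args is hypothetical; it simply reports how many scalars arrived in @_.

```perl
use strict;
use warnings;

# count_args() is a hypothetical helper: it returns the number of elements in @_.
sub count_args {
    return scalar(@_);
}

my @myArray = (1, 2, 3);
my %myHash  = (California => "Sacramento", Wisconsin => "Madison");

print count_args("Hello", "world"), "\n";            # 2
print count_args("Hello", @myArray, "world"), "\n";  # 5
print count_args(%myHash), "\n";                     # 4
```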

All values are passed by reference: the elements of @_ are aliases for the original arguments, so any operation on $_[0], $_[1], and so on takes place on the original value defined before the subroutine call. If you want to keep the original value unchanged, you must assign it to a new variable in the subroutine and work on that copy.

sub increment($var)
{
  # $passed gets incremented
  print "Value in sub increment():                  \$passed = " . ++$_[0] . "\n"; 
}
# (Nb:  For "++" before a variable see reference page on Perl Operators under
# "auto-increment or decrement".)

sub increment_again($var)
{
  # $passed is assigned to local variable, 
  # which gets incremented instead of $passed
  $var = $_[0];
  print "Value in increment_again():                \$var    = " . ++$var . "\n";
}

# -------------- start main ----------------- #
$passed = 1;

print "\nAt start var has initial value:            \$passed = " . $passed . "\n";
&increment($passed);
print "Value returned from sub increment():       \$passed = " . $passed . "\n\n";
&increment_again($passed);
print "Value returned from increment_again():     \$passed = " . $passed . "\n\n";

# -------------- Output from above ----------------- #
At start var has initial value:            $passed = 1
Value in sub increment():                  $passed = 2
Value returned from sub increment():       $passed = 2

Value in increment_again():                $var    = 3
Value returned from increment_again():     $passed = 2

If you want to keep an array or hash intact rather than have it flattened into the @_ array, you can pass the array or hash to the subroutine as a reference. This is an advanced topic to read up on when you have more experience with Perl.
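For orientation only, here is a minimal sketch of what passing references looks like. The subroutine describe and its variable names are hypothetical; the details of references are left to a later reading.

```perl
use strict;
use warnings;

# describe() is a hypothetical subroutine that receives two references,
# so the array and the hash arrive as two separate scalars in @_.
sub describe {
    my ($array_ref, $hash_ref) = @_;
    my $item_count = scalar @{$array_ref};        # dereference the array
    my $key_count  = scalar keys %{$hash_ref};    # dereference the hash
    return "$item_count items, $key_count keys";
}

my @myArray = (1, 2, 3);
my %myHash  = (California => "Sacramento", Wisconsin => "Madison");

# The backslash creates a reference to each variable.
print describe(\@myArray, \%myHash), "\n";   # prints "3 items, 2 keys"
```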

Returning Values from a Function

A function may or may not return data. Data is usually returned with an explicit return statement followed by a list of data. These data constitute the return values of the function.

Just as parameters are passed into a function as a flat list of scalars, so are values returned. However, there is no default array to hold return values. Instead, the return values must be assigned to a variable, or used directly in an expression, in the statement that calls the function.

$returned = &my_function($param, $param);  # return value is a scalar
@returned = &my_function($param [, ... ]); # return value is an array

A function may or may not include an explicit return statement. If return is stated explicitly, the list of values following it is returned and the function exits immediately. If an explicit return is not included, the return value is the value of the last expression evaluated in the function.
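Both cases can be sketched as follows. The functions min_max and square are hypothetical examples, not part of Perl.

```perl
use strict;
use warnings;

# Explicit return of a list of two values.
sub min_max {
    my @sorted = sort { $a <=> $b } @_;   # numeric sort of all arguments
    return ($sorted[0], $sorted[-1]);
}

# No explicit return: the value of the last expression evaluated is returned.
sub square {
    my ($n) = @_;
    $n * $n;
}

my ($min, $max) = min_max(3, 9, 1);   # list assignment receives both values
my $squared     = square(4);

print "min=$min max=$max squared=$squared\n";   # prints "min=1 max=9 squared=16"
```

An explicit return is generally easier to read, since it makes clear which value the function is meant to hand back.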

#function definitions precede the main program block

sub return_max_value($num_1, $num_2) # function with two parameters
{
    ($num_1, $num_2) = @_;
    
    if ($num_1 > $num_2) { return $num_1; }
    else                 { return $num_2; }
}

# --------- start main program --------- #

$returned = &return_max_value(1, 2);  # function call
print $returned;                      # print returned value
print "\n";

Scope

Unless declared otherwise, all variables defined in a subroutine become global variables, that is, they become accessible even outside the subroutine. Global variables are a frequent source of program errors. As code grows and becomes more complex, it also becomes more difficult to keep track of which variables are accessed where. Global variables should therefore be avoided wherever possible. If a variable is defined in a subroutine, it can be made private by preceding it with the keyword my.

  my $myVar = 23.5;

This will limit the scope of the variable to the current block of code, for example, a subroutine. The variable will not be accessible from outside the subroutine.
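The difference between a global and a private variable can be seen in the following sketch. The subroutine set_values and both variable names are hypothetical, and the strict pragma is deliberately left off so that the undeclared global compiles.

```perl
# "use strict" is deliberately omitted here so that the undeclared
# global variable below compiles.

sub set_values {
    $shared = "visible everywhere";       # no "my": becomes a global variable
    my $private = "visible only here";    # "my": private to this block
    return $private;
}

set_values();

print "shared:  $shared\n";   # the global set inside the subroutine
# Outside the subroutine, $private names a different, undefined global.
print "private: ", (defined $private ? $private : "(not defined here)"), "\n";
```

With use strict enabled, which is recommended for real programs, the undeclared $shared would be rejected at compile time, so accidental globals of this kind are caught early.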

In addition to global and private variables, Perl supports dynamic scope through the keyword local, which gives a global variable a temporary value.

  local %myHash = (California => "Sacramento", Wisconsin => "Madison");

The temporary value is accessible within the current code block and in all subroutines called within this block. When the block exits, the previous value is restored.

If your global, local, and private variables have the same name, local values take precedence over global ones, and private variables take precedence over both global and local ones within the current block.
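A small sketch may make these rules concrete. All subroutine and variable names here are hypothetical; our simply declares the global variable so that the script also runs under use strict.

```perl
use strict;
use warnings;

our $level = "global";              # a global (package) variable

sub show_level { return $level; }   # always reads the global $level

sub with_local {
    local $level = "local";         # temporary value, seen by called subroutines
    return show_level();
}

sub with_my {
    my $level = "private";          # private variable, invisible to show_level()
    return "$level inside, " . show_level() . " in callee";
}

print with_local(), "\n";   # prints "local"
print with_my(), "\n";      # prints "private inside, global in callee"
print show_level(), "\n";   # prints "global": the local value was restored on exit
```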