Perl Basics 1 -- The Perl Language

Components of a Programming Language

The components of a programming language are:

data types

The data types we are going to deal with most often are numbers and strings. The most common kinds of numbers are integers (1, 2, 3, etc.) and reals (1.1, 2.3, 0.4, etc.).

A string is a series of zero or more characters, usually enclosed in quotation marks. For example, "This is a string." is a string, and "" is a string of zero characters.

Characters are not only alphabetical characters, but all those that belong to the character set the computer or program uses (ASCII, Unicode, etc.). Character sets include characters with special meanings that have no alphabetical representation, such as tabs, newlines, quotation marks, etc.
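As a minimal sketch of these data types in Perl, the lines below store an integer, a real, a string, and the empty string, and write special characters as escape sequences. The variable names are made up for illustration, and the sketch assumes the strict and warnings pragmas, which is why each variable is introduced with my.

```perl
use strict;
use warnings;

# Numbers: an integer and a real (floating-point) value.
my $count = 3;
my $ratio = 0.4;

# Strings: a non-empty string and the empty string.
my $text  = "This is a string.";
my $empty = "";

# Special characters are written as escape sequences inside double quotes.
my $special = "tab:\t quote:\" newline:\n";

print "count=$count ratio=$ratio\n";
print "length of text:  ", length($text), "\n";    # 17
print "length of empty: ", length($empty), "\n";   # 0
print $special;
```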

operators

Operators work on data. An operator typically takes one or more data items and produces a result.
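A short sketch of operators taking data items and producing results in Perl. The variable names are illustrative only, and the selection of operators is an assumption about what is useful to show here, not a complete list.

```perl
use strict;
use warnings;

my $sum      = 3 + 4;           # binary arithmetic operator: 7
my $negated  = -$sum;           # unary minus, takes one operand: -7
my $joined   = "foo" . "bar";   # string concatenation: "foobar"
my $repeated = "ab" x 3;        # string repetition: "ababab"
my $is_less  = 3 < 4;           # numeric comparison: true (1)

print "$sum $negated $joined $repeated $is_less\n";
# prints "7 -7 foobar ababab 1"
```

Note that Perl uses different operators for numbers (+) and strings (.), so the operator chosen signals which kind of data is expected.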

Since there are different data types, a given operator may not be appropriate for every data type. (Mathematical operators, for example, are not meant for strings.) Common operator types are:

variables

Variables are temporary storage places for data. A variable must be explicitly named according to the conventions of the language.
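In Perl, for instance, the name of a single-value (scalar) variable starts with a dollar sign, followed by a letter or underscore and then letters, digits, or underscores. The sketch below uses made-up names and assumes the strict pragma, hence the my keyword.

```perl
use strict;
use warnings;

my $total     = 10;
my $Total     = 20;        # names are case-sensitive: not the same as $total
my $user_name = "pat";     # underscores are allowed in names

$total = $total + 5;       # the stored value can be replaced at any time

print "$total $Total $user_name\n";   # prints "15 20 pat"
```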

reserved words

Every programming language contains a number of words that have a special meaning within the language syntax, for example if, else, and while. These words are reserved for the exclusive use of the language. Using one of them for another purpose is a programming error, and the program won't execute.

data structures

Data structures are collections of data that can be accessed in a certain way. Some data structures are defined within the language, such as various forms of lists (arrays, hash tables, etc.). Other data structures can be provided by programming libraries or defined by the programmer.
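The two built-in structures mentioned above can be sketched in Perl as follows. The sample data and variable names are hypothetical.

```perl
use strict;
use warnings;

# An array is an ordered list, accessed by numeric index (starting at 0).
my @colors = ("red", "green", "blue");
my $first  = $colors[0];        # "red"
my $count  = scalar @colors;    # number of elements: 3

# A hash is a set of key/value pairs, accessed by key.
my %age = (alice => 30, bob => 25);
my $bob = $age{bob};            # 25

print "$first $count $bob\n";   # prints "red 3 25"
```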

functions

Functions, sometimes called subroutines, are blocks of code that perform certain programming tasks. Tasks that a program must perform repeatedly are typically defined as functions. Thus, they are chunks of reusable code that can be 'called' by a program when needed.

Programming languages include sets of internal (built-in) functions that perform the most common actions and make programming easier. Typical actions handled by such functions are printing output and opening, reading, and closing files. Programmers may also write their own functions. These are called user-defined functions to distinguish them from the internal functions provided by the language.

Functions must be named, and usually their name must be unique. Naming conflicts result in programming errors.
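A minimal sketch of the difference between the two kinds of functions in Perl. The function max_of_two is a hypothetical user-defined function written for this example; uc and print are built-in.

```perl
use strict;
use warnings;

# User-defined function: returns the larger of two numbers.
sub max_of_two {
    my ($x, $y) = @_;           # arguments arrive in the array @_
    return $x > $y ? $x : $y;
}

my $larger = max_of_two(3, 8);  # call the user-defined function
my $shout  = uc("hello");       # call a built-in function

print "$larger $shout\n";       # prints "8 HELLO"
```

Once defined, max_of_two can be called from anywhere in the program, as often as needed.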

Expressions and Statements

Expressions

An expression is a series of variables, operators, and functions that are evaluated and result in a single value. An expression must be constructed according to the syntax of the language.

A program 'works' by evaluating expressions. Typical expressions assign a value to a variable, compute a value, or help to control the program flow.

    $var    = 5;                  # assignment
    5 + $var;                     # compute a value
    if ($result > $var) { ... }   # evaluate to aid flow control

Each expression performs the computation indicated by its elements and returns a single result. The result of the first case is the value assigned to $var, 5. In the second case, the expression evaluates to 10. In the third case, the condition $result > $var evaluates to 'true' or 'false'.
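To make these results visible, the sketch below captures the value of each of the three expressions in a variable. The names $assigned, $computed, and $compared, and the value 7 given to $result, are assumptions made for illustration.

```perl
use strict;
use warnings;

my $var;
my $assigned = ($var = 5);         # an assignment yields the assigned value: 5
my $computed = 5 + $var;           # 10
my $result   = 7;                  # hypothetical value for the comparison
my $compared = ($result > $var);   # true, which Perl represents as 1

print "$assigned $computed $compared\n";   # prints "5 10 1"
```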

Compound Expressions

Expressions can be combined into compound expressions as long as the data types required by the operators in the expressions are correct. To make explicit which expression should be evaluated first, put sub-expressions in parentheses. Expressions in the innermost parentheses are evaluated first.

# addition will be computed before multiplication
$var = 5 * (6 + 3);

Statements

A statement is the equivalent of a sentence in a natural language. In a programming language, a statement is a complete unit of execution. It evaluates expressions for their side effects, and the side effects persist after the statement has executed.

Perl statements must be terminated by a semicolon.

    $var    = 5;                 # assignment statement
    $var++;                      # increment statement (increments $var by 1)
    print "Hello, world!";       # function call
    if ($result > $var) { ...; } # flow control statement

Compound Statements (Statement Blocks)

Statements can be combined into blocks. In fact, some constructs require statements to be placed in blocks. In Perl, statement blocks are enclosed in curly braces, { ... }. The statements enclosed in a block are executed in sequence. The braces also establish the scope of the block. Certain things, such as variables declared with my, are only valid within the scope of a block.
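A small sketch of block scope, assuming variables are declared with my. The variable name $message is illustrative. The inner declaration exists only between the braces and leaves the outer variable untouched.

```perl
use strict;
use warnings;

my $message = "outer";

{
    # This $message is a separate variable, valid only inside the braces.
    my $message = "inner";
    print "$message\n";    # prints "inner"
}

print "$message\n";        # prints "outer"
```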

Program Structure

Indentation
Perl is mostly a free-form language. There are no rules for indentation or newlines.
Whitespace
Whitespace is required only between items that would otherwise be confused as a single term. Spaces, tabs, newlines, etc., are considered equivalent in this context. They are distinguished, however, in quoted strings and input formatting constructs.
Semicolons
Every simple statement must end with a semicolon. Statement blocks do not require a semicolon after the closing brace.
Declarations
By default, variables do not need to be declared and can be created at any time in a script. Only subroutines (functions) and report formats need declarations.
Comments
A pound sign (#) signifies the beginning of a comment. Everything from the sign to the end of the line will be ignored.
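The structure rules above can be seen together in one short sketch. It deliberately omits the strict pragma so that variables can be used without declaration, as described under Declarations. All names are made up for illustration.

```perl
# Variables spring into existence on first use (no 'use strict' here).
$greeting = "Hello";

# Whitespace and newlines between terms do not matter:
$a_sum = 1 + 2;
$b_sum
    =   1
    +   2 ;

# A block needs no semicolon after its closing brace.
if ($a_sum == $b_sum) {
    print "$greeting, the sums match\n";   # everything after '#' is ignored
}

# Subroutines are declared with 'sub'.
sub shout { return uc($_[0]) . "!"; }
print shout($greeting), "\n";              # prints "HELLO!"
```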