The matching operator, m/ /, accepts not only string literals but also character classes. Character classes are groups of characters that share certain syntactic traits. (Syntactic refers here to programming languages.) There are three predefined classes:
NAME | NOTATION | NEGATION | CHARACTERS |
digits | \d | \D | 0-9 |
word characters | \w | \W | a-z, A-Z, 0-9, _ (underscore) |
whitespace characters | \s | \S | space, tab, newline, return, formfeed |
Negation means the logical opposite: \w refers to all word characters, and \W refers to all non-word characters (which includes punctuation marks and similar characters). It is a convention that the negated class is written with the uppercase form of the same letter.
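The table above can be tried out with a short script. This is only a minimal sketch: the helper name classes_found and the sample strings are made up for illustration, and the inputs are plain ASCII.

```perl
use strict;
use warnings;

# Hypothetical helper: lists which predefined classes (and their
# negations) match at least one character of the given string.
sub classes_found {
    my ($text) = @_;
    my @found;
    push @found, 'digit'          if $text =~ m/\d/;
    push @found, 'non-digit'      if $text =~ m/\D/;
    push @found, 'word'           if $text =~ m/\w/;
    push @found, 'non-word'       if $text =~ m/\W/;
    push @found, 'whitespace'     if $text =~ m/\s/;
    push @found, 'non-whitespace' if $text =~ m/\S/;
    return join ', ', @found;
}

print classes_found('42'),   "\n";   # digit, word, non-whitespace
print classes_found('a_b!'), "\n";   # non-digit, word, non-word, non-whitespace
print classes_found(' '),    "\n";   # non-digit, non-word, whitespace
```

Note that a class and its negation can both match the same string, because each test only asks whether at least one such character occurs.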
To test whether a string contains any meaningful printed characters, you could check if it contains any non-whitespace characters:
if ($myString =~ m/\S/) { ... do something ... }
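This, incidentally, is equivalent to
if ($myString !~ m/^\s*$/) { ... do something ... }
which tests that the string does not consist of whitespace only. (The shorter $myString !~ m/\s/ is not equivalent: it is true only if the string contains no whitespace at all.) Which one to use is a matter of choice. There's always more than one way to do it...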
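As a small illustration of the non-whitespace check, the following sketch wraps m/\S/ in a helper and applies it to a few strings. The name has_content and the sample data are assumptions chosen for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the string contains at least one
# non-whitespace character, 0 otherwise.
sub has_content {
    my ($myString) = @_;
    return $myString =~ m/\S/ ? 1 : 0;
}

my @samples = (
    [ 'two words',  "hello world" ],   # contains a space, but \S still matches
    [ 'blank line', " \t\n"       ],   # whitespace only
    [ 'empty',      ""            ],   # nothing to match at all
);

for my $sample (@samples) {
    my ($label, $text) = @$sample;
    print "$label: ", (has_content($text) ? "has content" : "no content"), "\n";
}
```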
Perl also lets you define your own character classes. A typical use is matching a word that has variant spellings. Character classes are defined in square brackets [ ... ]; any character given within the brackets is part of the class.
The following expression substitutes 'grey' or 'gray' with 'pink':
s/gr[ea]y/pink/g;
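The substitution can be tested on a copy of a string, as in this minimal sketch. The helper name paint_pink and the sample sentences are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy of the text in which both
# spellings, 'grey' and 'gray', are replaced by 'pink'.
sub paint_pink {
    my ($text) = @_;
    (my $copy = $text) =~ s/gr[ea]y/pink/g;   # substitute on the copy only
    return $copy;
}

print paint_pink("A grey cat and a gray dog."), "\n";   # A pink cat and a pink dog.
print paint_pink("greyhound"), "\n";                    # pinkhound
print paint_pink("groy"), "\n";                         # groy (o is not in the class)
```

The second call shows that the pattern also matches inside longer words, since nothing in it restricts the match to whole words.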
Consecutive characters can be expressed as ranges:
a-z ==> any lowercase letter
A-Z ==> any uppercase letter
0-9 ==> any digit
$line =~ m/[1-4]/;
The previous expression returns true if $line contains at least one 1, 2, 3, or 4.
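The ranges can be compared side by side with a short sketch. The helper name describe_line and the sample lines are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: reports which of the ranges match at least
# one character of the line.
sub describe_line {
    my ($line) = @_;
    my @traits;
    push @traits, 'lowercase' if $line =~ m/[a-z]/;
    push @traits, 'uppercase' if $line =~ m/[A-Z]/;
    push @traits, 'digit'     if $line =~ m/[0-9]/;
    push @traits, '1 to 4'    if $line =~ m/[1-4]/;
    return join ', ', @traits;
}

print describe_line("Room 3"), "\n";   # lowercase, uppercase, digit, 1 to 4
print describe_line("room 7"), "\n";   # lowercase, digit
print describe_line("ABC"),    "\n";   # uppercase
```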
Negation in character classes is expressed by a caret, ^.
$line =~ m/[^1-4]/;
This expression returns true if $line contains at least one character that is not a 1, 2, 3, or 4.
Note: The caret is also used as an anchor that symbolizes the beginning of a line.
$line =~ m/^[1-4]/;
The meaning of this expression is very different: it returns true if $line starts with a 1, 2, 3, or 4. Run the script.
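To make the two roles of the caret concrete, the following sketch runs both patterns against the same inputs. The helper names and the sample strings are hypothetical.

```perl
use strict;
use warnings;

# Caret inside the brackets: negated class.
# Hypothetical helper: 1 if the line contains a character other than 1-4.
sub has_other_char {
    my ($line) = @_;
    return $line =~ m/[^1-4]/ ? 1 : 0;
}

# Caret outside the brackets: anchor.
# Hypothetical helper: 1 if the line starts with 1, 2, 3, or 4.
sub starts_with_1_to_4 {
    my ($line) = @_;
    return $line =~ m/^[1-4]/ ? 1 : 0;
}

for my $line ("1234", "42 apples", "x1") {
    printf "%-10s negated class: %d, anchor: %d\n",
        $line, has_other_char($line), starts_with_1_to_4($line);
}
# 1234       negated class: 0, anchor: 1
# 42 apples  negated class: 1, anchor: 1
# x1         negated class: 1, anchor: 0
```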