CS10100 String Functions

Perl String Functions

String: Changing Case

There are four functions that change the case of a string:

lc( )      -- changes a string to lower case

lcfirst( ) -- changes the first character of a string to lower case

uc( )      -- changes a string to upper case

ucfirst( ) -- changes the first character of a string to upper case

Each function takes a string as its argument and returns an altered copy of it; the original string is not changed.

Example:

$myVar = "hello world!";

print uc($myVar) . "\n"; # will print "HELLO WORLD!"

print ucfirst($myVar) . "\n"; # will print "Hello world!"
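The two lower-case functions work the same way. The following is a minimal sketch; the variable names are chosen only for illustration. It also shows that the original string keeps its value after the calls.

```perl
use strict;
use warnings;

my $greeting = "Hello World!";

# lc() lowers every character, lcfirst() only the first one
my $lower       = lc($greeting);
my $lower_first = lcfirst($greeting);

print $lower . "\n";        # will print "hello world!"
print $lower_first . "\n";  # will print "hello World!"

# the functions return a copy, so $greeting still holds its original value
print $greeting . "\n";     # will print "Hello World!"
```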

String: Length and Substrings

To get the length of a string, simply call the length( ) function; it returns an integer value which represents the number of characters:

$num = length("this is a string");
print $num . "\n";

will print 16.

Accessing a substring of a string is done through the substr( ) function. This function takes a list of three arguments: a string literal or variable, an integer that represents the offset of the first character of the substring, and an integer that specifies the length of the substring.

print substr("Hello",2,2) . "\n";

will print "ll".

Like arrays, strings are indexed starting with 0. Hence, a substring of length 1 starting at index 0

print substr("Hello",0,1) . "\n";

will print the first character, "H".

If the length argument is omitted, the rest of the string is returned, starting from the offset.
A negative offset counts backward from the end of the string, so -1 refers to the last character.
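A short sketch of both cases, using the same "Hello" string as above (the variable names are illustrative):

```perl
use strict;
use warnings;

my $word = "Hello";

# length omitted: everything from offset 2 to the end
my $rest = substr($word, 2);

# negative offset: start 3 characters before the end
my $tail = substr($word, -3);

# negative offset combined with a length of 2
my $mid = substr($word, -3, 2);

print $rest . "\n";   # will print "llo"
print $tail . "\n";   # will print "llo"
print $mid . "\n";    # will print "ll"
```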

split(/ pattern /, string)

split( ) takes two arguments: a pattern and a string; it returns a list, which is usually assigned to an array.

The function searches the string for the occurrences of the pattern. If it encounters one, it puts the string section before the pattern into an array and continues to search the string until it reaches the end of the string. Every time the pattern is found, the string preceding it is pushed onto the array. The pattern is discarded.

When the function reaches the end of the string, whatever is left of the string and is not the pattern will be pushed onto the array. If there was no match, the entire string will be the only element of the returned array. (That is, the length of the array will be 1.)
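The two cases described above can be sketched as follows. The sample string and the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $colors = "red,green,blue";

# the comma occurs twice, so the string is cut into three pieces
my @fields = split(/,/, $colors);

# a semicolon never occurs, so the whole string comes back as one element
my @whole = split(/;/, $colors);

print scalar(@fields) . "\n";   # will print 3
print $fields[1] . "\n";        # will print "green" (the commas are discarded)
print scalar(@whole) . "\n";    # will print 1
print $whole[0] . "\n";         # will print "red,green,blue"
```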

Example Script

Copy the text of the script and run it in Perl. The pattern is initially set to a single space character. Change it to different patterns to see how the function behaves.

# enter 'control c' to quit the script

print "\n\n>>> "; # this is just decoration to give the user a prompt
while (<STDIN>)
{
     # strip input of newline character
     chomp($_);

     # put input into var to free default input space
     $input = $_;

     # now split the input line according to
     # the pattern in '/ /', which is a space character
     @splitArray = split(/ /, $input);

     print "\n";
     # print the array returned after splitting
     foreach $element (@splitArray)
     {
          print $element . "\n";
     }
     print "\n>>> "; # user prompt again
}
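As one possible experiment with the script above, the sketch below applies three different patterns to the same line. The sample line and the variable names are assumptions made for illustration. The pattern \s+ matches one or more whitespace characters.

```perl
use strict;
use warnings;

my $line = "one  two,three";   # note the two spaces after "one"

# a single space: the two adjacent spaces produce an empty element between them
my @by_space = split(/ /, $line);

# one or more whitespace characters: the double space counts as one separator
my @by_white = split(/\s+/, $line);

# a comma: the spaces are now ordinary characters and stay in the pieces
my @by_comma = split(/,/, $line);

print scalar(@by_space) . "\n";   # will print 3
print scalar(@by_white) . "\n";   # will print 2
print $by_comma[0] . "\n";        # will print "one  two"
```

If empty elements show up in your own experiments, adjacent separators are the likely cause.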

join("string", list)

You may also want to do the reverse: take an array or list of values and turn it into a single string with a separator between the values. This can be achieved with join( ), which takes two arguments: a separator string and a list. It returns a single string that contains the items of the list, separated by the function's first argument.

@myArray = ("Tom", "Dick", "Harry");
$s       = join(':', @myArray);
print $s;

will print

Tom:Dick:Harry

The following script will accept keyboard input, split it on a space character, and join the resulting array on a colon. (It's not a very efficient way of replacing white space characters, but it serves to demonstrate the point.)

# enter 'control c' to quit the script

print "\n\n>>> "; # this is just decoration to give the user a prompt
while (<STDIN>)
{
     # strip input of newline character
     chomp($_);

     # put input into var to free default input space
     $input = $_;

     # now split the input line according to
     # the pattern in '/ /', which is a space character
     @splitArray = split(/ /, $input);
     
     # now join the pieces with a colon
     $joined = join(':', @splitArray);

     print "\n";
     # print the joined array
     print $joined . "\n";
     
     print "\n>>> "; # user prompt again
}

chop( ) and chomp( )

If chop( ) is given a string, it removes the last character of that string, whatever it is. chomp( ) removes the last character ONLY if it is a newline character.

$str1 = "Hello";
$str2 = "Hello\n";

print "\n";

print 'Length of "Hello"   is:  ' . length($str1) . "\n"; # length = 5
print 'Length of "Hello\n" is:  ' . length($str2) . "\n"; # length = 6

print "chop, chop...\n";
# chop() removes the last character from a string
chop($str1);              
chop($str2);
print 'Length of "Hello"   is:  ' . length($str1) . " = $str1\n"; # length = 4
print 'Length of "Hello\n" is:  ' . length($str2) . " = $str2\n"; # length = 5

print "\n\n";

$str1 = "Hello";
$str2 = "Hello\n";

print 'Length of "Hello"   is:  ' . length($str1) . "\n"; # length = 5
print 'Length of "Hello\n" is:  ' . length($str2) . "\n"; # length = 6

# chomp() removes only a trailing newline character from a string
print "chomp, chomp...\n";
chomp($str1);
chomp($str2);
print 'Length of "Hello"   is:  ' . length($str1) . " = $str1\n"; # length = 5
print 'Length of "Hello\n" is:  ' . length($str2) . " = $str2\n"; # length = 5

print "\n";

Many times you will want to remove the newline character at the end of a text line. For this, chomp() is a safer method since it only removes newline characters and leaves other characters alone. It returns the number of characters removed.
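The return value can be used to check whether a line actually ended in a newline. The following is a minimal sketch with illustrative variable names; it assumes Perl's default settings, under which chomp( ) looks for "\n".

```perl
use strict;
use warnings;

my $with_newline    = "Hello\n";
my $without_newline = "Hello";

# chomp() changes the variable in place and returns the number of characters removed
my $removed_a = chomp($with_newline);
my $removed_b = chomp($without_newline);

print $removed_a . "\n";   # will print 1
print $removed_b . "\n";   # will print 0
```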

If chop( ) and chomp( ) are given an array of strings, they remove the last character or newline character, respectively, of each string in the array.

print "\n";

@arr1 = ("aa", "ba", "ca");
@arr2 = ("aa\n", "ba\n", "ca\n");

print "@arr1\n";    # result: aa ba ca

print "@arr2\n";    # result: aa
                    #          ba
                    #          ca

print "chop, chop...\n";
chop(@arr1);        # result: a b c
chop(@arr2);        # result: aa ba ca

print "@arr1\n";
print "@arr2\n";

print "\n";

@arr1 = ("aa", "ba", "ca");
@arr2 = ("aa\n", "ba\n", "ca\n");

print "@arr1\n";    # result: aa ba ca

print "@arr2\n";    # result: aa
                    #          ba
                    #          ca

print "chomp, chomp...\n";
chomp(@arr1);        # result: aa ba ca
chomp(@arr2);        # result: aa ba ca

print "@arr1\n";
print "@arr2\n";


print "\n";