Perl Page #1: An Introduction to Perl-Programming


There's strictly no warranty for the correctness of this text. You use any of the information provided here at your own risk.


Contents:

  1. About Perl
  2. How to execute a Perl Script
  3. Header of a Perl Script
  4. Scalar Variables
  5. Strings
  6. for- and while-Loops
  7. if-Statements
  8. Input on the Command Line. Cookie-Monster Script
  9. Lists / Arrays
  10. Sorting Lists
  11. Hashes
  12. Functions
  13. Using Modules
  14. Regular Expressions (RegEx)
  15. File-Operations
  16. Going Further: References
  17. Anonymous Arrays and Hashes. Arrays of Arrays (AoA), Hashes of Hashes (HoH)
  18. Sorting "Lists of Lists" and Objects
  19. Object-Oriented Programming


1. About Perl

Perl is an interpreted programming language created by Larry Wall. Development began in 1987 (Perl 1.0) and continues until today (2021: Perl 5.34).
Larry Wall is a programmer with a linguistic background. In 1987, he had to process large quantities of text for a project he was working on. Doing that in C turned out to be uncomfortable, so he created a programming language that made the task easier. Later he released that language to the public.
In Perl ("Practical Extraction and Report Language"), elements of C were combined with those of shells like bash and shell tools like awk and sed.

Perl's mascot is a camel (actually a dromedary), which is also depicted on the cover of the book "Programming Perl" (described in "perldoc perlbook").
While the executable of the interpreter is called "perl" (in lower-case letters), the language itself is called "Perl" (with an upper-case letter).

Perl follows the motto "There's more than one way to do it" (TMTOWTDI).
Its design borrows principles from natural (human) languages.
So when writing Perl, programmers have a relatively wide range of ways to express themselves compared to other programming languages.
Perl gives developers few rules on how to write their code. As a result, Perl code written by different people can look rather different, and it may be difficult to understand Perl code written by somebody else.
Some say Perl code also looks ugly.

Perl provides flexible, dynamic datatypes: There are

  • Scalar variables ("$a"): These contain text or numbers (integers or floating-point numbers, which are converted automatically),
  • Arrays (lists) ("@a"): These contain several scalars, addressed by their position in the list ("$a[4]"), and
  • Hashes ("%a"): These contain several scalars, addressed by keywords defined by the programmer ("$a{somekey}").

These datatypes are quite easy to handle, and Perl also takes care of memory management automatically. The price is that Perl code usually runs slower than C code doing the same thing. It is also not possible to automatically translate Perl code to C first and then compile and run it as a C program, just in case you wonder.
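
A minimal sketch of the three datatypes listed above. The variable names and values are made up for illustration:

```perl
#!/usr/bin/perl

use warnings;
use strict;

# A scalar holds a single value (text or a number):
my $fruit = "apple";

# An array holds several scalars, addressed by position (counting from 0):
my @fruits = ("apple", "banana", "peach");

# A hash holds several scalars, addressed by key:
my %color = (apple => "red", banana => "yellow");

print "$fruit\n";          # apple
print "$fruits[2]\n";      # peach
print "$color{banana}\n";  # yellow
```

Notice that single elements of an array or a hash are scalars, so they are written with the sigil "$".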

    Some time ago, Perl was widely used for generating dynamic websites, using the module "CGI". Today, PHP and Python may be used more often for this task.

    Object-oriented programming has only been possible since Perl 5, which was originally released in 1994. As it was added later, it may feel like something that has been put on top of the original concept.

    Python on the other hand had object oriented programming from the beginning. Python code also looks much cleaner than Perl code, as Perl uses sigils like "$", "@", "%" and curly brackets "{ ... }" all the time, while Python does without them. Therefore many developers prefer writing larger software projects in Python. Perl still has its strengths when used for smaller scripts though.

    There's also a large number of modules for almost all tasks available for Perl. They can be found in the "CPAN" ("Comprehensive Perl Archive Network").

    While Perl 5 was designed by Larry Wall, the next version (Perl 6) was supposed to be designed by the community. Discussion started in 2000, but progress was slow and the project was stuck for many years. In the end, Perl 6 became a different language from Perl 5. In 2019, "Perl 6" was renamed to "Raku".
    Perl 7 was announced in 2020 and is planned to stay close to Perl 5.

    I really like the book about Perl 5 called "Sams Teach Yourself Perl in 21 Days" by Laura Lemay (2002). The German translation is called "Perl in 21 Tagen". There are several books with this title, be sure to get the one written by Laura Lemay.

    Help can also be found using the "perldoc" shell command. It is run with an argument, for example

    perldoc perlintro

    Many of these arguments are listed in

    man perl

    After getting to know Perl a little better, I found especially these pages useful:

    perldoc perlrun
    perldoc perlfunc
    perldoc perlvar
    perldoc Tk
    perldoc Tk::UserGuide
    perldoc perltoc

    If you use the "-f" option of "perldoc", you can get information on a certain function directly, for example

    perldoc -f split

    More books and tutorials about Perl are mentioned in

    perldoc perlfaq2


    2. How to execute a Perl Script

    Executing a Perl script isn't difficult: Copy and paste its code into a text editor.
    On Windows, "Notepad.exe" will do, but you can also use an IDE ("Integrated Development Environment") such as "Geany".
    On Linux, I suggest learning how to use one of the editors "vim" or "emacs", but "kate", "gedit" or "kwrite" will do, too.
    Then save the script as, for example, "script.pl".
    On Windows, you can then run the script by double-clicking it. If it's a GUI application, you may want to get rid of the console window that is usually opened. You can achieve this by creating a shortcut to your script that uses "C:\Perl\bin\wperl.exe" as its application.
    In a console window ("Command Prompt") you can run your script by typing

    perl script.pl

    in its directory. This works on Linux as well. To run the script there directly, you have to make it executable first. You can do that by executing

    chmod +x script.pl

    After that you can run it by doing

    ./script.pl

    in its directory.

    Note that the script is first compiled into an internal representation (a kind of bytecode) and only executed after the compilation has completed successfully. This usually takes just a few milliseconds.

    It is also possible to run small pieces of Perl code, so-called "one-liners", in a Linux shell with perl's "-e" switch, for example:

    perl -e 'print "Hello World\n";'


    3. Header of a Perl Script

    On Linux, the system has to be told where the Perl interpreter can be found. This is done by a special first line in the script (the "shebang" line, shown below).
    The pragmas "warnings" and "strict" should be used. "strict" forces you, among other things, to declare variables before using them, and "warnings" reports code that is probably a mistake. Every script should be written in a way that it runs without errors and warnings while both are active. That's why a Perl script should have this header:

    #!/usr/bin/perl
    
    use warnings;
    use strict;


    4. Scalar Variables

    Perl (we're always talking about Perl 5 here) uses the same kind of variable for numbers and for strings. Such a variable is called a "scalar variable". Its name starts with the sigil "$". With "use strict", the keyword "my" has to be used to declare such a variable. Perl statements end with a semicolon (like C statements). So scalar variables can be defined in a script like this:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    my $a = 10;
    my $b = "Hello";
    print "$a\n";
    print "$b\n";

    The "\n" is the newline character. It can be used in "print" commands. Since Perl 5.10 there is also a command "say", which prints its arguments and automatically adds a newline character. It has to be enabled first, for example with "use feature 'say';".
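
A small sketch of "say" next to "print". It assumes Perl 5.10 or newer, where "say" can be switched on with "use feature"; the variable name is made up:

```perl
#!/usr/bin/perl

use warnings;
use strict;
use feature 'say';

my $greeting = "Hello";

# These two lines print exactly the same output:
print "$greeting\n";
say $greeting;
```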


    5. Strings

    You can store a string in a scalar variable. Here's an example of what can be done with strings:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    my $a = "Hello";
    my $b = "World";
    
    # Strings can be concatenated using the "."-operator:
    my $c = $a . " " . $b;
    
    # Scalar variables in quotation marks are expanded,
    # similar to the expansion mechanism of shells:
    print "$c\n";
    
    # You can get the length of a string like this:
    my $d = length($c);
    print "$d\n";
    
    # Substrings of a string can be extracted:
    my $e = substr($a, 0, 2);
    print "$e\n";
    
    # The substring "rl" is found at this position in $c (counting from 0):
    print index($c, "rl") . "\n";
    

    Search and replace operations are done using "regular expressions", which are described later.
    Some more string functions:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    print "'a' has an ASCII-code of " . ord("a") . ".\n";
    print "The character with ASCII-code 69 is '" . chr(69) . "'.\n";
    print "\n";
    
    # If you want to get rid of the last character of a string,
    # you can use "chop()":
    my $a = "streets";
    chop($a);
    print "$a\n";
    
    # "chomp()" is similar, but it only removes a "\n" at the end:
    my $b = "Hello!\n";
    print "$b";
    chomp($b);
    print "$b\n";
    
    # The "x"-operator repeats a string (the number goes on the right-hand side):
    my $c = "-" x 54;
    print "$c\n";
    


    6. for- and while-Loops

    Here's an example of how to create for- and while-loops in Perl:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    my $i;
    
    # A for-loop can be created similar to those in C:
    for ($i=1; $i<=10; $i++) {
        print "$i\n";
    }
    print "\n";
    
    # Another method is even more common in Perl:
    
    for $i (1 .. 10) {
        print "$i\n";
    }
    print "\n";
    
    # while-loops are possible as well:
    
    $i = 1;
    while ($i <= 10) {
        print "$i\n";
        $i++;
    }

    $i++; is the same as

    $i = $i + 1;

    "$i += 1;" is also possible.

    Note: "for $i (0 .. 10) {}" also includes the 10. This is different from "for i in range(10)" in Python, which stops at 9.
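
To illustrate the note about ranges, here is a short sketch (the array "@numbers" is only there to collect the values):

```perl
#!/usr/bin/perl

use warnings;
use strict;

# Perl's range operator includes both ends, so "0 .. 9" yields
# the same ten numbers as "range(10)" in Python:
my @numbers;
for my $i (0 .. 9) {
    push(@numbers, $i);
}

# Prints "10 numbers, from 0 to 9":
print scalar(@numbers) . " numbers, from $numbers[0] to $numbers[-1]\n";
```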


    7. if-Statements

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    my $a = 1;
    my $b = 2;
    
    # Equal numbers:
    if ($a == 1) {
        print "\$a is 1.\n";
    }
    
    # Not equal numbers:
    if ($a != 10) {
        print "\$a is not 10.\n";
    }
    
    # Greater than:
    if ($b > $a) {
        print "\$b is greater than \$a.\n";
    }
    
    # Smaller than and logical AND:
    if (($a < $b) && ($b == 2)) {
        print "\$a is less than \$b, and \$b is 2.\n";
    }
    
    # Logical OR:
    if (($a < $b) || ($b == 1)) {
        print "\$a is less than \$b, or \$b is 1.\n";
    }
    
    # if, else:
    if ($a == 10) {
        print "\$a is 10.\n";
    } else {
        print "\$a is not 10.\n";
    }
    
    # if, elsif, else:
    if ($a == 5) {
        print "\$a is 5.\n";
    } elsif ($a == 1) {
        print "\$a is 1.\n";
    } else {
        print "\$a is not 5 and not 1.\n";
    }
    
    # Strings: If you want to test strings, you have to use
    # "eq" and "ne" instead of "==" and "!=":
    
    my $c = "hello";
    
    if ($c eq "hello") {
        print "\$c is 'hello'.\n";
    }
    
    if ($c ne "hello") {
        print "\$c is not 'hello'.\n";
    }
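
Why the choice between "==" and "eq" matters can be seen in this small sketch (the values are made up). Both messages are printed, because the two strings have the same numeric value but consist of different characters:

```perl
#!/usr/bin/perl

use warnings;
use strict;

my $x = "10";
my $y = "10.0";

# "==" compares the numeric values:
if ($x == $y) {
    print "\$x and \$y are equal as numbers.\n";
}

# "ne" compares character by character:
if ($x ne $y) {
    print "\$x and \$y are different as strings.\n";
}
```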


    8. Input on the Command Line. Cookie-Monster Script

    This is a nice example script from Laura Lemay's book mentioned above:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    # Cookie Monster
    
    my $cookies = "";
    while ($cookies ne "COOKIES") {
        print 'I want COOKIES: ';
        $cookies = <STDIN>;
        chomp($cookies);
    }
    
    print "Mmmm. COOKIES.\n";
    

    The line

    $cookies = <STDIN>;

    reads one line of input and assigns it to a (scalar) variable. "STDIN" is a so-called "handle" for the standard input. Programs have a standard input "STDIN" from which they read, and a standard output "STDOUT" to which they write, for example when something is printed. (And there's also a handle "STDERR", to which error messages are printed.)
    Here, "STDIN" could even have been left out, like this:

    $cookies = <>;

    because this empty "diamond operator" reads from the files named on the command line, or from STDIN if no files were given.

    The function "chomp()" removes a final newline character from a string, if there is one at its end. In the script, the line read with the diamond operator includes the newline character at its end, so this character has to be removed from the variable again using "chomp()".
    There is also a function "chop()", which removes the last character from a string, whatever it is.
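
The difference between the two functions, as a minimal sketch. The helper functions "chomped()" and "chopped()" are made up for this example; they just return a changed copy of their argument:

```perl
#!/usr/bin/perl

use warnings;
use strict;

sub chomped {
    my $s = shift;
    chomp($s);
    return $s;
}

sub chopped {
    my $s = shift;
    chop($s);
    return $s;
}

print chomped("COOKIES\n") . "|\n";   # COOKIES|  (newline removed)
print chomped("COOKIES") . "|\n";     # COOKIES|  (nothing to remove)
print chopped("COOKIES") . "|\n";     # COOKIE|   (last character removed)
```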


    9. Lists / Arrays

    Lists contain a number of elements. In Perl, lists are stored in array variables, which have the sigil "@". Here's an example of what can be done with arrays:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    my @a = ("apple", "banana");
    
    # Add an element at the end of the list:
    push(@a, "peach");
    
    # Create a loop, that iterates through the list (printing it).
    # The "for" is short for "foreach" here:
    my $i;
    for $i (@a) {
        print "$i\n";
    }
    print "\n";
    
    # Remove the last element of the list and return it:
    my $l = pop(@a);
    print "$l\n";
    
    # Remove the first element of the list and return it:
    my $f = shift(@a);
    print "$f\n";
    
    # Add an element at the beginning of the list:
    unshift(@a, "peach");
    unshift(@a, "apple");
    
    print "\n";
    
    # Iterate through the list again:
    for $i (@a) {
        print "$i\n";
    }
    print "\n";
    
    # Get the index of the last element (the number of elements minus 1):
    my $n = $#a;
    print "$n\n\n";
    
    # Iterate over the element numbers:
    for $i (0 .. $#a) {
        print "$i\n";
    }
    print "\n";
    
    # Extract an element in the middle of the list:
    my $e = splice(@a, 1, 1);
    print "$e\n\n";
    
    # Add an element in the middle of the list:
    splice(@a, 1, 0, "cherry");
    
    # Iterate through the list again:
    for $i (@a) {
        print "$i\n";
    }
    print "\n";
    
    # Access an element by element number.
    # Notice that the element numbers are in the range from 0 to the number of elements minus 1,
    # so "$a[1]" is the second element of the list:
    print $a[1] . "\n";
    $a[1] = "strawberry";
    print $a[1] . "\n";
    

    Arrays with single-word elements can be defined more quickly using the "qw" (= "quote words") operator. Quotation marks and commas can then be left out. So, instead of:

    my @a = ("apple", "banana", "peach");

    you can also write:

    my @a = qw(apple banana peach);

    split() and join(): A string can be split into a list at a given pattern using the function "split()".
    A list can be joined together into a string, with a given separator string between the elements, using the function "join()":

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    my $a = "This is a line of text.";
    
    # We create a list "@b" by splitting "$a" at " ":
    my @b = split(" ", $a);
    
    my $i;
    for $i (@b) {
        print "$i\n";
    }
    print "\n";
    
    # We join the elements of "@b" together to a string "$c",
    # using ";" as the connecting string:
    
    my $c = join(";", @b);
    print "$c\n";

    Notes on the functions "split()" and "join()":
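
Two things worth knowing, shown as a short sketch with made-up data: the first argument of "split()" is a pattern (a regular expression, see below), and an optional third argument limits the number of pieces. The first argument of "join()" is just a plain string:

```perl
#!/usr/bin/perl

use warnings;
use strict;

my $line = "apple, banana,peach";

# Split at a comma, followed by optional whitespace:
my @fruits = split(/,\s*/, $line);
print join("|", @fruits) . "\n";    # apple|banana|peach

# With a limit of 2, the rest of the string stays in one piece:
my @two = split(/,\s*/, $line, 2);
print scalar(@two) . "\n";          # 2
print "$two[1]\n";                  # banana,peach
```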


    10. Sorting Lists

    You can sort a list "@a" (in ascending order, comparing the elements as strings) just by doing:

    @a = sort(@a);

    In most cases this is enough, and you can stop reading here.
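
One case where it is not enough is a list of numbers, because a plain "sort()" compares the elements as strings. A small sketch with made-up numbers, using the numeric comparison block that is described further below:

```perl
#!/usr/bin/perl

use warnings;
use strict;

my @numbers = (10, 9, 100, 1);

my @as_strings = sort(@numbers);
my @as_numbers = sort { $a <=> $b } @numbers;

print "@as_strings\n";   # 1 10 100 9
print "@as_numbers\n";   # 1 9 10 100
```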



    Just as a reference for myself, I write down several more advanced Perl sorting operations here:

    1. List-sort by string:

    my @l = qw(b c e d f a);
    @l = sort {$a cmp $b} @l;

    The "$a" and "$b" inside the sort block are special variables, internal to Perl. They hold the two elements that are currently being compared.

    2. List-sort by number:

    my @l = qw(2 3 5 4 6 1);
    @l = sort {$a <=> $b} @l;

    3. Reverse sort:

    my @l = qw(b c e d f a);
    my @n = qw(2 3 5 4 6 1);
    
    # By string:
    @l = sort {$b cmp $a} @l;
    
    # By number (on a list of numbers):
    @n = sort {$b <=> $a} @n;


    11. Hashes

    A hash is a kind of dictionary. It holds pairs of "keys" and "values". It has the sigil "%".
    The keys of a hash can be written with quotation marks like other strings. But if a key is a simple word, the quotation marks can exceptionally be left out.

    The function "keys()" returns the keys of the hash as a list. You can then iterate over that list to access all pairs of "keys" and "values" of the hash. There's also a corresponding function "values()", but it isn't used that often.
    The function "exists()" tells you whether a key exists in a hash.
    Notice that, unlike in lists, the pairs in a hash are not kept in any particular order. You can sort the list returned by the "keys()" function though, using the "sort()" function.

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    # A hash can be defined like this:
    my %h = (name => "Steve",
             age  => 29,
             job  => "doctor");
    
    # Print all keys and values of the hash,
    # using the functions "keys()" and "sort()":
    my $i;
    for $i (sort(keys(%h))) {
        print "$i \t $h{$i}\n";
    }
    
    print "\nThis is the value for 'name':\n";
    print $h{name} . "\n";
    
    print "\nSteve changes his job.\n";
    $h{job} = "architect";
    print "His job is now: " . $h{job} . ".\n";
    

    Programming beginners tend to avoid hashes at first. But they are especially useful when you want to count the occurrences of elements in a list, or when you want to remove multiple occurrences of elements from a list (and you don't care about the order of the elements). Here's an example:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    my @a = qw(apple banana peach apple banana apple banana apple peach);
    
    my %h;
    my $i;
    
    # Build a hash to count the numbers of occurrences of the elements:
    for $i (@a) {
        if (exists($h{$i})) {
            $h{$i}++;
        } else {
            $h{$i} = 1;
        }
    }
    
    for $i (sort(keys(%h))) {
        print "$i \t $h{$i}\n";
    }
    
    print "\n";
    
    # The keys of %h are the single elements of @a without multiple occurrences:
    my @b = keys(%h);
    @b = sort(@b);
    for $i (@b) {
        print "$i\n";
    }
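
If the order of the elements does matter, the same idea can be extended a little. This is only a sketch with made-up names: the hash "%count" counts the elements, and the array "@unique" remembers each element the first time it is seen. It also shows that "++" can be used on a hash element that doesn't exist yet:

```perl
#!/usr/bin/perl

use warnings;
use strict;

my @fruits = qw(peach apple banana apple peach apple);

my %count;
my @unique;
for my $fruit (@fruits) {
    # The first time a fruit is seen, it has no count yet:
    if (!$count{$fruit}) {
        push(@unique, $fruit);
    }
    # "++" creates a missing hash element and sets it to 1:
    $count{$fruit}++;
}

print "@unique\n";         # peach apple banana
print "$count{apple}\n";   # 3
```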


    12. Functions

    Functions are written with the keyword "sub", followed by the name of the function and a code block (in curly brackets).
    Unlike in C, the parameters of the function are not written into the function declaration. Instead, every function has a special array variable called "@_" that holds all the arguments passed to the function.
    Inside a function, array operations like "shift()" that are called without an argument work on that array. Therefore it is common practice to extract the arguments as scalar variables from the array "@_" using the "shift()" function at the beginning of the function. So you'd write for example:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    sub printArguments {
       my $first_argument  = shift;
       my $second_argument = shift;
    
       print "$first_argument\n";
       print "$second_argument\n";
    
    }
    
    printArguments("Hello", "World");

    That way, the parameters are written at the beginning of the function, but not in the function declaration itself.

    Passing an array to a function: The easiest way to pass a mix of scalar and array variables to a function is to keep the order of the arguments in mind, cut the scalar variables from the array "@_" using "shift()" and assign the rest of that array to the array parameter:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    sub printArguments {
       my $first_argument  = shift;
       my $second_argument = shift;
       my @an_array        = @_;
    
       print "$first_argument\n";
       print "$second_argument\n";
       my $i;
       for $i (@an_array) {
           print "$i\n";
       }
    
    }
    
    my @arr = qw(A list of more words);
    printArguments("Hello", "World", @arr);

    When you pass several arrays, they get "flattened" into the single array "@_" in the function. You could, for example, also pass the number of elements of each array, so that you can separate the arrays again in the function.
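
A minimal sketch of that idea for two arrays. The function name "printTwoLists" and the data are made up; only the length of the first array has to be passed, because the rest of "@_" then belongs to the second one:

```perl
#!/usr/bin/perl

use warnings;
use strict;

sub printTwoLists {
    my $length_of_first = shift;
    my @all = @_;

    # Cut the first list out of the flattened one; the rest is the second list:
    my @first  = splice(@all, 0, $length_of_first);
    my @second = @all;

    print "First:  @first\n";
    print "Second: @second\n";

    return (scalar(@first), scalar(@second));
}

my @fruits  = qw(apple banana);
my @numbers = (1, 2, 3);
printTwoLists(scalar(@fruits), @fruits, @numbers);
```

Using references (described below) is usually the more convenient solution.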

    Passing a hash to a function: Hashes can also be passed to functions. The first element of "@_" is then interpreted as the first key of the hash, the second element is interpreted as the first value of the hash and so on:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    sub printArguments {
       my %h = @_;
       my $i;
       for $i (sort(keys(%h))) {
           print "$i \t $h{$i}\n";
       }
    }
    
    my %h_out = (a    => 10,
                 b    => "hello",
                 test => "nothing");
    printArguments(%h_out);

    Returning values from a function works in a similar way: You can either return a single scalar value or one list holding several values. A returned hash is also flattened into one list of keys and values.
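
Both cases as a short sketch. The functions "minAndMax()" and "person()" are made up for illustration:

```perl
#!/usr/bin/perl

use warnings;
use strict;

# Returns a list of two values:
sub minAndMax {
    my @sorted = sort { $a <=> $b } @_;
    return ($sorted[0], $sorted[-1]);
}

# Returns a hash, which arrives as a flat list of keys and values:
sub person {
    my %h = (name => "Steve", age => 29);
    return %h;
}

my ($min, $max) = minAndMax(5, 3, 9);
print "$min $max\n";    # 3 9

# The returned list can be assigned to a hash again:
my %p = person();
print "$p{name}\n";     # Steve
```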

    If you want to pass several arguments that are supposed to stay separated, you have to use references.

    Functions have their own sigil "&". Today, you normally don't write that sigil when calling a function.


    13. Using Modules

    Modules are external pieces of code that can be imported into your script. There are thousands of modules on the CPAN for almost any Perl programming task.
    Modules can be downloaded and installed from the CPAN. But Linux distributions also come with many Perl modules, prepared as packages. So if a Perl module you want to use is provided as a distribution package, you should install it from there.
    There are also some core modules that come with Perl itself. "Cwd", for example, is such a module. You can use it to get the pathname of the current working directory. You can get information on modules by executing "perldoc" with the module's name, so you get information on "Cwd" by running:

    perldoc Cwd

    If a module is installed, you can import it by writing "use" in your script, followed by the module's name (and the semicolon at the end of the line). After that, the functions and variables exported by the module are available in your script. For example, this is how "Cwd" can be used:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    use Cwd;
    
    my $cwd = getcwd();
    print "$cwd\n";

    The function "getcwd()" is not built into Perl itself. It is provided by the module.
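
You can also state explicitly which functions you want to import, by passing a list of names to "use". A small sketch with "Cwd" and another core module, "File::Basename"; the path is made up, and the results in the comments are those on Linux:

```perl
#!/usr/bin/perl

use warnings;
use strict;

# Import only the named functions:
use Cwd qw(getcwd);
use File::Basename qw(basename dirname);

my $path = "/home/user/script.pl";

print basename($path) . "\n";   # script.pl
print dirname($path) . "\n";    # /home/user
print getcwd() . "\n";
```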


    14. Regular Expressions (RegEx)

    "Regular expressions" are used to do search and replace operations in strings. Here's a basic example:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    my $a = "Hello Work";
    
    # Searching:
    if ($a =~ /ell/) {
        print "'ell' is in \$a.\n";
    }
    
    # Replacing:
    $a =~ s/Work/World/;
    print "$a\n";

    Programmers used to write an "m" before the first slash of the search operation, just as an "s" is still needed before the first slash of the replace operation. But today, the "m" is mostly omitted and only two slashes are used to express searching.

    When searching, you can use "^" at the beginning of the RegEx to look for the substring only at the beginning of the string. And you can use "$" at the end of the RegEx to look for the substring only at the end of the string.

    In a RegEx, "." is a placeholder for any character. If you really want to search for a dot (and not for "any character"), you have to put a backslash in front of the dot, like "\.".
    There are also quantifiers: "*" means any number of repetitions of the item before it (including none). So ".*" means "any number of arbitrary characters, including none", while ".+" means "any number of arbitrary characters, but at least one".
    Again, if you want to search for a literal star, you have to put a backslash in front of the character, like "\*" (this is a general rule for characters with a special meaning).

    Square brackets represent character classes. For example, "[0-9]" searches for a digit.
    When used as the first character inside square brackets, "^" means "not", so "[^0-9]" searches for a character that is not a digit.
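
The elements described so far, put together in a small sketch. The test string is made up; all four messages are printed for it:

```perl
#!/usr/bin/perl

use warnings;
use strict;

my $s = "Room 101.";

# "^" anchors the search at the beginning of the string:
if ($s =~ /^Room/) {
    print "Starts with 'Room'.\n";
}

# "\." is a literal dot, "$" anchors the search at the end:
if ($s =~ /\.$/) {
    print "Ends with a dot.\n";
}

# A character class with the quantifier "+":
if ($s =~ /[0-9]+/) {
    print "Contains at least one digit.\n";
}

# A negated character class at the beginning of the string:
if ($s =~ /^[^0-9]/) {
    print "Does not start with a digit.\n";
}
```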

    You can also use scalar variables inside regular expressions, for example "if ($a =~ /$b/) { ... }". But when you write it like this, the string inside $b is also interpreted as a regular expression. If you just want to search for the literal string inside $b, you have to put the variable between "\Q" and "\E", like this:

    if ($a =~ /\Q$b\E/) { ... }

    When searching with quantifiers like "*" and "+", regular expressions are "greedy": they match as many characters as possible. Putting a "?" behind the quantifier (as in ".*?") makes it match as few characters as possible instead.
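
What "greedy" means in practice can be seen in this sketch with a made-up string. A "?" behind the quantifier switches the greedy behaviour off:

```perl
#!/usr/bin/perl

use warnings;
use strict;

my $text = "<b>bold</b> and <i>italic</i>";

# Greedy: ".*" matches as much as possible, up to the LAST ">":
my $greedy = $text;
$greedy =~ s/<.*>/[TAG]/;
print "$greedy\n";       # [TAG]

# Non-greedy: ".*?" matches as little as possible, up to the FIRST ">":
my $nongreedy = $text;
$nongreedy =~ s/<.*?>/[TAG]/;
print "$nongreedy\n";    # [TAG]bold</b> and <i>italic</i>
```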

    The replace operation shown above only replaces the first occurrence of the RegEx. If you put a "g" behind the last slash (this is called using a "flag"), every occurrence is replaced.
    By default, Perl's search and replace operations are case-sensitive. You can use the "i" flag to turn case sensitivity off.
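
The effect of the two flags, as a short sketch with a made-up sentence:

```perl
#!/usr/bin/perl

use warnings;
use strict;

my $text = "Perl is fun. perl is old. PERL is fast.";

# Without flags: only the first match, and only in exactly this spelling:
my $first_only = $text;
$first_only =~ s/perl/Camel/;
print "$first_only\n";   # Perl is fun. Camel is old. PERL is fast.

# With "g" and "i": every match, regardless of case:
my $all = $text;
$all =~ s/perl/Camel/gi;
print "$all\n";          # Camel is fun. Camel is old. Camel is fast.
```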

    Sometimes you want to extract the string a RegEx has found. This can be done by putting the part of the RegEx you want to store in round brackets. When the search (or replace) operation is successful, the matched parts are stored in special variables called $1, $2 and so on. From there, you can then fetch the substring:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    my $a = "Hello World";
    
    my $b;
    
    if ($a =~ /(o Wo)/) {
        $b = $1;
        print "$b\n";
    }
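
With several pairs of round brackets, the parts are numbered from left to right. A sketch with a made-up date string; it also shows that a search in list context returns the captured parts directly:

```perl
#!/usr/bin/perl

use warnings;
use strict;

my $date = "2021-05-20";

if ($date =~ /^([0-9]+)-([0-9]+)-([0-9]+)$/) {
    print "Year: $1, month: $2, day: $3\n";   # Year: 2021, month: 05, day: 20
}

# Assigning the result of the search to a list of variables:
my ($year, $month, $day) = $date =~ /^([0-9]+)-([0-9]+)-([0-9]+)$/;
print "$day.$month.$year\n";                  # 20.05.2021
```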

    There is much more extensive documentation about regular expressions, beginning with:

    perldoc perlretut

    So I only explain the most important rules here. Although the RegEx language is quite powerful, with that many slashes, backslashes and different types of brackets, RegEx code can become very cryptic and hard to read. So I use it only when I have to.


    15. File-Operations

    This script demonstrates some often-used file operations. It writes a small file called "myfile.txt" into the current working directory, reads it back and deletes it again. If a file of that name already exists, the script exits without touching it:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    use Cwd;
    
    # Get the current working directory:
    my $PATH = getcwd();
    my $fh;
    
    my $fname = "$PATH/myfile.txt";
    
    # Test, if file exists:
    if(-e $fname) {
        print "File already exists.\n";
        exit(1);
    }
    
    print "Writing to file.\n";
    open($fh, ">", $fname) or die $!;
    
    print $fh "This is a line of text.\n";
    print $fh "This is another line.\n";
    
    close($fh);
    
    print "Reading from file:\n";
    open($fh, "<", $fname) or die $!;
    
    # In list context, this reads all lines of the file at once:
    my @a = <$fh>;
    
    close($fh);
    my $i;
    for $i (@a) {
        # The lines still end with "\n", so none is added here:
        print $i;
    }
    
    print "\nDeleting the file.\n\n";
    unlink $fname;
    
    print "The directory contains:\n";
    
    # "<*>" is a short form of "glob('*')" and returns the (non-hidden) directory entries:
    my @b = <*>;
    for $i (@b) {
        print "$i\n";
    }
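
For larger files, it is common to read line by line in a while-loop instead of reading everything into an array. A minimal sketch; to avoid touching existing files, it assumes the core module "File::Temp" and lets its function "tempfile()" create a temporary file:

```perl
#!/usr/bin/perl

use warnings;
use strict;

use File::Temp qw(tempfile);

# tempfile() creates a new, empty file and returns a handle and its name:
my ($out, $fname) = tempfile();
print $out "first line\n";
print $out "second line\n";
close($out);

open(my $fh, "<", $fname) or die $!;

# In scalar context, the handle returns one line per loop run:
my $number = 0;
while (my $line = <$fh>) {
    chomp($line);
    $number++;
    print "$number: $line\n";
}
close($fh);

unlink($fname);
```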


    16. Going Further: References

    References are relatively complicated. A beginner can probably do without them. So this chapter may also be skipped until it is needed.

    With Perl 5.0 in 1994, more advanced constructs like references and objects were added to the language.

    A reference is a scalar variable that points to another variable. That other variable can be another scalar, an array, a hash, a larger data structure or an object.
    You can take a reference to an array by writing a backslash in front of the sigil, like this:

    my $aref = \@a;

    In a similar way you can take references to hashes ("\%h") or to other scalar variables ("\$a").

    These references can then be passed to functions, keeping larger variables separated in the function array "@_".

    To go back from a reference to the related variable, the reference has to be "dereferenced". This is done by writing the sigil of the related variable first, and then in curly brackets the name of the reference. Like this:

    my @a = @{$aref};

    Yes, that can become quite ugly and complicated. So, here's an example:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    sub printArguments {
       my $scalar_var      = shift;
       my $array_ref       = shift;
    
       # Dereferencing the reference:
       my @an_array = @{$array_ref};
    
       print "$scalar_var\n";
       my $i;
       for $i (@an_array) {
           print "$i\n";
       }
    
    }
    
    my @a = qw(apple banana peach);
    my $aref = \@a;
    printArguments("Hello", $aref);

    When you have a reference to an array (or to a hash), you can also access single elements of that array (or hash) directly through the reference, without copying the whole array (or hash) first. This is done by using the "->" operator. Another example:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    my $i;
    
    # An array:
    my @a = qw(one two three);
    
    # A reference to that array:
    my $aref = \@a;
    
    # Dereferencing $aref:
    my @b = @{ $aref };
    
    # Output of @b:
    for $i (@b) {
        print "$i\n";
    }
    print "\n";
    
    # Getting the number of elements via the reference:
    print $#{$aref} + 1;
    print "\n";
    
    # Elements can be accessed directly via the reference:
    print $aref->[0] . "\n"; 
    print $aref->[1] . "\n"; 
    print $aref->[2] . "\n"; 
    print "\n";
    
    # This is also an often used technique:
    # Managing an array, using a reference to an anonymous array:
    my $aref2 = ["four", "five", "six"];
    print $aref2->[0] . "\n"; 
    print $aref2->[1] . "\n"; 
    print $aref2->[2] . "\n"; 
    print $#{$aref2} + 1;
    print "\n\n";
    
    # Same for hashes: A hash:
    my %h = (first  => "one",
             second => "two",
             third  => "three");
    
    # A reference to that hash:
    my $href = \%h;
    
    # Dereferencing $href:
    my %b = %{ $href };
    
    # Output of %b:
    for $i (keys(%b)) {
        print "$i\t$b{$i}\n";
    }
    print "\n";
    
    # Elements can be accessed directly via the reference:
    print $href->{first} . "\n"; 
    print $href->{second} . "\n"; 
    print $href->{third} . "\n"; 
    print "\n";
    
    # Managing a hash, using a reference to an anonymous hash:
    my $href2 = { fourth => "four",
                  fifth  => "five",
                  sixth  => "six" };
    
    print $href2->{fourth} . "\n"; 
    print $href2->{fifth} . "\n"; 
    print $href2->{sixth} . "\n";

    As references are convenient to pass larger data structures to functions, modules often expect them as arguments to their functions. It's not difficult to use these functions though. If you want to pass a reference to an array instead of the array itself to a function, you just write for example

    dothis(\@a);

    instead of

    dothis(@a);

    The function "ref()" takes a reference as an argument and returns a string like "SCALAR", "ARRAY" or "HASH", depending on the type of variable the reference points to.
    You could also call "ref(\@a)" to find out whether "@a" is an array, but you probably know already, because the sigil shows it.
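
A short sketch of "ref()" with made-up variables. For a value that is not a reference, "ref()" returns an empty string:

```perl
#!/usr/bin/perl

use warnings;
use strict;

my @list = (1, 2);
my %hash = (a => 1);
my $text = "hello";

print ref(\@list) . "\n";   # ARRAY
print ref(\%hash) . "\n";   # HASH
print ref(\$text) . "\n";   # SCALAR
print ref([1, 2]) . "\n";   # ARRAY

if (ref($text) eq "") {
    print "\$text is not a reference.\n";
}
```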

    It is also possible to take a reference to a function. As mentioned above, functions have the sigil "&", so a reference to a function "wanted()" would be "\&wanted". The function "find()" of the module "File::Find", for example, expects such a reference to a function (called "wanted" in its documentation) as an argument. You can read more about this in "perldoc File::Find".
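
How a function can use such a reference, as a minimal sketch. The functions "applyToAll()" and "double()" are made up for illustration; the referenced function is called with "->()":

```perl
#!/usr/bin/perl

use warnings;
use strict;

sub applyToAll {
    my $function_ref = shift;
    my @results;
    for my $element (@_) {
        # Call the referenced function with one argument:
        push(@results, $function_ref->($element));
    }
    return @results;
}

sub double {
    my $n = shift;
    return 2 * $n;
}

my @doubled = applyToAll(\&double, 1, 2, 3);
print "@doubled\n";   # 2 4 6
```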


    17. Anonymous Arrays and Hashes. Arrays of Arrays (AoA), Hashes of Hashes (HoH)

    The key to understanding these larger data structures is references. So please read the part about them first.

    If you have an array "@a", then "@a" is the name of the array.
    Suppose you had an array without such a name. It would just look something like this: ("a", "b", "c"). Such arrays exist in Perl; you just have to write them in square brackets: ["a", "b", "c"].
    How can it be accessed? Well, with a reference pointing to it:

    my $aref = ["a", "b", "c"];
    print $aref->[1] . "\n";

    What is that good for? You can hold several of these anonymous arrays inside another array. That actually gives you multidimensional arrays (which you can't have otherwise in Perl).

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    # A multidimensional array ("Array of Arrays"):
    my @a = (["First_one",  "First_two"],
             ["Second_one", "Second_two"]);
    
    # Accessing single elements:
    print $a[0][1] . "\n";
    print $a[1][1] . "\n";
    
    print "\n";
    
    # If you want to loop through all elements, you have to
    # dereference the inner arrays:
    my ($i, $u);
    my @inner_array;
    for $i (@a) {
        @inner_array = @{ $i };
        for $u (@inner_array) {
            print "$u\n";
        }
        # Or without the extra variable, like this:
        for $u (@{ $i }) {
            print "$u\n";
        }
    }
        
    print "\n";
    
    # A multidimensional hash ("Hash of Hashes"):
    my %persons = ( nr1 => {name => "Bob",
                            age  => 21},
                    nr2 => {name => "Richard",
                            age  => 30} );
    
    print $persons{nr2}{name} . "\n";
    print "\n";
    
    # Output of the whole hash:
    for $i (keys(%persons)) {
        for $u (keys(%{$persons{$i}})) {
            print $persons{$i}{$u} . "\n";
        }
        print "\n";
    } 
    
    # As you see, this can get rather complicated and ugly quickly. 
    # To get an output of such data-structures in a more convenient way,
    # there's the module "Data::Dumper":
    
    use Data::Dumper;
    # A reference is passed, so the hash is shown as one structure:
    print Dumper(\%persons);
    
    # Anonymous Hashes and Arrays can also be mixed (in every possible way,
    # actually):
    
    my %items = ( Bob     => ["House", "Car"],
                  Richard => ["Hut", "Bicycle"] );
    
    for $i (@{$items{Richard}}) {
        print "$i\n";
    }
    print "\n";
    print Dumper(\%items);

    Maybe you can see that it's not a big step anymore from these larger data structures to classes and objects.


    18. Sorting "Lists of Lists" and Objects

    Now we can add two more sorting methods used for larger data structures:

    4. Sorting "lists of lists" by an element inside one of the lists:

    my @l = ([8, 2], [5, 1]);
    @l = sort {$a->[1] <=> $b->[1]} @l;
    
    # Output:
    my @inside_list;
    for my $i (@l) {
       @inside_list = @{$i};
       for my $u (@inside_list) {
           print "$u\t";
       }   
       print "\n";
    }

    This sorts by the element with index 1 (the second element) of each inner list, that is "2" and "1" here.
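
If several inner lists have the same value at that position, a second comparison can be added with "or". It is only used when the first one finds the two values equal. A sketch with made-up numbers:

```perl
#!/usr/bin/perl

use warnings;
use strict;

my @l = ([8, 2], [5, 1], [3, 2]);

# Sort by the element with index 1 first, then by the element with index 0:
@l = sort { $a->[1] <=> $b->[1] or $a->[0] <=> $b->[0] } @l;

for my $i (@l) {
    print "$i->[0]\t$i->[1]\n";
}
```

This prints the pairs in the order (5, 1), (3, 2), (8, 2).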

    5. Sorting lists of objects by an attribute inside the objects:

    #!/usr/bin/perl
    
    use warnings;
    use strict;
    
    package ListElement {
    
        sub new {
            my $classname = shift;
            my $self = {attribute => shift};
            return bless($self, $classname);
        }
    }
    
    sub printList {
        my @l = @_;
        for my $i (@l) {
            print $i->{attribute} . "\n";
        }
        print "\n";
    }
    
    my @l = ();
    my @numbers = qw(3 2 5 7 1 8);
    my $i;
    my $le;
    for $i (@numbers) {
        $le = ListElement->new($i);
        push(@l, $le);
    } 
    printList(@l);
    
    @l = sort {$a->{attribute} <=> $b->{attribute}} @l;
    
    printList(@l);


    19. Object-Oriented Programming

    Object-Oriented Programming in Perl is a topic of its own.
    You can read about it on the next page of my little Perl series "Perl Page #2: Object-Oriented Programming in Perl Made Easy".



    Email: hlubenow2 {at-symbol} gmx.net
