The universal debugging tool across nearly all platforms and programming languages is printf( ) (or an equivalent output function). This function can send data to the console, a file, an application window, and so on. In Perl we generally use the print( ) function. With an idea of where and when the bug is triggered, a developer can insert print( ) statements into the source code to examine the value of data at certain stages of execution.

However, it is rather difficult to anticipate all the possible directions a program might take and what data might cause trouble. In addition, inline debugging code tends to add bloat and degrade the performance of an application and can also make the code harder to read and maintain. Furthermore, you have to comment out or remove the debugging print( ) calls when you think that you have solved the problem, and if later you discover that you need to debug the same code again, you need at best to uncomment the debugging code lines or, at worst, to write them again from scratch.

The constant pragma helps here. By using constants, you can leave some debug statements in production code without adding extra processing overhead. For example, while developing the code, you can define a constant DEBUG whose value is 1:

package Foo;
use constant DEBUG => 1;
...
warn "entering foo" if DEBUG;
...

The warning will be printed, since DEBUG returns true. In production you just have to turn off the constant:

use constant DEBUG => 0;

When the code is compiled with a false DEBUG value, all those statements that are to be executed if DEBUG has a true value will be removed on the fly at compile time, as if they never existed. This allows you to keep some of the important debug statements in the code without any adverse impact on performance.
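
One way to avoid editing the source when switching between development and production is to derive the constant from the environment. The following is only a sketch: MYAPP_DEBUG is a hypothetical variable name that does not appear in the text above, and the approach assumes the environment is already set when the code is compiled.

```perl
use strict;
use warnings;

# MYAPP_DEBUG is a hypothetical environment variable used for illustration.
# "use constant" is evaluated at compile time, so %ENV is read before the
# rest of the file is compiled.
use constant DEBUG => $ENV{MYAPP_DEBUG} ? 1 : 0;

sub foo {
    warn "entering foo\n" if DEBUG;   # compiled out when DEBUG is 0
    return "foo done";
}

print foo(), "\n";
```

Because the value is fixed when the file is compiled, changing the variable later in the same process has no effect on the statements that were already removed.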

But what if you have many different debug categories and you want to be able to turn them on and off as you need them? In this case, you need to define a constant for each category. For example:

use constant DEBUG_TEMPLATE => 1;
use constant DEBUG_SESSION  => 0;
use constant DEBUG_REQUEST  => 0;

Now if in your code you have these three debug statements:

warn "template" if DEBUG_TEMPLATE;
warn "session"  if DEBUG_SESSION;
warn "request"  if DEBUG_REQUEST;

only the first one will be executed, as it's the only one that has a condition that evaluates to true.
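
If editing three constants by hand becomes tedious, the categories could be switched from a single setting. The sketch below assumes a hypothetical environment variable MYAPP_DEBUG holding a comma-separated list such as template,request; neither the variable nor the %on hash comes from the text above.

```perl
use strict;
use warnings;

# MYAPP_DEBUG is a hypothetical variable, e.g. MYAPP_DEBUG=template,request
my %on;
BEGIN {
    # Fill %on at compile time so the constants below can read it.
    %on = map { $_ => 1 } split /,/, ($ENV{MYAPP_DEBUG} || '');
}

use constant DEBUG_TEMPLATE => $on{template} ? 1 : 0;
use constant DEBUG_SESSION  => $on{session}  ? 1 : 0;
use constant DEBUG_REQUEST  => $on{request}  ? 1 : 0;

warn "template\n" if DEBUG_TEMPLATE;
warn "session\n"  if DEBUG_SESSION;
warn "request\n"  if DEBUG_REQUEST;
```

Each category is still a true compile-time constant, so the disabled statements cost nothing at runtime.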

Let's look at a few examples where we use print( ) to debug some problem.

In one of our applications, we wrote a function that returns a date from one week ago. This function (including the code that calls it) is shown in Example 21-4.

Example 21-4. date_week_ago.pl

print "Content-type: text/plain\n\n";
print "A week ago the date was ",date_a_week_ago( ),"\n";

# return a date one week ago as a string in format: MM/DD/YYYY
sub date_a_week_ago {

    my @month_len = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
    my($day, $month, $year) = (localtime)[3..5];

    for (my $j = 0; $j < 7; $j++) {

        $day--;
        if ($day == 0) {

            $month--;
            if ($month == 0) {
                $year--;
                $month = 12;
            }

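            # there are 29 days in February in a leap year
            # ($year counts from 1900, so test the full four-digit year)
            my $full_year = $year + 1900;
            $month_len[1] =
                ($full_year % 400 == 0
                    or ($full_year % 4 == 0 and $full_year % 100))
                    ? 29 : 28;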

            # set $day to be the last day of the previous month 
            $day = $month_len[$month - 1]; 
        }
    }

    return sprintf "%02d/%02d/%04d", $month, $day, $year+1900;
}

This code is pretty straightforward. We get today's date and subtract 1 from the value of the day we get, updating the month and the year on the way if boundaries are being crossed (end of month, end of year). If we do it seven times in a loop, at the end we should get a date from a week ago.

Note that since localtime( ) returns the year as a value of current_year-1900 (which means that we don't have a century boundary to worry about), if we are in the middle of the first week of the year 2000, the value of $year returned by localtime( ) will be 100 and not 0, as one might mistakenly assume. So when the code does $year-- it becomes 99, not -1. At the end, we add 1900 to get back the correct four-digit year format. (If you plan to work with years before 1900, add 1900 to $year before the for loop.)

Also note that we have to account for leap years, where there are 29 days in February. For the other months, we have prepared an array containing the month lengths. A specific year is a leap year if it is either evenly divisible by 400 or evenly divisible by 4 and not evenly divisible by 100. For example, the year 1900 was not a leap year, but the year 2000 was a leap year. Logically written:

# $year is the full four-digit year here
print( ($year % 400 == 0 or ($year % 4 == 0 and $year % 100))
       ? 'Leap' : 'Not Leap' );
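
The same rule can be wrapped in a small helper and checked against a few years, including the two mentioned above. This is a sketch only: is_leap_year is a hypothetical name, and it expects a full four-digit year rather than the 1900-based value that localtime( ) returns.

```perl
use strict;
use warnings;

# is_leap_year is a hypothetical helper; pass a full four-digit year.
sub is_leap_year {
    my $year = shift;
    my $leap = ($year % 400 == 0 or ($year % 4 == 0 and $year % 100 != 0));
    return $leap ? 1 : 0;
}

printf "%d: %s\n", $_, is_leap_year($_) ? 'Leap' : 'Not Leap'
    for 1900, 2000, 2023, 2024;
```

This prints "Not Leap" for 1900 and 2023, and "Leap" for 2000 and 2024.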

Now when we run the script and check the result, we see that something is wrong. For example, if today is 10/23/1999, we expect the above code to print 10/16/1999. In fact, it prints 09/16/1999, which means that we have lost a month. The above code is buggy!

Let's put a few debug print( ) statements in the code, near the $month variable:

sub date_a_week_ago {

    my @month_len = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
    my($day, $month, $year) = (localtime)[3..5];
    print "[set] month : $month\n"; # DEBUG

    for (my $j = 0; $j < 7; $j++) {

        $day--;
        if ($day == 0) {

            $month--;
            if ($month == 0) {
                $year--;
                $month = 12;
            }
            print "[loop $j] month : $month\n"; # DEBUG

            # there are 29 days in February in a leap year
            # ($year counts from 1900, so test the full four-digit year)
            my $full_year = $year + 1900;
            $month_len[1] =
                ($full_year % 400 == 0
                    or ($full_year % 4 == 0 and $full_year % 100))
                    ? 29 : 28;

            # set $day to be the last day of the previous month
            $day = $month_len[$month - 1];
        }
    }

    return sprintf "%02d/%02d/%04d", $month, $day, $year+1900;
}

When we run it we see:

[set] month : 9

This is supposed to be the number of the current month (10). We have spotted a bug, since the only code that sets the $month variable consists of a call to localtime( ). So did we find a bug in Perl? Let's look at the manpage of the localtime( ) function:

panic% perldoc -f localtime

Converts a time as returned by the time function to a 9-element array with the time 
analyzed for the local time zone.  Typically used as follows:

  #  0    1    2     3     4    5     6     7     8
  ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);

All array elements are numeric, and come straight out of a struct tm.  In particular 
this means that $mon has the range 0..11 and $wday has the range 0..6 with Sunday as 
day 0.  Also, $year is the number of years since 1900, that is, $year is 123 in year 
2023, and not simply the last two digits of the year.  If you assume it is, then you 
create non-Y2K-compliant programs--and you wouldn't want to do that, would you?
[more info snipped]

This reveals that if we want to count months from 1 to 12 and not 0 to 11 we are supposed to increment the value of $month. Among other interesting facts about localtime( ), we also see an explanation of $year, which, as we've mentioned before, is set to the number of years since 1900.

We have found the bug in our code and learned new things about localtime( ). To correct the above code, we just increment the month after we call localtime( ):

my($day, $month, $year) = (localtime)[3..5];
$month++;
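
To check the fix without waiting for a particular calendar date, the function can be given an optional time argument. The version below is a sketch under that assumption: the $time parameter and the use of Time::Local are additions for testing and are not part of Example 21-4. It also converts $year to the full four-digit year up front, so that the leap-year test sees the real year.

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);

# Return the date one week before $time (default: now) as MM/DD/YYYY.
# The optional $time argument is an addition made for testing.
sub date_a_week_ago {
    my $time = @_ ? shift : time;
    my @month_len = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
    my ($day, $month, $year) = (localtime($time))[3..5];
    $month++;          # localtime() months run from 0 to 11
    $year += 1900;     # work with the full four-digit year

    for (1 .. 7) {
        $day--;
        if ($day == 0) {
            $month--;
            if ($month == 0) {
                $year--;
                $month = 12;
            }
            # there are 29 days in February in a leap year
            $month_len[1] =
                ($year % 400 == 0 or ($year % 4 == 0 and $year % 100))
                    ? 29 : 28;
            # set $day to be the last day of the previous month
            $day = $month_len[$month - 1];
        }
    }
    return sprintf "%02d/%02d/%04d", $month, $day, $year;
}

# Noon on October 23, 1999 (timelocal() months are 0-based as well)
print date_a_week_ago(timelocal(0, 0, 12, 23, 9, 1999)), "\n";   # 10/16/1999
```

With the month corrected, the date from the example above now comes out as 10/16/1999, and dates that cross a month or year boundary can be checked the same way.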

Other places where programmers often make mistakes are conditionals and loop statements. For example, will the block in this loop:

my $c = 0;
for (my $i=0; $i <= 3; $i++) {
    $c += $i;
}

be executed three or four times?

If we plant the print( ) debug statement:

my $c = 0;
for (my $i=0; $i <= 3; $i++) {
    $c += $i;
    print $i+1,"\n";
}

and execute it:

1
2
3
4

we see that it gets executed four times. We could have figured this out by inspecting the code, but what happens if instead of 3, there is a variable whose value is known only at runtime? Using debugging print( ) statements helps to determine whether to use < or <= to get the boundary condition right.

Using idiomatic Perl makes things much easier:

panic% perl -le 'my $c=0; $c += $_, print $_+1 for 0..3;'

Here you can plainly see that the loop is executed four times.
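
When the limit is known only at runtime, counting the iterations for both comparison operators makes the boundary behavior visible. The following sketch uses a hypothetical helper, count_iterations, and takes the limit from the command line, falling back to 3 as in the loop above.

```perl
use strict;
use warnings;

# count_iterations is a hypothetical helper: it reports how many times
# the loop body runs with "<" ($inclusive false) or "<=" ($inclusive true).
sub count_iterations {
    my ($limit, $inclusive) = @_;
    my $count = 0;
    for (my $i = 0; $inclusive ? $i <= $limit : $i < $limit; $i++) {
        $count++;
    }
    return $count;
}

my $limit = @ARGV ? shift @ARGV : 3;
print "<  $limit: ", count_iterations($limit, 0), " iterations\n";
print "<= $limit: ", count_iterations($limit, 1), " iterations\n";
```

With the default limit of 3, the first line reports three iterations and the second reports four.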

The same goes for conditional statements. For example, assuming that $a and $b are integers, what value does this statement assign to $c?

$c = $a > $b and $a < $b ? 1 : 0;

One might think that $c is always set to zero, since:

$a > $b and $a < $b

is a false statement no matter what the values of $a and $b are. But $c is not set to zero: it's set to 1 (a true value) if $a > $b; otherwise, it's set to the empty string (a false value). The reason for this behavior lies in operator precedence. The operator and (AND) has lower precedence than the operator = (ASSIGN); therefore, Perl sees the statement like this:

($c = ($a > $b) ) and ( $a < $b ? 1 : 0 );

which is the same as:

if ($c = $a > $b) {
    $a < $b ? 1 : 0;
}

So the value assigned to $c is the result of the logical expression:

$a > $b

Adding some debug printing will reveal this problem. The solutions are, of course, either to use parentheses to explicitly express what we want:

$c = ($a > $b and $a < $b) ? 1 : 0;

or to use a higher-precedence AND operator:

$c = $a > $b && $a < $b ? 1 : 0;

Now $c is always set to 0 (as presumably we intended).[51]
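
The three forms can be compared side by side with a little debug printing. This is a sketch, not part of the original example: the helper names are hypothetical, and $x and $y stand in for $a and $b so that the code runs cleanly under strict. The void warning is switched off because the first helper is deliberately written the wrong way.

```perl
use strict;
use warnings;
no warnings 'void';   # the low-precedence form below is intentionally buggy

# Hypothetical helpers, one per form of the statement.
sub with_low_and { my ($x, $y) = @_; my $c; $c = $x > $y and $x < $y ? 1 : 0; return $c }
sub with_parens  { my ($x, $y) = @_; my $c = ($x > $y and $x < $y) ? 1 : 0;   return $c }
sub with_andand  { my ($x, $y) = @_; my $c = $x > $y && $x < $y ? 1 : 0;      return $c }

for my $pair ([5, 3], [3, 5]) {
    my ($x, $y) = @$pair;
    printf "x=%d y=%d  and: '%s'  parens: '%s'  &&: '%s'\n",
        $x, $y, with_low_and($x, $y), with_parens($x, $y), with_andand($x, $y);
}
```

For the pair (5, 3) the first column shows '1', and for (3, 5) it shows an empty string, while the other two columns always show '0'.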

[51] For more traps, refer to the perltrap manpage.