As we saw earlier, it's almost always possible to debug code with the help of print( ). However, it is impossible to anticipate all the possible paths of execution through a program, and difficult to know what code to suspect when trouble occurs. In addition, inline debugging code tends to add bloat and degrade the performance of an application, although most applications offer inline debugging as a compile-time option to avoid these performance hits. In any case, this information tends to be useful only to the programmer who added the print( )statements in the first place.

Sometimes you must debug tens of thousands of lines of Perl in an application, and while you may be a very experienced Perl programmer who can understand Perl code quite well just by looking at it, no mere mortal can even begin to understand what will actually happen in such a large application until the code is running. So to begin with you just don't know where to add your trusty print( )statements to see what is happening inside.

The most effective way to track down a bug is often to run the program inside an interactive debugger. Most programming languages have such tools available, allowing programmers to see what is happening inside an application while it is running. The basic features of any interactive debugger allow you to:

It takes practice to learn the most effective ways of using an interactive debugger, but the time and effort will be paid back many times in the long run.

Perl comes with an interactive debugger calledperldb. Giving control of your Perl program to the interactive debugger is simply a matter of specifying the -d command-line switch. When this switch is used, Perl inserts debugging hooks into the program syntax tree, but it leaves the job of debugging to a Perl module separate from the Perl binary itself.

We will start by reviewing a few of the basic concepts and commands provided by Perl's interactive debugger. These examples are all run from the command line, independent of mod_perl, but they will still be relevant when we work within Apache.

It might be useful to keep the perldebug manpage handy for reference while reading this section, and for future debugging sessions on your own.

The interactive debugger will attach to the current terminal and present you with a prompt just before the first program statement is executed. For example:

panic% perl -d -le 'print "mod_perl rules the world"'

Loading DB routines from perl5db.pl version 1.0402

Emacs support available.

Enter h or `h h' for help.

main::(-e:1):   print "mod_perl rules the world"
  DB<1>

The source line shown is the line that Perl is about to execute. To single step—i.e., execute one line at a time—use the next command (or just n). Each time you enter something in the debugger, you must finish by pressing the Return key. This will cause the line to be executed, after which execution will stop and the next line to be executed (if any) will be displayed:

main::(-e:1):   print "mod_perl rules the world"
  DB<1> n
mod_perl rules the world
Debugged program terminated.  Use q to quit or R to restart,
use O inhibit_exit to avoid stopping after program termination,
h q, h R or h O to get additional info.
DB<1>

In this case, our example code is only one line long, so we have finished interacting after the first line of code is executed. Let's try again with a slightly longer example:

my $word = 'mod_perl';
my @array = qw(rules the world);

print "$word @array\n";

Save the script in a file called domination.pl and run it with the -d switch:

panic% perl -d domination.pl

main::(domination.pl:1):      my $word = 'mod_perl';
  DB<1> n
main::(domination.pl:2):      my @array = qw(rules the world);
  DB<1>

At this point, the first line of code has been executed and the variable $word has been assigned the value mod_perl. We can check this by using the p (print) command:

main::(domination.pl:2):      my @array = qw(rules the world);
  DB<1> p $word
mod_perl

The print command is similar to Perl's built-in print( ) function, but it adds a trailing newline and outputs to the $DB::OUT file handle, which is normally opened on the terminal from which Perl was launched. Let's continue:

  DB<2> n
main::(domination.pl:4):      print "$word @array\n";
  DB<2> p @array
rulestheworld
  DB<3> n
mod_perl rules the world
Debugged program terminated.  Use q to quit or R to restart,
use O inhibit_exit to avoid stopping after program termination,
h q, h R or h O to get additional info.

Unfortunately, p @array printed rulestheworld and not rules the world, as we would prefer, but that's absolutely correct. If you print an array without expanding it first into a string it will be printed without adding the content of the $" variable (otherwise known as $LIST_SEPARATOR, if the English pragma is being used) between the elements of the array.

If you type:

print "@array";

the output will be rules the world, since the default value of the $" variable is a single space.

You should have noticed by now that there is some valuable information to the left of each executable statement:

main::(domination.pl:4):      print "$word @array\n";
  DB<2>

First is the current package name (in this case, main::). Next is the current filename and statement line number (domination.pl and 4, in this example). The number presented at the prompt is the command number, which can be used to recall commands from the session history, using the ! command followed by this number. For example, !1 would repeat the first command:

panic% perl -d -e0

main::(-e:1):   0
  DB<1> p $]
5.006001
  DB<2> !1
p $]5.006001
  DB<3>

where $] is Perl's version number. As you can see, !1 prints the value of $], prepended by the command that was executed.

Notice that the code given to Perl to debug (with -e) was 0—i.e., a statement that does nothing. To use Perl as a calculator, and to experiment with Perl expressions, it is common to enter perl -de0, and then type in expressions and p (print) their results.

Things start to get more interesting as the code gets more interesting. In the script in Example 21-5, we've increased the number of source files and packages by including the standard Symbol module, along with an invocation of its gensym( ) function.

Example 21-5. test_sym.pl

use Symbol ( );

my $sym = Symbol::gensym( );

print "$sym\n";

Now let's debug it:

panic% perl -d test_sym.pl

main::(test_sym.pl:3):      my $sym = Symbol::gensym( );
  DB<1> n
main::(test_sym.pl:5):      print "$sym\n";
  DB<1> n
GLOB(0x80c7a44)

Note that the debugger did not stop at the first line of the file. This is because use ... is a compile-time statement, not a runtime statement. Also notice there was more work going on than the debugger revealed. That's because the next command does not enter subroutine calls, it steps over. To step intosubroutine code, use the step command (or its abbreviated form, s):

panic% perl -d test_sym.pl

main::(test_sym.pl:3):      my $sym = Symbol::gensym( );
  DB<1> s
Symbol::gensym(/usr/lib/perl5/5.6.1/Symbol.pm:86):
86:         my $name = "GEN" . $genseq++;
  DB<1>

Notice the source line information has changed to the Symbol::gensym package and the Symbol.pm file. We can carry on by hitting the Return key at each prompt, which causes the debugger to repeat the last step or next command. It won't repeat a print command, though. The debugger will eventually return from the subroutine back to our main program:

  DB<1> 
Symbol::gensym(/usr/lib/perl5/5.6.1/Symbol.pm:87):
87:         my $ref = *{$genpkg . $name};
  DB<1> 
Symbol::gensym(/usr/lib/perl5/5.6.1/Symbol.pm:88):
88:         delete $$genpkg{$name};
  DB<1> 
Symbol::gensym(/usr/lib/perl5/5.6.1/Symbol.pm:89):
89:         $ref;
  DB<1> 
main::(test_sym.pl:5):      print "$sym\n";
  DB<1> 
GLOB(0x80c7a44)

Our line-by-line debugging approach has served us well for this small program, but imagine the time it would take to step through a large application at the same pace. There are several ways to speed up a debugging session, one of which is known as setting a breakpoint.

The breakpoint command (b) is used to tell the debugger to stop at a named subroutine or at any line of any file. In this example session, at the first debugger prompt we will set a breakpoint at the Symbol::gensym subroutine, telling the debugger to stop at the first line of this routine when it is called. Rather than moving along with next or step, we give the continue command (c), which tells the debugger to execute the script without stopping until it reaches a breakpoint:

panic% perl -d test_sym.pl

main::(test_sym.pl:3):      my $sym = Symbol::gensym( );
  DB<1> b Symbol::gensym
  DB<2> c
Symbol::gensym(/usr/lib/perl5/5.6.1/Symbol.pm:86):
86:         my $name = "GEN" . $genseq++;

Now let's imagine we are debugging a large application where Symbol::gensym might be called from any one of several places. When the subroutine breakpoint is reached, by default the debugger does not reveal where it was called from. One way to find out this information is with the stack Trace command (T):

  DB<2> T
$ = Symbol::gensym( ) called from file `test_sym.pl' line 3

In this example, the call stack is only one level deep, so only that call is printed. We'll look at an example with a deeper stack later. The leftmost character reveals the context in which the subroutine was called. $ represents scalar context; in other examples you may see @, which represents list context, or ., which represents void context. In our case we called:

my $sym = Symbol::gensym( );

which calls the Symbol::gensym( ) in scalar context.

Now let's make our test_sym.pl example a little more complex. First, we add a Book::World1 package declaration at the top of the script, so we are no longer working in the main:: package. Next, we add a subroutine named do_work( ), which invokes the familiar Symbol::gensym, along with another function called Symbol::qualify, and then returns a hash reference of the results. The do_work( ) routine is invoked inside a for loop, which will be run twice. The new version of the script is shown in Example 21-6.

Example 21-6. test_sym2.pl

package Book::World2;

use Symbol ( );

for (1, 2) {
    do_work("now");
}

sub do_work {
    my($var) = @_;

    return undef unless $var;

    my $sym  = Symbol::gensym( );
    my $qvar = Symbol::qualify($var);

    my $retval = {
        sym => $sym,
        var => $qvar,
    };

    return $retval;
}
1;

We'll start by setting a few breakpoints, then we'll use the List command (L) to display them:

panic% perl -d test_sym2.pl

Book::World2::(test_sym2.pl:5):   for (1, 2) {
  DB<1> b Symbol::qualify
  DB<2> b Symbol::gensym
  DB<3> L
/usr/lib/perl5/5.6.1/Symbol.pm:
 86:        my $name = "GEN" . $genseq++;
   break if (1)
 95:        my ($name) = @_;
   break if (1)

The filename and line number of the breakpoint are displayed just before the source line itself. Because both breakpoints are located in the same file, the filename is displayed only once. After the source line, we see the condition on which to stop. In this case, as the constant value 1 indicates, we will always stop at these breakpoints. Later on you'll see how to specify a condition.

As we will see, when the continue command is executed, the execution of the program stops at one of these breakpoints, at either line 86 or line 95 of the file /usr/lib/perl5/5.6.1/Symbol.pm, whichever is reached first. The displayed code lines are the first line of each of the two subroutines from Symbol.pm. Breakpoints may be applied only to lines of runtime-executable code—you cannot, for example, put breakpoints on empty lines or comments.

In our example, the List command shows which lines the breakpoints were set on, but we cannot tell which breakpoint belongs to which subroutine. There are two ways to find this out. One is to run the continue command and, when it stops, execute the Trace command we saw before:

  DB<3> c
Symbol::gensym(/usr/lib/perl5/5.6.1/Symbol.pm:86):
86:         my $name = "GEN" . $genseq++;
  DB<3> T
$ = Symbol::gensym( ) called from file `test_sym2.pl' line 14
. = Book::World2::do_work('now') called from file `test_sym2.pl' line 6

So we see that this breakpoint belongs to Symbol::gensym. The other way is to ask for a listing of a range of lines from the code. For example, let's check which subroutine line 86 is a part of. We use the list (lowercase!) command (l), which displays parts of the code. The list command accepts various arguments; the one that we want to use here is a range of lines. Since the breakpoint is at line 86, let's print a few lines around that line number:

  DB<3> l 85-87
85      sub gensym ( ) {
86=  =>b      my $name = "GEN" . $genseq++;
87:         my $ref = *{$genpkg . $name};

Now we know it's the gensym subroutine, and we also see the breakpoint highlighted with the = =>b markup. We could also use the name of the subroutine to display its code:

  DB<4> l Symbol::gensym
85      sub gensym ( ) {
86=  =>b      my $name = "GEN" . $genseq++;
87:         my $ref = *{$genpkg . $name};
88:         delete $$genpkg{$name};
89:         $ref;
90      }

The delete command (d) is usedto remove a breakpoint by specifying the line number of the breakpoint. Let's remove the first one we set:

DB<5> d 95

The Delete (with a capital D) command (D) removes all currently installed breakpoints.

Now let's look again at the trace produced at the breakpoint:

  DB<3> c
Symbol::gensym(/usr/lib/perl5/5.6.1/Symbol.pm:86):
86:         my $name = "GEN" . $genseq++;
  DB<3> T
$ = Symbol::gensym( ) called from file `test_sym2.pl' line 14
. = Book::World2::do_work('now') called from file `test_sym2.pl' line 6

As you can see, the stack trace prints the values that are passed into the subroutine. Ah, and perhaps we've found our first bug: as we can see from the first character on the second line of output from the Trace command, do_work( ) was called in void context, so the return value was discarded. Let's change the for loop to check the return value of do_work( ):

for (1, 2) {
    my $stuff = do_work("now");
    if ($stuff) {
        print "work is done\n";
    }
}

In this session we will set a breakpoint at line 7 of test_sym2.pl, where we check the return value of do_work( ):

panic% perl -d test_sym2.pl

Book::World2::(test_sym2.pl:5):   for (1, 2) {
  DB<1> b 7
  DB<2> c
Book::World2::(test_sym2.pl:7):       if ($stuff) {
  DB<2>

Our program is still small, but already it is getting more difficult to understand the context of just one line of code. The window command (w)[54] will list a few lines of code that surround the current line:

[54]In Perl 5.8.0 use l instead of w, which is used for watch-expressions.

  DB<2> w
4         
5:        for (1, 2) {
6:          my $stuff = do_work("now");
7=  =>b       if ($stuff) {
8:              print "work is done\n";
9           }
10        }
11        
12        sub do_work {
13:         my($var) = @_;

The arrow points to the line that is about to be executed and also contains a b, indicating that we have set a breakpoint at this line.[55]

[55]Note that breakable lines of code include a colon (:) immediately after the line number.

Now, let's take a look at the value of the $stuff variable:

  DB<2> p $stuff
HASH(0x82b89b4)

That's not very useful information. Remember, the print command works just like the built-in print( ) function. The debugger's x command evaluates a given expression and pretty-prints the results:

  DB<3> x $stuff
0  HASH(0x82b89b4)
   'sym' => GLOB(0x826a944)
      -> *Symbol::GEN0
   'var' => 'Book::World2::now'

Things seem to be okay. Let's double check by calling do_work( ) with a different value and print the results:

  DB<4> x do_work('later')
0  HASH(0x82bacc8)
   'sym' => GLOB(0x818f16c)
      -> *Symbol::GEN1
   'var' => 'Book::World2::later'

We can see the symbol was incremented from GEN0 to GEN1 and the variable later was qualified, as expected.[56]

[56]You won't see the symbol printout with Perl 5.6.1, but it works fine with 5.005_03 or 5.8.0

Now let's change the test program a little to iterate over a list of arguments held in @args and print a slightly different message (see Example 21-7).

Example 21-7. test_sym3.pl

package Book::World3;

use Symbol ( );

my @args = qw(now later);
for my $arg (@args) {
    my $stuff = do_work($arg);
    if ($stuff) {
        print "do your work $arg\n";
    }
}

sub do_work {
    my($var) = @_;

    return undef unless $var;

    my $sym = Symbol::gensym( );
    my $qvar = Symbol::qualify($var);

    my $retval = {
        sym => $sym,
        var => $qvar,
    };

    return $retval;
}
1;

There are only two arguments in the list, so stopping to look at each one isn't too time-consuming, but consider the debugging pace if we had a large list of 100 or so entries. Fortunately, it is possible to customize breakpoints by specifying a condition. Each time a breakpoint is reached, the condition is evaluated, stopping only if the condition is true. In the session below, the window command shows breakable lines. The = =>symbol shows us the line of code that's about to be executed.

panic% perl -d test_sym3.pl

Book::World3::(test_sym3.pl:5): my @args = qw(now later);
  DB<1> w
5=  =>    my @args = qw(now later);
6:      for my $arg (@args) {
7:          my $stuff = do_work($arg);
8:          if ($stuff) {
9:              print "do your work $arg\n";
10          }
11      }
12         
13         sub do_work {
14:            my($var) = @_;

We set a breakpoint at line 7 with the condition $arg eq 'later'. As we continue, the breakpoint is skipped when $arg has the value of now but not when it has the value of later:

  DB<1> b 7 $arg eq 'later'
  DB<2> c
do your work now
Book::World3::(test_sym3.pl:7):     my $stuff = do_work($arg);
  DB<2> n
Book::World3::(test_sym3.pl:8):     if ($stuff) {
  DB<2> x $stuff
0  HASH(0x82b90e4)
   'sym' => GLOB(0x82b9138)
      -> *Symbol::GEN1
   'var' => 'Book::World3::later'
  DB<5> c
do your work later
Debugged program terminated.  Use q to quit or R to restart,

You should now understand enough about the debugger to try many other features on your own, with the perldebug manpage by your side. Quick online help from inside the debugger is available by typing the h command, which will display a list of the most useful commands and a short explanation of what they do.

Some installations of Perl include a readline module that allows you to work more interactively with the debugger—for example, by pressing the up arrow to see previous commands, which can then be repeated by pressing the Return key.