DProf

To profile a Perl script run the perl interpreter with the -d switch. So to profile script test.pl with Devel::DProf the following command should be used.

            $ perl5 -d:DProf test.pl

Then run dprofpp to analyze the profile. The output of dprofpp depends on the flags to the program and the version of Perl you're using.

$ perl -d:DProf /usr/local/bin/latex2html perlexamples

$ ls -ltra
..................................................................
-rw-r--r--     1 lhp lhp   44459 2005-04-25 14:07 referencias.tex
drwxr-xr-x     7 lhp lhp    4096 2005-04-25 14:07 .
-rw-r--r--     1 lhp lhp 4354790 2005-04-25 14:08 tmon.out
drwxr-xr-x     2 lhp lhp   20480 2005-04-25 14:08 perlexamples

The *dprofpp* command interprets profile data produced by a profiler, such as the Devel::DProf profiler. Dprofpp will read the file tmon.out and will display the 15 subroutines which are using the most time. By default the times for each subroutine are given exclusive of the times of their child subroutines.

$ dprofpp
File::Glob::GLOB_QUOTE has 1 unstacked calls in outer
File::Glob::GLOB_TILDE has 1 unstacked calls in outer
Exporter::Heavy::heavy_export has 2 unstacked calls in outer
File::Glob::GLOB_ALPHASORT has 1 unstacked calls in outer
File::Glob::GLOB_BRACE has 1 unstacked calls in outer
Exporter::export has -2 unstacked calls in outer
File::Glob::GLOB_NOMAGIC has 1 unstacked calls in outer
File::Glob::AUTOLOAD has -5 unstacked calls in outer
Total Elapsed Time = 31.84545 Seconds
  User+System Time = 17.98545 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 9.88   1.777  5.582  13872   0.0001 0.0004  main::process_command
 8.71   1.567  6.438   9608   0.0002 0.0007  main::translate_environments
 7.66   1.377  1.377   1266   0.0011 0.0011  main::text_cleanup
 5.93   1.067  1.067  11290   0.0001 0.0001  main::iso_map
 4.16   0.748  5.708   9105   0.0001 0.0006  main::translate_commands
 3.58   0.643  0.848    387   0.0017 0.0022  main::add_real_child_links
 3.09   0.556  0.556  11998   0.0000 0.0000  NDBM_File::STORE
 2.91   0.524  0.576   4358   0.0001 0.0001  main::revert_to_raw_tex
 2.80   0.504  0.537  13872   0.0000 0.0000  main::convert_iso_latin_chars
 2.47   0.444  0.458    307   0.0014 0.0015  main::replace_verb_marks
 2.41   0.434  0.434  33015   0.0000 0.0000  NDBM_File::FETCH
 2.37   0.426  1.170    833   0.0005 0.0014  main::replace_sensitive_markers
 2.10   0.378  4.301   9105   0.0000 0.0005  main::real_replace_strange_accents
 2.07   0.373  1.424   6158   0.0001 0.0002  main::accent_safe_for_ij
 1.98   0.356  0.693   5725   0.0001 0.0001  main::mark_string
$

La salida:

    %Time
        Percentage of time spent in this routine.

    #Calls
        Number of calls to this routine.

    sec/call
        Average number of seconds per call to this routine.

    Name
        Name of routine.

    CumulS
        Time (in seconds) spent in this routine and routines called from it.

    ExclSec
        Time (in seconds) spent in this routine (not including those called
        from it).

    Csec/c
        Average time (in seconds) spent in each call of this routine
        (including those called from it).

Opciones:

    -a   Sort alphabetically by subroutine names.
    -d   Reverse whatever sort is used
    -R   Count anonymous subroutines defined in the same package separately.
    -E   (default) Display all subroutine times exclusive of child
         subroutine times.
    -I   Display all subroutine times inclusive of child subroutine times.
    -l   Sort by number of calls to the subroutines. This may help identify
         candidates for inlining.
    -O cnt
         Show only *cnt* subroutines. The default is 15.
    -p script
         Tells dprofpp that it should profile the given script and then
         interpret its profile data. See -Q.
    -Q   Used with -p to tell dprofpp to quit after profiling the script,
         without interpreting the data.
    -q   Do not display column headers.
El siguiente ejemplo ordena la lista por número de llamadas:
$ dprofpp -lq
File::Glob::GLOB_QUOTE has 1 unstacked calls in outer
File::Glob::GLOB_TILDE has 1 unstacked calls in outer
Exporter::Heavy::heavy_export has 2 unstacked calls in outer
File::Glob::GLOB_ALPHASORT has 1 unstacked calls in outer
File::Glob::GLOB_BRACE has 1 unstacked calls in outer
Exporter::export has -2 unstacked calls in outer
File::Glob::GLOB_NOMAGIC has 1 unstacked calls in outer
File::Glob::AUTOLOAD has -5 unstacked calls in outer
 0.00       - -0.000    178        -      -  NDBM_File::FETCH
 0.00       - -0.000    161        -      -  main::normalize
 1.17   0.009  0.065    153   0.0001 0.0004  main::translate_environments
 0.00       - -0.000    145        -      -  main::balance_tags
 2.46   0.019  0.045    133   0.0001 0.0003  main::process_command
 2.46   0.019  0.019    133   0.0001 0.0001  main::convert_iso_latin_chars
 0.00       - -0.000    133        -      -  main::spanish_translation
 0.00       - -0.000    111        -      -  main::make_end_cmd_rx
 0.00       - -0.000     95        -      -  main::replace_cross_ref_marks
 0.00       - -0.000     89        -      -  Getopt::Long::ParseOptionSpec
 0.00       - -0.000     87        -      -  NDBM_File::STORE
 0.00       - -0.001     87        -      -  main::simplify
    -       -  0.008     81        - 0.0001  main::process_group_env
 1.30   0.010  0.010     80   0.0001 0.0001  main::iso_map
 0.00       - -0.000     78        -      -  main::escape_rx_chars
    -r   Display elapsed real times rather than user+system times.
    -s   Display system times rather than user+system times.
    -T   Display subroutine call tree to stdout. Subroutine statistics are
         not displayed.
    -t   Display subroutine call tree to stdout. Subroutine statistics are
         not displayed. When a function is called multiple consecutive times
         at the same calling level then it is displayed once with a repeat
         count.
Veamos un ejemplo, usando estas dos últimas opciones:
$ perl -d:DProf matrixP.pl
Matriz A
1       2       3
2       4       6
3       6       9

Matriz B
1       2
2       4
3       6

Matriz C
14      28
28      56
42      84
$ dprofpp
Total Elapsed Time =   -2e-05 Seconds
  User+System Time =  0.00998 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 0.00       - -0.000      1        -      -  main::matrixProd
 0.00       - -0.000      3        -      -  main::printMat
$ dprofpp -t
main::matrixProd
main::printMat (3x)
$ dprofpp -T
main::matrixProd
main::printMat
main::printMat
main::printMat
$ cat matrixP.pl
#!/usr/bin/perl -w

sub matrixProd {
  my ($matA, $matB) = @_;
  my $i = 0;
  my ($j, $k);
  my $nRowsA = @{$matA};
  my $nColsA = @{$matA->[0]};
  my $nRowsB = @{$matB};
  my $nColsB = @{$matB->[0]};
  my $refMatC;

  if ($nColsA != $nRowsB) {
    die ("Index of two matrices must be agree\n");
  }
  while ($i < $nRowsA)  {
    $j = 0;
    while ($j < $nColsB) {
      $k = 0;
      $refMatC->[$i][$j] = 0;
      while ($k < $nColsA) {
        $refMatC->[$i][$j] += $matA->[$i][$k] * $matB->[$k][$j];
        $k++;
      }
      $j++;
    }
    $i++;
  }
  return $refMatC;
}

sub printMat {
  my $mat = shift;
  my $nRows = @{$mat};
  my $nCols = @{$mat->[0]};
  my $i = 0;

  while ($i < $nRows) {
    $j = 0;
    while ($j < $nCols) {
      print $mat->[$i][$j], "\t";
     $j++;
    }
    print "\n";
    $i++;
  }
}

#
# ---- Main ----
#

$matA = [[1,2,3],[2,4,6],[3,6,9]];
$matB = [[1,2],[2,4],[3,6]];
$matC = matrixProd($matA,$matB);

print "Matriz A \n";
printMat($matA);
print "\nMatriz B \n";
printMat($matB);
print "\nMatriz C \n";
printMat($matC);
Otras opciones:
    -U   Do not sort. Display in the order found in the raw profile.
    -u   Display user times rather than user+system times.
    -V   Print dprofpp's version number and exit. If a raw profile is found
         then its XS_VERSION variable will be displayed, too.
    -v   Sort by average time spent in subroutines during each call. This
         may help identify candidates for inlining.
    -z   (default) Sort by amount of user+system time used. The first few
         lines should show you which subroutines are using the most time.
    -g "subroutine"
         Ignore subroutines except "subroutine" and whatever is called from
         it.
    -G <regexp>
         Aggregate "Group" all calls matching the pattern together. For
         example this can be used to group all calls of a set of packages

           -G "(package1::)|(package2::)|(package3::)"

         or to group subroutines by name:

           -G "getNum"

    -P   Used with -G to aggregate "Pull" together all calls that did not
         match -G.

    -f <regexp>
         Filter all calls matching the pattern.
Veamos otro ejemplo usando la opción -G. Hemos usado la traza de latex2html:
$ dprofpp -G "main" -O 30
File::Glob::GLOB_QUOTE has 1 unstacked calls in outer
File::Glob::GLOB_TILDE has 1 unstacked calls in outer
Exporter::Heavy::heavy_export has 2 unstacked calls in outer
File::Glob::GLOB_ALPHASORT has 1 unstacked calls in outer
File::Glob::GLOB_BRACE has 1 unstacked calls in outer
Exporter::export has -2 unstacked calls in outer
File::Glob::GLOB_NOMAGIC has 1 unstacked calls in outer
File::Glob::AUTOLOAD has -5 unstacked calls in outer
Option G Grouping: [main]
Grouping [main] Calls: [3796]
Grouping [main] Times: [42.4062000000002]
Grouping [main] IncTimes: [170.386800000001]
Total Elapsed Time =  1.05213 Seconds
  User+System Time = 0.772130 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 54.9   0.424  1.704   3796   0.0001 0.0004  main
 3.89   0.030  0.030      8   0.0037 0.0037  L2hos::Unix::BEGIN
 1.30   0.010  0.010      2   0.0050 0.0050  Exporter::as_heavy
 1.30   0.010  0.010      3   0.0033 0.0033  vars::BEGIN
 1.30   0.010  0.010     20   0.0005 0.0005  NDBM_File::TIEHASH
 0.00   0.000  0.000      2   0.0000 0.0000  Exporter::Heavy::heavy_export
 0.00   0.000  0.000      1   0.0000 0.0000  File::Glob::GLOB_BRACE
 0.00   0.000  0.000      1   0.0000 0.0000  File::Glob::GLOB_NOMAGIC
 0.00   0.000  0.000      1   0.0000 0.0000  File::Glob::GLOB_QUOTE
 0.00   0.000  0.000      1   0.0000 0.0000  File::Glob::GLOB_TILDE
 0.00   0.000  0.000      1   0.0000 0.0000  File::Glob::GLOB_ALPHASORT
 0.00       - -0.000      1        -      -  L2hos::Unix::Link
 0.00       - -0.000      1        -      -  L2hos::Unix::pathd
 0.00       - -0.000      1        -      -  L2hos::Unix::dd
 0.00       - -0.000      1        -      -  L2hos::Unix::home
 0.00       - -0.000      1        -      -  L2hos::Unix::fullname
 0.00       - -0.000      1        -      -  Getopt::Long::FindOption
 0.00       - -0.000      1        -      -  L2hos::Unix::syswait
 0.00       - -0.000      1        -      -  L2hos::Unix::path2latex
 0.00       - -0.000      1        -      -  warnings::BEGIN
 0.00       - -0.000      1        -      -  Getopt::Long::ConfigDefaults
 0.00       - -0.000      1        -      -  Getopt::Long::Configure
 0.00       - -0.000      1        -      -  Fcntl::BEGIN
 0.00       - -0.000      1        -      -  Fcntl::bootstrap
 0.00       - -0.000      1        -      -  AnyDBM_File::BEGIN
 0.00       - -0.000      1        -      -  NDBM_File::bootstrap
 0.00       - -0.000      1        -      -  warnings::unimport
 0.00       - -0.000      1        -      -  Cwd::bootstrap
 0.00       - -0.000      1        -      -  Config::TIEHASH
 0.00       - -0.000      1        -      -  Config::import

ENVIRONMENT
    The environment variable DPROFPP_OPTS can be set to a string containing
    options for dprofpp. You might use this if you prefer -I over -E or if
    you want -F on all the time.

    This was added fairly lazily, so there are some undesirable side
    effects. Options on the commandline should override options in
    DPROFPP_OPTS--but don't count on that in this version.

En este enlace encontrará un fichero latex para que pueda repetir el análisis con Devel::DProf:

http://nereida.deioc.ull.es/ pp2/performancebook/proflatex2html.tgz

$perl -d:DProf /usr/local/bin/latex2html perlexamples
....
$ dprofpp
File::Glob::GLOB_QUOTE has 1 unstacked calls in outer
File::Glob::GLOB_TILDE has 1 unstacked calls in outer
Exporter::Heavy::heavy_export has 2 unstacked calls in outer
File::Glob::GLOB_ALPHASORT has 1 unstacked calls in outer
File::Glob::GLOB_BRACE has 1 unstacked calls in outer
Exporter::export has -2 unstacked calls in outer
File::Glob::GLOB_NOMAGIC has 1 unstacked calls in outer
File::Glob::AUTOLOAD has -5 unstacked calls in outer
Total Elapsed Time = 34.27311 Seconds
  User+System Time = 19.94311 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 10.1   2.015  7.150   9734   0.0002 0.0007  main::translate_environments
 8.69   1.733  6.484  14121   0.0001 0.0005  main::process_command
 6.88   1.373  1.373  11469   0.0001 0.0001  main::iso_map
 5.46   1.088  1.088   1270   0.0009 0.0009  main::text_cleanup
 4.70   0.937  6.624   9286   0.0001 0.0007  main::translate_commands
 3.08   0.615  0.968    389   0.0016 0.0025  main::add_real_child_links
 3.01   0.600  0.600  33355   0.0000 0.0000  NDBM_File::FETCH
 2.67   0.532  1.994   6286   0.0001 0.0003  main::accent_safe_for_ij
 2.58   0.515  0.561    311   0.0017 0.0018  main::replace_verb_marks
 2.51   0.500  0.567   4362   0.0001 0.0001  main::revert_to_raw_tex
 2.47   0.492  0.492  12168   0.0000 0.0000  NDBM_File::STORE
 2.35   0.468  0.526  14121   0.0000 0.0000  main::convert_iso_latin_chars
 2.08   0.414 19.990      1   0.4142 19.990  main::translate
 2.06   0.411  0.482   6276   0.0001 0.0001  main::get_next_pair_or_char_pr
 2.04   0.407  5.046      1   0.4071 5.0464  main::post_process
Casiano Rodríguez León
Licencia de Creative Commons
Programación Distribuida y Mejora del Rendimiento
por Casiano Rodríguez León is licensed under a Creative Commons Reconocimiento 3.0 Unported License.

Permissions beyond the scope of this license may be available at http://campusvirtual.ull.es/ocw/course/view.php?id=44.
2012-06-19