ppwrun is a program that lets you easily control PPW's runtime performance data recording options, which otherwise must be set manually via environment variables. To use ppwrun, prefix your normal program invocation with the ppwrun command and any of the options listed below; ppwrun then sets the appropriate environment variables for you.
For example, if you would like to gather profile information and PAPI hardware counter information about your UPC program a.out, and you normally execute that program using upcrun, you might do this instead:
     $ ppwrun --output=aoutprof.par --profile \
           --papi-metrics=PAPI_TOT_CYC \
           upcrun -n 128 ./a.out
Alternatively, if you'd like to collect trace data for a sequential program a.out, you might do this:
     $ ppwrun --output=aouttrace.par --trace \
           ./a.out
The backslashes in the example commands above only break each shell command across multiple lines; they are not part of the command itself.
To invoke ppwrun, use the following syntax:
ppwrun [--help] [--output=file] [--disable|--trace] [--trace-handling=MODE] [--disable-throttling] [--throttling-count=count] [--throttling-duration=duration] [--selective-file=file] [--comm-stats|--line-comm-stats] [--bash|--tcsh] upcrun...|a.out...
ppwrun accepts the following options:
In centralized mode, all threads process their trace data in parallel; the master then collects the trace data from each thread and writes it to a file. This mode is suited to distributed shared-memory clusters.
In distributed mode, all threads process their trace data in parallel, and each node then writes its own trace data to the par file. The master node assists in synchronization between the nodes. This mode is suited to multi-core shared-memory machines.
In reduced mode, all threads process and write their trace data sequentially. The master assists in synchronization between threads. Use this mode on clusters with slow I/O, since it keeps the amount of disk I/O to a minimum.
THROTTLING:
When throttling is not disabled (that is, when the --disable-throttling option is not used), PPW identifies high-frequency, short-duration user-level events and stops measuring them once they cross both throttling thresholds. An event is eligible for throttling if it is invoked more than throttling-count times (set with --throttling-count) and its execution time is less than throttling-duration (set with --throttling-duration).
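The eligibility rule above can be sketched as a small shell function. This is only an illustrative model of the rule as described here, not PPW's implementation: the function name, the sample threshold values, and the assumption that durations are integers in the same (unspecified) unit are all hypothetical.

```shell
#!/bin/sh
# Hypothetical model of the throttling eligibility rule; not PPW code.
# Arguments: <calls> <duration> <throttling-count> <throttling-duration>
# All values are integers; both durations are assumed to share one unit.
is_throttle_eligible() {
    calls=$1
    duration=$2
    count_threshold=$3
    duration_threshold=$4
    # Eligible only when BOTH conditions hold: invoked more often than the
    # count threshold AND shorter than the duration threshold.
    if [ "$calls" -gt "$count_threshold" ] && [ "$duration" -lt "$duration_threshold" ]; then
        echo yes
    else
        echo no
    fi
}

# Sample thresholds (made up for illustration): count 1000, duration 100.
is_throttle_eligible 5000 10 1000 100    # frequent and short: prints "yes"
is_throttle_eligible 5000 500 1000 100   # frequent but long:  prints "no"
is_throttle_eligible 10 10 1000 100      # short but rare:     prints "no"
```

Both conditions must hold at once, so a frequently called but long-running event keeps being measured.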
ppwrun also accepts each option with a single dash instead of two, so you can type
$ ppwrun -trace ...
instead of
$ ppwrun --trace ...
If your parallel job spawner does not propagate environment variables for you, you may experience problems with ppwrun. The typical symptoms are that you cannot collect trace data for your applications and that any option you give to ppwrun seems to be ignored.
If this is the case, you'll need to add the shell commands printed by the --bash or --csh options to your shell's profile file. This file is usually .bash_profile or .cshrc; consult your shell's documentation or your local sysadmin guru for more information.
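One way to check whether a launcher passes PPW's variables through is to run a trivial command under it and see whether a probe value arrives. The sketch below is a hypothetical helper, not part of PPW; it assumes the launcher can start an arbitrary command such as sh, and it uses PPW_OUTPUT (one of the variables shown in this section) only as a probe.

```shell
#!/bin/sh
# Hypothetical helper, not part of PPW: report whether a launcher passes
# environment variables to the processes it starts.
# Usage: check_ppw_env <launcher> [launcher args...]
check_ppw_env() {
    # Set a probe value only for this one command, run a child shell
    # through the launcher, and capture what the child sees.
    out=$(PPW_OUTPUT=probe.par "$@" sh -c 'echo "${PPW_OUTPUT:-unset}"' 2>/dev/null)
    case "$out" in
        *probe.par*) echo "propagated" ;;
        *)           echo "not propagated" ;;
    esac
}

# Self-contained demonstration using env(1) as a stand-in launcher:
check_ppw_env env       # plain env keeps the environment
check_ppw_env env -i    # env -i starts the child with an empty environment
```

With a real spawner the call might look like `check_ppw_env upcrun -n 2`; if it reports "not propagated", the profile-file approach described above is needed.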
For UPC programs, PPW does not currently support noncollective UPC exits, such as an exit on one thread that causes a SIGKILL signal to be sent to other threads. As an example, consider the following UPC program:
     ...
     int main() {
       if (MYTHREAD) {
         upc_barrier;
       } else {
         exit(0);
       }
       return 0;
     }
In this program, depending on the UPC compiler and runtime system used, PPW may not write out valid performance data for all threads. A future version of PPW may add “dump” functionality where complete profile data is flushed to disk every N minutes, which will allow you to collect partial performance data from a long-running program that happens to crash a few minutes before it is completed. However, for technical reasons PPW will generally not be able to recover from situations like these, so please do try to debug any crashes in your program before analyzing it with PPW.
When you run your application, you may run into error messages like the following one:
PPW warning: no source information available
PPW stores a snapshot of your application's source code in a file archive with the extension .ppw.sar. If you move your program's executable and do not move this file to the same directory, you will get this error message whenever you run your program. To fix this problem, keep a copy of the .ppw.sar file in the same directory as your compiled program.
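A quick pre-run check for this situation can be sketched as follows. The helper is hypothetical and not part of PPW; because the exact archive file name is not given here, it simply looks for any file ending in .ppw.sar in the executable's directory.

```shell
#!/bin/sh
# Hypothetical helper, not part of PPW: succeed if at least one *.ppw.sar
# archive sits in the same directory as the given executable.
has_ppw_sar() {
    dir=$(dirname "$1")
    for f in "$dir"/*.ppw.sar; do
        # If nothing matches, the glob stays literal and -e fails.
        [ -e "$f" ] && return 0
    done
    return 1
}

# Example: warn before running if the archive was left behind.
has_ppw_sar ./a.out || echo "no .ppw.sar archive found next to ./a.out" >&2
```

Running such a check after moving or installing the executable gives an earlier hint than the runtime warning.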
If you'd like to test which recording options are dictated by your current environment variable settings, use the ppw-showopts command. For example, using csh(1)-compatible shell syntax (keep in mind that output will vary from machine to machine):
     % ppwrun -trace -output=foo.par -csh
     setenv PPW_TRACEMODE 1
     setenv PPW_OUTPUT foo.par
     % setenv PPW_TRACEMODE 1
     % setenv PPW_OUTPUT foo.par
     % ppw-showopts
     Current PPW configuration options (in directory /storage/home/leko):
     + Disabled? 0
     + Communication stats? 0
     + Communication stats per line? 0
     + Tracing? 1
     + Trace buffer size? 16384
     + Output? foo.par
     + PAPI metrics? (none)
And the same example using bash(1)-compatible syntax:
     $ ppwrun -trace -output=foo.par -bash
     export PPW_TRACEMODE=1
     export PPW_OUTPUT=foo.par
     $ export PPW_TRACEMODE=1
     $ export PPW_OUTPUT=foo.par
     $ ppw-showopts
     Current PPW configuration options (in directory /storage/home/leko):
     + Disabled? 0
     + Communication stats? 0
     + Communication stats per line? 0
     + Tracing? 1
     + Trace buffer size? 16384
     + Output? foo.par
     + PAPI metrics? (none)
To see which environment variables are set by ppwrun, use the --csh and --bash options.
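Instead of retyping the printed commands, the bash-style output could be saved to a file and applied to the current shell. The sketch below is a hypothetical helper, not part of PPW. It assumes the --bash output consists of lines of the form `export PPW_NAME=value`, as in the sample session in this section, and it ignores every other line. The demonstration uses those two sample lines rather than calling ppwrun.

```shell
#!/bin/sh
# Hypothetical helper, not part of PPW: read lines from standard input and
# apply only those that look like "export PPW_NAME=value".
# Call it with input redirected from a file (not through a pipe) so the
# exports take effect in the current shell.
apply_ppw_exports() {
    while IFS= read -r line; do
        case "$line" in
            "export PPW_"*=*) eval "$line" ;;
            *) ;;  # ignore anything else
        esac
    done
}

# Demonstration with the two lines from the sample session:
tmpfile=$(mktemp)
printf '%s\n' 'export PPW_TRACEMODE=1' 'export PPW_OUTPUT=foo.par' > "$tmpfile"
apply_ppw_exports < "$tmpfile"
rm -f "$tmpfile"
echo "PPW_TRACEMODE=$PPW_TRACEMODE PPW_OUTPUT=$PPW_OUTPUT"
```

In real use the file would be produced by something like `ppwrun -trace -output=foo.par -bash > ppw_env.sh`. The prefix filter is only a basic sanity check, so inspect the file before applying it.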