NOTE: This document has NOT been kept up-to-date as new features
have been added. It should still be mostly correct, but is incomplete.
The UPC test harness scripts
============================
Files:
* harness.pl = the main test harness PERL script
* runjobs.pl = the script that runs the test applications
* sysconf = template for a harness system configuration
* alvarez = sample system configuration file for alvarez.nersc.gov
* seaborg = sample system configuration file for seaborg.nersc.gov
* flyer = sample system configuration file for flyer.cse.mtu.edu
The intent of the harness is to automate the compilation and
execution of the test codes in the UPC test suites. Of the
various test suites in the bupc-testsuite directory, currently
only "gwu" and "mupc" have been configured to work with this harness.
The harness requires a certain amount of system configuration in
order to work. Template configuration files are provided for alvarez
and seaborg. In addition, various command line options can be
specified to over-ride configuration file values.
In particular, the harness gets the following info from the
system configuration file:
harness_dir: where does the test harness live?
testsuite_dir: where is the bupc-testsuite?
testsuites: which suites (subdirs) of bupc-testsuite to use?
logroot: where to place the logs, reports, batch scripts, etc.
network: which network (e.g. GASNet conduit) to use?
batch_sys: which batch system to use (pbs, loadleveler or interactive)
queues: which batch queues to use, what are their properties
and in what order of preference?
repository: which repository to charge when running in the batch
system? (may be ignored on systems that don't charge
for time).
nthread_default: Number of UPC threads to use when running a test.
Individual tests can over-ride this default.
max_proc_per_node: The maximum number of processes per node that
will be used when running the test.
Note: read the header in the system configuration file to understand
the format. Basically, you can construct structures similar
to PERL data structures. Note also that strings of the form
%NAME% will be replaced by ENV{NAME} when the file is read.
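
The %NAME% substitution described above can be pictured with a small
Python sketch. This is not the harness's Perl code; the function name
and the choice to expand unset variables to an empty string are
assumptions made for illustration.

```python
import os
import re

def expand_env_placeholders(text, env=None):
    """Replace each %NAME% in text with the value of NAME from env."""
    env = os.environ if env is None else env

    def lookup(match):
        # Assumption: an unset variable expands to the empty string.
        return env.get(match.group(1), "")

    return re.sub(r"%([A-Za-z_][A-Za-z0-9_]*)%", lookup, text)

# Hypothetical value, expanded against an explicit environment mapping.
print(expand_env_placeholders("%HOME%/harness_logs", {"HOME": "/home/alice"}))
# prints /home/alice/harness_logs
```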
Command line options:
-nocompile Do not compile the test suite.
-norun Generate the run scripts, but do not submit them to run.
-sysconf=file [REQUIRED] Specify the system configuration file.
-threads=N Specify default number of UPC threads.
-pthreads=N Specify number of pthreads per process.
-ppn=N Specify max number of processes per node.
-network=s Specify the network [e.g. GASNet conduit].
-suite=name Specify name of test suite(s) to run.
[NOTE: May be a comma separated list or this option
may be specified multiple times]
-repo=name Specify name of accounting repository.
-filepat=string Specify a filename pattern. Only tests in the
suite(s) that match this pattern will be compiled/run.
-runlimit=N Default runtime limit (in seconds) for each test
-features=str Comma-separated list of features supported by compilation environment
Set automatically when running within upcr
-add_feature Comma-separated list of features to be added without
removing any automatically set -features= list
-del_feature Comma-separated list of features to be removed
Mixed -add_feature and -del_feature are processed in order
Note that some of these values (threads, ppn, network, suite, repo, upccdir)
will over-ride the values in the system configuration file.
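
As a rough illustration of "processed in order" for mixed -add_feature
and -del_feature options, the following Python sketch applies a
sequence of edits to an automatically detected feature list. The
function name and the tuple representation of the options are
hypothetical; the harness itself is written in Perl.

```python
def apply_feature_edits(features, edits):
    """Apply ('add' | 'del', 'a,b,c') edits in command-line order."""
    result = list(features)
    for action, names in edits:
        for name in names.split(","):
            if action == "add" and name not in result:
                result.append(name)
            elif action == "del" and name in result:
                result.remove(name)
    return result

# Illustrative feature names taken from the list in this document.
auto = ["pthreads", "debug", "upc_io"]
edits = [("del", "debug"), ("add", "nodebug,upc_collective"), ("del", "upc_io")]
print(apply_feature_edits(auto, edits))
# prints ['pthreads', 'nodebug', 'upc_collective']
```

Because the edits are applied left to right, a feature that is added
and later deleted ends up absent, and vice versa.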
What does harness.pl do?
=======================
(1) reads the system configuration file and parses the command line
options.
(2) constructs a date- and time-stamped subdirectory of logroot. The
directory name will be of the form YYYYMMDD_HHMMSS. This directory
will contain files named:
log = harness log file. Look here if problems occur.
compile.rpt = status of each test_app compilation
run.rpt = status of each test_app execution
qscript_XXX = automatically generated batch queue script used to
run a set of test_apps.
runlist_XXX = list of test_apps that have yet to be run by
qscript_XXX.
(3) for each specified test suite, the harness will:
(a) read the harness configuration file (harness.conf). This file
specifies what compilation/run tests to perform and how to
determine success/failure of each test.
(b) for each test in the harness:
* compile the test (unless the -nocompile command line
flag was specified) and search for the strings
"error" and "warning" coming back to stdout/stderr.
* record the success/failure in the compile.rpt file.
* if the test was supposed to pass compilation and did, then
submit the test to a runlist. The harness will select a
list based on the total number of processes required to
run the application. The harness computes the number
of processes required by dividing the number of UPC threads
by the number of pthreads (if specified) then selects a batch
queue to use, and decides on the number of processes that
will run on each node, and the number of nodes to use.
All the jobs requiring the same number of processes will be
put on the same runlist. A self-submitting queue script
will be generated for each runlist.
(4) After all suites have been processed, the harness will submit
each of the queue scripts to the batch system, unless the
-norun command line option was specified.
(5) At some point, the job is run. The queue script executes the
"runjobs.pl" PERL script with the runlist as one of its arguments.
It also informs runjobs.pl of the total number of seconds allowed
by this queue. The runjobs.pl script selects applications from the
runlist that are expected to complete before the queue limits expire.
The applications are executed under a watchdog, that will kill the
job if it exceeds its specified time limit. After the application
completes, runjobs.pl determines the success or failure of the run
and writes an entry into the "run.rpt" file.
If the runjobs.pl script is running out of time, without running
all the jobs in the queue, it terminates with a special error code.
The queue script captures this error code and re-submits itself
to be run again if not all the apps were processed.
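
The process-count and runlist grouping described in the steps above
can be sketched in Python as follows. This is only an illustration
under stated assumptions: the function names are hypothetical, the
division is assumed to round up when the thread count is not a
multiple of the pthread count, and the per-node count is assumed to be
capped by max_proc_per_node. Queue selection is not modelled.

```python
import math
from collections import defaultdict

def plan_job(nthreads, pthreads=None, max_proc_per_node=1):
    """Return (processes, processes_per_node, nodes) for one test."""
    # Assumption: round up if nthreads is not a multiple of pthreads.
    procs = nthreads if not pthreads else math.ceil(nthreads / pthreads)
    ppn = min(procs, max_proc_per_node)
    nodes = math.ceil(procs / ppn)
    return procs, ppn, nodes

def build_runlists(tests, max_proc_per_node=1):
    """Group test names by the number of processes they require."""
    runlists = defaultdict(list)
    for name, nthreads, pthreads in tests:
        procs, _, _ = plan_job(nthreads, pthreads, max_proc_per_node)
        runlists[procs].append(name)
    return dict(runlists)

# Hypothetical tests: (name, UPC threads, pthreads per process or None).
tests = [("foo", 4, None), ("bar", 8, 2), ("baz", 2, None)]
print(plan_job(8, pthreads=2, max_proc_per_node=2))  # prints (4, 2, 2)
print(build_runlists(tests))  # prints {4: ['foo', 'bar'], 2: ['baz']}
```

Here "foo" and "bar" both need four processes, so they share one
runlist and therefore one queue script.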
The Testsuite configuration files:
=================================
Each testsuite requires a configuration file named "harness.conf".
As an example, here is part of harness.conf for the gwu suite:
BEGIN_DEFAULT_CONFIG
Flags:
Files: $TESTNAME$.upc
DynamicThreads: $DEFAULT$
StaticThreads: $DEFAULT$
CompileResult: pass
PassExpr: ^Success:
FailExpr: ^Failure:
ExitCode: 0
BuildCmd: upcc
AppArgs:
TimeLimit: $DEFAULT$
END_DEFAULT_CONFIG
# ------------------------------------------------------------
WildCard: <*>.upc
# ------------------------------------------------------------
TestName: I_case_i
CompileResult: fail
# ------------------------------------------------------------
TestName: I_case1_ii
CompileResult: fail
Each test is defined by a series of stanzas beginning with the
name "TestName". All subsequent stanzas define the attributes
for this test. A special set of stanzas, between the
BEGIN_DEFAULT_CONFIG and END_DEFAULT_CONFIG markers will
apply to all tests, unless specifically re-defined by the
test.
In the above example, test I_case_i will inherit all the
attributes from the default section, except that the
value of CompileResult will be replaced with 'fail'.
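
The inheritance rule can be pictured as a dictionary merge in which a
test's own stanzas override the defaults. The Python below is a minimal
sketch, not the harness's parser; the dictionaries simply mirror part
of the gwu example above and the function name is hypothetical.

```python
# Part of the default section from the gwu example above.
DEFAULTS = {
    "CompileResult": "pass",
    "PassExpr": "^Success:",
    "FailExpr": "^Failure:",
    "ExitCode": "0",
    "BuildCmd": "upcc",
}

def resolve_test(defaults, overrides):
    """Start from the defaults, then let the test's stanzas win."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged

cfg = resolve_test(DEFAULTS, {"TestName": "I_case_i", "CompileResult": "fail"})
print(cfg["CompileResult"], cfg["BuildCmd"])  # prints fail upcc
```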
Note the "WildCard" stanza is special, and allows
file globbing to auto-generate test configurations rather
than having to list them all. In this case, all files
in the directory with names of the form *.upc will generate
a test configuration. The test configuration name
will be generated by stripping the '.upc' from the
name. For example, the file 'foo.upc' will generate a config
named 'foo'. Note that the angle brackets in the wildcard
expression determine which portion of the name to keep.
Stanza definitions:
* BuildCmd: specify whether to use 'make' or 'upcc'
* Flags: additional flags to be passed to upcc or make
* Files: list of files that need to be compiled in order
to build the app. This is only needed if using
upcc directly.
* CompileResult: Is the test supposed to compile (pass) or is
it a negative test, and supposed to generate
a compiler error (fail).
* DynamicThreads: list of UPC thread counts. The application will
be run once for each thread number. Note that
a dynamic-thread binary will not be built if
the only number in the list is zero.
* StaticThreads: list of UPC thread counts. A static-thread
binary will be compiled for each value. No
static binaries will be built if the only
number in the list is zero.
* PassExpr: A PERL regular expression. 0 => ignore. See below.
* FailExpr: A PERL regular expression. 0 => ignore. See below.
* ExitCode: The expected exit code when running the application
or the string 'ignore'.
* RunCmd: command to use for running application, defaults to upcrun
can be preceded by an optional "feature-expression ;"
* RunCmdArgs: additional arguments to pass to RunCmd
* AppArgs: run-time arguments needed by the application
can be preceded by an optional "feature-expression ;"
* AppEnv: run-time environment variables needed by the application,
in format suitable for env, ie: VAR="val" VAR2="val2"
can be preceded by an optional "feature-expression ;"
* BenchmarkResult: marks harness tests which should report performance
information, and is a perl regex that is used to extract
the performance metric and units from the program output.
BenchmarkResult also implies SaveOutput.
* RequireFeature: Marks a test as relying on a comma-separated list of features
that must all be provided by the compilation environment
(otherwise the test is skipped)
May alternately be a feature-expression, where the test is selected
iff the expression evaluates to true.
Interesting feature values include:
trans_bupc,upc_io,upc_collective,upc_memcpy_async,upc_memcpy_vis,
[no]debug,[no]trace,[no]stats,packedsptr,structsptr,
[no]pthreads,network_[networktype] - see harness log for complete list
* ProhibitFeature: Marks a test as not supporting a comma-separated list of features
If the compilation environment includes any of these features, the
test is skipped
May alternately be a feature-expression, where the test is skipped
iff the expression evaluates to true.
* KnownFailure: Marks a test which is known to fail in a particular way (see below)
* WarningFilter: Gives a perl regex to be ignored in compiler output (see below)
Special strings (not all valid in every context):
* $TESTNAME$ will be replaced with the name of the test.
* $DEFAULT$ will be replaced with the default number value
specified in the harness.pl script.
* $THREADS$ will be replaced with the number of UPC threads
valid in Limit and TimeLimit
* $VARNAME$ will be replaced with the value of $VARNAME in the environment,
with an error if the variable is not set
* !VARNAME! will be replaced with the value of $VARNAME in the environment,
or empty (without error) if the variable is not set
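
The difference between the $VARNAME$ and !VARNAME! forms can be
sketched in Python as follows. This is a hedged illustration, not the
harness's implementation: the function name is hypothetical, and the
built-in names ($TESTNAME$, $DEFAULT$, $THREADS$) are assumed to take
priority over environment variables of the same name.

```python
import re

def expand_special(text, builtins, env):
    """Expand $NAME$ (required) and !NAME! (optional) placeholders."""
    def required(match):
        name = match.group(1)
        if name in builtins:  # e.g. TESTNAME, DEFAULT, THREADS
            return str(builtins[name])
        if name not in env:
            raise KeyError("environment variable %s is not set" % name)
        return env[name]

    def optional(match):
        return env.get(match.group(1), "")

    text = re.sub(r"\$([A-Za-z_]\w*)\$", required, text)
    return re.sub(r"!([A-Za-z_]\w*)!", optional, text)

print(expand_special("$TESTNAME$.upc", {"TESTNAME": "foo"}, {}))
# prints foo.upc
```

With an environment of {"INC": "/x"}, the string "-I$INC$ !EXTRA!"
expands to "-I/x " because EXTRA is unset, while "$EXTRA$" would raise
an error.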
Runtime pass/fail:
Three status values are recorded in the run.rpt file
for each application: Timeout, ExitCode and Match.
The most useful is the Match result. The harness scans the
standard output of the test run looking for strings that
match the FailExpr and PassExpr. If FailExpr is defined (non-zero)
and a match is found, the test fails. If no FailExpr is found
and a PassExpr is defined and found, the test passes. If PassExpr
is defined but not found, the test fails.
Definitions:
Timeout
========
ok The test ran within the allowed limit
FAILED The watchdog killed the job because it ran for too long
ExitCode
========
na The watchdog killed the job, no exit code
ignore Told to ignore the exit code
ok The exit code matched ExitCode.
FAILED The exit code differed from ExitCode
Match
========
ignore no regular expressions to match
ok no FailExpr is found, and PassExpr is found
FAILED FailExpr is found, or no PassExpr is found
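
The Match rules above can be summarised by a short Python sketch. It
is not the runjobs.pl code: the function name is hypothetical, a value
of "0" is treated as "ignore" as in the stanza definitions, and
matching the whole output in multi-line mode (so that "^" anchors at
each line start) is an assumption.

```python
import re

def match_status(output, pass_expr="0", fail_expr="0"):
    """Return 'ignore', 'ok' or 'FAILED' following the Match table."""
    use_pass = pass_expr not in ("0", "")
    use_fail = fail_expr not in ("0", "")
    if not use_pass and not use_fail:
        return "ignore"
    if use_fail and re.search(fail_expr, output, re.MULTILINE):
        return "FAILED"  # a FailExpr match always fails the test
    if use_pass and not re.search(pass_expr, output, re.MULTILINE):
        return "FAILED"  # PassExpr defined but never seen
    return "ok"

print(match_status("step 1\nSuccess: done\n", "^Success:", "^Failure:"))
# prints ok
```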
KnownFailure
============
The KnownFailure syntax is very expressive to allow for precise definitions of
known problems:
KnownFailure: [mode[,mode...]]; feature-expression; bug desc...
Any given test may have multiple KnownFailure lines, and they are processed in
order - the first matching KnownFailure line (if any) will be the one whose
description is reported for a failure.
The mode list and feature-expression are optional, and default to a match, but
the semicolons must be present. One exception: for backwards-compatibility
this:
KnownFailure: desc...
is accepted as a shorthand for this:
KnownFailure: ; ; desc...
mode:
comma-delimited list of failure modes to which this known failure applies.
This should be set as specific as possible while still capturing all the known
failure modes for the known bug described in the description. Legal values are:
all - all failures
compile-all - all compile-time failures
compile-warning - compilation succeeds as expected, with an unexpected warning
compile-failure - all other forms of compilation failures (errors, unexpected success, etc)
run-all - all runtime failures
run-match - output match failure at runtime
run-crash - crash at runtime
run-time - timeout at runtime
run-mem - memory exhaustion at runtime
run-exit - bad exit code at runtime
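
A KnownFailure value and its mode list could be handled along the
following lines. This Python sketch is illustrative only: the function
names are hypothetical, a value with fewer than two semicolons is
assumed to be the backwards-compatible shorthand, and an empty mode
list is represented as "all".

```python
def parse_known_failure(value):
    """Split 'modes; feature-expression; description' into its parts."""
    parts = [p.strip() for p in value.split(";", 2)]
    if len(parts) < 3:
        # Assumption: fewer than two semicolons means the shorthand form.
        return {"modes": ["all"], "features": "", "desc": value.strip()}
    modes_text, features, desc = parts
    modes = [m.strip() for m in modes_text.split(",") if m.strip()]
    return {"modes": modes or ["all"], "features": features, "desc": desc}

def mode_applies(modes, failure):
    """True if a failure such as 'run-crash' is covered by the mode list."""
    group = failure.split("-")[0] + "-all"  # 'run-all' or 'compile-all'
    return any(m in ("all", failure, group) for m in modes)

# Hypothetical KnownFailure value; the description text is made up.
kf = parse_known_failure("run-crash,run-time; network_lapi && nodebug; hangs at exit")
print(kf["modes"], mode_applies(kf["modes"], "run-time"))
# prints ['run-crash', 'run-time'] True
```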
feature-expression:
boolean expression of compiler features (eg "network_lapi && (cpu_64 ||
os_aix) && nodebug") that must return true for a configuration known to exhibit
the bug described in the description. Consult upcc -version to see the list of
features active for a given configuration.
The token "_threads" expands to the number of UPC threads.
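
To make the boolean feature-expression concrete, here is a small
Python evaluator for expressions built from feature names, "&&", "||"
and parentheses. It is a sketch under assumptions, not the harness's
evaluator: "&&" is assumed to bind tighter than "||", a name is true
when it appears in the active feature set, and other syntax (including
the "_threads" token) is not handled.

```python
import re

TOKEN = re.compile(r"\s*(&&|\|\||\(|\)|[A-Za-z_][A-Za-z0-9_]*)")

def evaluate(expr, features):
    """Evaluate a feature-expression against a set of active features."""
    tokens = TOKEN.findall(expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():
        value = parse_and()
        while peek() == "||":
            take()
            rhs = parse_and()  # always parse the right side to consume it
            value = value or rhs
        return value

    def parse_and():
        value = parse_atom()
        while peek() == "&&":
            take()
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        if peek() == "(":
            take()
            value = parse_or()
            take()  # closing ')'
            return value
        return take() in features

    return parse_or()

expr = "network_lapi && (cpu_64 || os_aix) && nodebug"
print(evaluate(expr, {"network_lapi", "os_aix", "nodebug"}))  # prints True
print(evaluate(expr, {"network_lapi", "nodebug"}))            # prints False
```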
WarningFilter
=============
The WarningFilter syntax allows for specific lines of output from the compile
step to be ignored when detecting compile-warning or compile-failure.
WarningFilter: [feature-expression;] perl regex
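
The effect of a WarningFilter on compile output can be sketched in
Python as below. This is an illustration rather than the harness's
logic: the function name is hypothetical, the optional
feature-expression prefix is not modelled, and the case-insensitive
search for "error" and "warning" is an assumption.

```python
import re

def remaining_diagnostics(compile_output, filters):
    """Return diagnostic lines that no WarningFilter regex matches."""
    kept = []
    for line in compile_output.splitlines():
        if any(re.search(f, line) for f in filters):
            continue  # this line is ignored by a filter
        if re.search(r"error|warning", line, re.IGNORECASE):
            kept.append(line)
    return kept

# Made-up compiler output for illustration.
output = (
    "foo.upc:3: warning: unused variable 'x'\n"
    "foo.upc:9: warning: implicit declaration\n"
    "link step finished\n"
)
print(remaining_diagnostics(output, [r"unused variable"]))
# prints ['foo.upc:9: warning: implicit declaration']
```

If the returned list is empty, the filtered output contains nothing
that would count as a compile-warning or compile-failure under these
assumptions.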