From: Gary Funck (gary_at_intrepid_dot_com)
Date: Thu Feb 12 2009 - 10:50:51 PST
On 02/12/09 12:49:32, Steven D. Vormwald wrote:
> Hello,
>
> Are there any "standard" UPC benchmarks available, preferably with  
> fine-grained (few upc_mem{cpy|get|put} or collective calls) remote  
> memory accesses?  A cursory search of the various University project  
> pages and the wiki hasn't revealed any, but I thought I'd ask before  
> implementing some on my own.
Steven,
A student at MTU, Zhang Zhang, presented some UPC benchmark results
back in 2004/2005:
http://www.upc.mtu.edu/papers/ZhangIPDPS05.pdf
http://upc.gwu.edu/~upc/upcworkshop04/MTU-upcworkshop04.pdf
We looked at his paper and those benchmarks, and noted some
methodological errors.  Notably, a buggy version of the NPB benchmark
suite developed by GWU was used, which skewed results and produced
false indications of failure on various platforms.  This led to
apparent "no shows" by various compilers.
A couple of years ago, we collected UPC benchmarks from various
sources and re-worked them so that they (1) executed enough iterations
to be meaningful on modern hardware, (2) did not print extraneous
output during the timed portion of the benchmark, (3) were run in a
dedicated OS environment (run level 1 on Linux) to avoid timing noise
from normal OS activity, and (4) were run enough times to obtain a
representative timing sample.  We found that all of these steps were
necessary to obtain reasonable timing results.  During that process,
we did not attempt to verify that each benchmark effectively measured
what it set out to measure.  Further, we didn't try to verify that
complex benchmarks (like NPB) produced correct results.
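As a rough illustration of points (1) through (4), a minimal timing
harness might look like the sketch below.  The kernel, array size,
iteration count, and trial count are placeholders, not the values we
actually used:

/* harness.upc: illustrative timing harness only; the kernel and all
 * counts below are placeholder assumptions. */
#include <upc.h>
#include <stdio.h>
#include <sys/time.h>

#define N       4096     /* elements per thread (placeholder)     */
#define ITERS   10000    /* enough iterations to be meaningful    */
#define NTRIALS 10       /* several runs -> representative sample */

shared double data[N * THREADS];   /* default cyclic layout */

static double now(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + 1e-6 * tv.tv_usec;
}

int main(void)
{
    for (int trial = 0; trial < NTRIALS; trial++) {
        upc_barrier;               /* start all threads together */
        double t0 = now();
        for (int it = 0; it < ITERS; it++) {
            /* kernel under test; no output inside the timed region */
            upc_forall (int i = 0; i < N * THREADS; i++; &data[i])
                data[i] += 1.0;
        }
        upc_barrier;               /* wait for the slowest thread */
        double t1 = now();
        if (MYTHREAD == 0)         /* report only after timing ends */
            printf("trial %d: %.6f s\n", trial, t1 - t0);
    }
    return 0;
}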
Although I commend Zhang Zhang for advancing knowledge in the area
of UPC performance, it is unfortunate that a paper with these
methodological errors is the seminal work in the area.
I'd like to see his experiments re-done with the errors corrected,
and run against current compilers and runtime systems.
A procedural recommendation: while developing and selecting
benchmarks and collecting initial results, I'd encourage having
each vendor involved review the results, to ensure that the
compiler was run with appropriate parameters, to give the vendor
the opportunity to fix small errors/bugs, and to verify that the
benchmarks in fact measure the features as intended.
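For concreteness, a minimal sketch of the kind of fine-grained
microbenchmark you describe (many single-element remote reads through
a shared array, rather than a few bulk upc_memget calls) might look
like the following; the table size and access count are illustrative
placeholders:

/* finegrain.upc: each thread issues many single-element reads
 * through the shared array; most land on remote threads.
 * Sizes and counts are placeholder assumptions. */
#include <upc.h>
#include <stdio.h>
#include <stdlib.h>

#define TABSIZE 1024       /* elements per thread (placeholder) */
#define NACCESS 100000     /* fine-grained accesses per thread  */

shared long table[TABSIZE * THREADS];   /* cyclic across threads */

int main(void)
{
    long sum = 0;

    upc_forall (int i = 0; i < TABSIZE * THREADS; i++; &table[i])
        table[i] = i;              /* each thread fills its own elements */
    upc_barrier;

    srandom(MYTHREAD + 1);
    for (int i = 0; i < NACCESS; i++) {
        long idx = random() % (TABSIZE * THREADS);
        sum += table[idx];         /* one-element remote get, not upc_memget */
    }
    upc_barrier;

    if (MYTHREAD == 0)
        printf("checksum %ld\n", sum);  /* keeps the loop from being elided */
    return 0;
}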
- Gary