Fine-grained HPCChallenge RandomAccess (GUPS) in UPC
This directory contains three fine-grained implementations of HPCChallenge
RandomAccess (GUPS) in UPC.
All three versions update remote table entries with fine-grained puts and
gets. On distributed-memory platforms they are therefore NOT expected to be
competitive with tuned implementations of GUPS that explicitly coalesce
communication and perform target-side updates. They are provided solely
as an algorithmic example of fine-grained communication in UPC, NOT as the
best possible or even recommended implementation of HPCChallenge RandomAccess.
The three versions:
* guppie.upc - puts and gets on table entries are performed using language-level
shared array accesses. This version was originally written for the Cray T3E,
whose unique network hardware and custom UPC compiler (with some special flags)
allowed this version to compile to an executable that exposed some communication
overlap at runtime.
UPC compilers on modern systems are likely to compile this version to an
executable that uses fine-grained blocking puts and gets of shared memory, with
no communication overlap in the case of remote data.
* guppie-async.upc - this version performs fine-grained table updates using the
explicitly non-blocking transfer operations upc_mem{put,get}_nbi introduced
as an optional library in UPC spec v1.3.
This explicitly exposes communication-communication overlap between the
individual fine-grained gets and puts performed in each chunk of updates.
The data access pattern is otherwise unchanged from guppie.upc.
* guppie-async-pipeline.upc - this version goes a step further by additionally
using software pipelining to schedule the asynchronous communication, in
order to overlap computation with communication and reduce stalls waiting for
asynchronous communication completion.
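
The chunked get/update/put structure that the asynchronous versions are
described as using can be sketched in plain serial C. This is NOT the code in
this directory: it is a single-process stand-in, assuming the usual HPCC
RandomAccess update rule (XOR a pseudo-random value into the table entry it
selects). The names gups_next, gups_update_chunked and GUPS_CHUNK are
hypothetical. The comments mark where a UPC version would presumably issue
upc_memget_nbi / upc_memput_nbi and wait for their completion; no software
pipelining is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GUPS_CHUNK 8  /* hypothetical batch size, not taken from the sources */

/* One step of the HPCC RandomAccess pseudo-random stream (polynomial 0x7). */
uint64_t gups_next(uint64_t ran)
{
    return (ran << 1) ^ ((ran >> 63) ? UINT64_C(7) : UINT64_C(0));
}

/* Apply nupdate updates to table (tabsize must be a power of two),
 * batching them in chunks of at most GUPS_CHUNK entries. */
void gups_update_chunked(uint64_t *table, size_t tabsize,
                         uint64_t nupdate, uint64_t seed)
{
    uint64_t val[GUPS_CHUNK], buf[GUPS_CHUNK];
    size_t idx[GUPS_CHUNK];
    uint64_t ran = seed;
    uint64_t done = 0;

    assert(tabsize != 0 && (tabsize & (tabsize - 1)) == 0);

    while (done < nupdate) {
        uint64_t left = nupdate - done;
        size_t n = left < GUPS_CHUNK ? (size_t)left : GUPS_CHUNK;
        size_t j;

        /* Generate the random values and target indices for this chunk. */
        for (j = 0; j < n; j++) {
            ran = gups_next(ran);
            val[j] = ran;
            idx[j] = (size_t)(ran & (uint64_t)(tabsize - 1));
        }

        /* Get phase: a UPC version would start one non-blocking get per
         * entry here, then wait for all of them before using buf[]. */
        for (j = 0; j < n; j++)
            buf[j] = table[idx[j]];

        /* Local computation on the fetched copies. */
        for (j = 0; j < n; j++)
            buf[j] ^= val[j];

        /* Put phase: a UPC version would start one non-blocking put per
         * entry here and wait for completion before reusing buf[].
         * If two indices in a chunk coincide, the later put overwrites
         * the earlier one, so that update is lost. */
        for (j = 0; j < n; j++)
            table[idx[j]] = buf[j];

        done += n;
    }
}
```

In this sketch the three inner loops are where overlap would come from: all
gets of a chunk can be in flight together, and likewise all puts. A pipelined
variant would additionally start the gets for the next chunk before finishing
the current one.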
Currently all versions also use a naive, serial verification step, which won't
scale to large thread counts.
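
A serial verification of this kind can be sketched as follows, assuming the
table is initialized with table[i] = i and updated by XOR as in the standard
HPCC RandomAccess kernel. Because XOR is its own inverse, replaying the same
update stream one more time on a single thread should restore every entry,
and any entry that differs from its index counts as an error. The names
verify_step and verify_serial are hypothetical, and the actual check in these
sources may differ in detail.

```c
#include <stddef.h>
#include <stdint.h>

/* One step of the HPCC RandomAccess pseudo-random stream (polynomial 0x7). */
uint64_t verify_step(uint64_t ran)
{
    return (ran << 1) ^ ((ran >> 63) ? UINT64_C(7) : UINT64_C(0));
}

/* Replay nupdate updates serially on table (tabsize a power of two),
 * then return the number of entries that do not equal their index.
 * The whole loop runs on one thread, which is why this approach does
 * not scale: its cost grows with the total update count. */
uint64_t verify_serial(uint64_t *table, size_t tabsize,
                       uint64_t nupdate, uint64_t seed)
{
    uint64_t ran = seed;
    uint64_t errors = 0;
    uint64_t i;
    size_t k;

    /* Re-apply every update; a correct table returns to table[k] == k. */
    for (i = 0; i < nupdate; i++) {
        ran = verify_step(ran);
        table[ran & (uint64_t)(tabsize - 1)] ^= ran;
    }

    /* Count the entries that were not restored. */
    for (k = 0; k < tabsize; k++) {
        if (table[k] != (uint64_t)k)
            errors++;
    }
    return errors;
}
```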