Manual Reference Pages  - UPCC (1)

NAME

upcc - the Berkeley UPC compiler, version 2022.10.0

SYNOPSIS

upcc [options] foo.upc [ bar.c someobject.o ... ]

DESCRIPTION

upcc is the front-end to the Berkeley UPC compiler suite. It handles all stages of the UPC compilation process: 1) initial preprocessing, 2) UPC-to-C translation, 3) back-end C compilation, and 4) linking.

upcc has the same interface as a traditional C compiler, plus some additional flags for UPC-specific options. Default options to be used for every compilation can be specified in the UPCC_FLAGS environment variable, and in the upcc configuration files (see below).

    Standard C compiler options:

-c Compile source files, but do not link.
-DFOO[=bar]
  Define preprocessor symbol FOO [to optional value].
-E Preprocess source files (output sent to stdout).
-g Generate debug objects/executables
-I path Add path to directories searched for header files.
-lfoo Link executable with libfoo.a.
-Ldir Add ’dir’ to library search path.
-O Generate optimized objects/executables. Does *not* enable experimental translator optimizations.
-opt Enable EXPERIMENTAL UPC translator optimizations
-o name Output file will be called ’name’.
-pg Generate OS-specific sequential performance profiling information in the executable (on supported platforms)
-s Strip the symbolic information from the final executable.
-UBAR Undefine preprocessor symbol BAR.

    Multiconf options:

-show-confs
  Show the multiconf variations which are installed
(availability of following options varies based on configure-time decisions)
-bupc Use the Berkeley UPC translator [default]
-gupc Use the Intrepid GUPC translator
-cupc Use the Intrepid Clang UPC translator
-cupc2c Use the clang-upc2c translator
-g Enable system-wide debugging symbols and assertions
-trace Enable communication tracing & statistics for use with upc_trace
-inst Enable GASP-compliant instrumentation for 3rd party performance tools

    UPC-related options:

-network=<type>
  Set the network API used for communication. Valid types include:
mpi, udp, smp, ibv, aries, ofi, ucx
Run ’upcc -version’ to see which are available in this installation, and which is the default.
-shared-heap=NUM
  Specify the default amount (per UPC thread) of shared memory. NUM is in megabytes unless a unit is given: use ’1GB’ for 1 gigabyte. Can be overridden at startup via the UPC_SHARED_HEAP_SIZE environment variable.
-T=NUM Generate code for a fixed number NUM of UPC threads. This allows optimization of certain operations (such as pointer-to-shared arithmetic), especially when NUM is a power of 2. The disgusting syntax -f(upc-)threads-NUM is also accepted, for compatibility with other UPC compilers.
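
To see why a fixed power-of-two thread count can help, consider how an element index of a cyclically distributed array maps to an owning thread and a local offset. The plain C sketch below is an illustration only; it is not the code the translator generates, and the helper names (owner_dynamic, owner_static_pow2) are hypothetical.

```c
#include <stddef.h>

/* Illustration only: index mapping for a cyclic (block size 1) layout.
 * These helpers are hypothetical and are not part of Berkeley UPC. */

/* Thread count known only at run time: each index computation
 * needs a division and a modulo. */
void owner_dynamic(size_t i, size_t threads, size_t *thread, size_t *offset)
{
    *thread = i % threads;
    *offset = i / threads;
}

/* Thread count fixed at compile time to 8 (as with -T=8):
 * the same mapping reduces to a mask and a shift. */
enum { LOG2_THREADS = 3, STATIC_THREADS = 1 << LOG2_THREADS };

void owner_static_pow2(size_t i, size_t *thread, size_t *offset)
{
    *thread = i & (STATIC_THREADS - 1);
    *offset = i >> LOG2_THREADS;
}
```

Both functions compute the same mapping when the run-time thread count is 8; only the cost of the arithmetic differs.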

    General options:

-h -? -help Print this message.
-conf=FILE
  Read FILE instead of $HOME/.upccrc configuration file.
-norc Do not read $HOME/.upccrc configuration file. This can also be achieved by setting the UPCC_NORC environment variable. Overrides -conf.
-smart-output
  Output file name will be auto-generated based on first .c/.upc/.o file on command line (ignored if -o passed).
-V -version
  Show version information.
-v Verbose: display programs invoked by compiler.
-vv Extra verbose: pass verbose flag to programs invoked.

    Advanced options:

-allow-deprecated
  Disable warnings for use of deprecated bupc_ functions.
-[no]checks
  Turn off build consistency checking. Caveat nerdtor...
-compress=NUM
  Specify a gzip compression level for the HTTP netcompile data stream, from 0 (off) to 9 (best). Higher values may speed compilation over slow links, at an increase in CPU cost.
-echo-var VAR
  Print value for VAR used by the internal upcc Makefile framework (for internal use only)
-extern-main
  Use if main() is declared in a non-UPC object or library.
-[no]fast-symptr
  Use fast symmetric pointers for power-of-two static threads. (Available only for ’-network=smp -pthreads’) If available, on by default if -T passed a power-of-two value.
-opt-enable=OPT1[,OPT2]
-opt-disable=OPT3[,OPT4]
  Selectively enable/disable specified optimizations in the BUPC UPC-to-C translator. See translator documentation for the available optimizations.
-http-proxy=URL
  Set an HTTP proxy to use for HTTP netcompile, overriding the http_proxy setting in the configuration file.
-[no]lines
  Insert line directives for original UPC code into translated C code (if applicable). On by default.
-[no]link-cache
  Disable the use of the pthread-link cache directory used to speed up linking of multi-file pthread applications.
-link-with <PROG>
  Use PROG as the back-end linker. Use to combine UPC code with external C++ and/or MPI code.
-nightly Use nightly build of UPC-to-C translator at http://upc-translator.lbl.gov/upcc-nightly.cgi
-nopthreads
  Alias for -pthreads=0.
-print-include-dir
  Prints full path to the directory in which <bupc_extern.h> is located.
-print-mpicc
  Prints full pathname of an MPI compiler compatible with this installation of upcc, or error if MPI not supported.
-inst-local
-inst-functions
  Used internally by GASP performance tool wrapper scripts. End-users should not normally need these options.
-trace Request that the compilation fail unless support for the upc_trace utility is available.
-pthreads[=N]
  Generate a pthreaded UPC executable, and optionally set the default number of pthreads per process (which can be overridden at startup via the UPC_PTHREADS_PER_PROC or UPC_PTHREADS_MAP environment variables). A value of N=0 disables creation of a pthreaded executable.
-[no]require-size
  Die at startup if the amount of shared memory available is less than requested: off by default. Can be overridden at startup by setting the UPC_REQUIRE_SHARED_SIZE environment variable to ’yes’ or ’no’.
-[no]save-temps
  Save ’interesting’ temporary files (.i, .trans.c, .o) generated during translation/compilation.
-[no]save-all-temps
  Save all files used during translation/compilation. Most are placed in a ’{target}_temps’ subdirectory.
-show-sizes
  Show the internally used platform sizes file
-[no]size-warn
  Warn at startup if the amount of shared memory available is less than requested: on by default. Can be overridden at startup by setting the UPC_SIZE_WARN environment variable to ’yes’ or ’no’.
-stable Use latest ’stable’ build of UPC-to-C translator at http://upc-translator.lbl.gov/upcc-stable.cgi
-trans Stop after translating UPC to C (outputs ’foo.trans.c’).
-translator=<path>
  Use the UPC-to-C translator at <path>, which is formatted identically to the ’translator’ conf-file option.
-uses-mpi MPI interoperability support. Pass at compile-time if a UPC file contains calls to MPI functions. Pass at link time if any objects (including libraries) use MPI.
-W?,<option>
  Pass an arbitrary option directly to a specific phase (determined by the character replacing ‘?’ as listed below) when invoked by the compiler driver. Use repeatedly to pass multiple options. If you need to use spaces, quote the option (ex: -Wp,"--option value"). Commas after the first DO NOT break the argument into multiple options as with some other compilers.
Supported phases:
-Wp,<option>
  Pass an arbitrary option to the UPC preprocessor.
-Wu,<option>
  Pass an arbitrary option to the first (or only) phase of source-to-source translation.
-Ww,<option>
  Pass an arbitrary option to the second (or only) phase of source-to-source translation.
-Wc,<option>
  Pass an arbitrary option to the compiler phase (the one generating object code from either UPC source or from translated C source).
-Wl,<option>
  Pass an arbitrary option to the linker. NOTE: In most configurations the "linker" will be a C compiler, not ld or its equivalent. So, to truly pass options to the system linker you need to get them past the C compiler first. For instance to pass "--foo bar" to ld when gcc is the C compiler, you will need
-Wl,-Wl,--foo,bar
Conventions for passing linker arguments through other C compilers will vary.
-yesterday|-hier
  Use yesterday’s UPC-to-C translator at http://upc-translator.lbl.gov/upcc-yesterday.cgi
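
The -W?,<option> rule described above (the character after -W selects the phase, and only the first comma separates the phase from the option) can be sketched in plain C as follows. This is a minimal illustration under those stated rules, not upcc's actual parser; split_w_flag is a hypothetical helper name.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of the documented -W?,<option> rule. split_w_flag() is a
 * hypothetical helper, not upcc's implementation.
 * Returns 0 on success, -1 if arg is not of the form -W?,<option>
 * with ? being one of the phases listed in this page (p, u, w, c, l). */
int split_w_flag(const char *arg, char *phase, const char **option)
{
    if (strncmp(arg, "-W", 2) != 0 || arg[2] == '\0' || arg[3] != ',')
        return -1;
    if (strchr("puwcl", arg[2]) == NULL)
        return -1;
    *phase = arg[2];
    *option = arg + 4;   /* everything after the first comma, later commas included */
    return 0;
}
```

Given "-Wl,-Wl,--foo,bar", the sketch reports phase 'l' and the single option "-Wl,--foo,bar", which matches the linker example above.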

UPC FILE EXTENSIONS

upcc recognizes both ‘.c’ and ‘.upc’ as valid file name extensions for UPC code. Header files may have either a ‘.uph’ or ‘.h’ extension.

‘.trans.c’ is recognized specially as a file which has already been translated (via a previous call to ‘upcc -trans’). upcc passes ‘.trans.c’ files directly to the C compiler/linker.

REGULAR C FILES/OBJECTS/LIBRARIES

Berkeley UPC is fully interoperable with regular C source, object, and library files. You may pass regular C files to upcc, include regular .h files in your UPC code, and link C-based libraries and object files into a UPC application.

CONFIGURATION FILES

upcc uses a site-wide ‘upcc.conf’ file to get some of its settings. You may override any of the settings found in the global ‘upcc.conf’ file with a user configuration file: ‘.upccrc’ in your $HOME directory. Alternatively, you may pass ‘-conf=FILE’ to specify a user configuration file to be read in place of this default.

The user configuration file may look something like

var1 = value
var2 = value2
[section1]
var3 = value2
[section2]
var3 = value4

When the file is processed, every assignment before the first ‘[...]’ line is processed. Later assignments are processed only if the section name matches the library configuration selected by various compiler options such as ‘-g’, ‘-O’, ‘-gupc’ and ‘-inst’ (run ‘upcc -show-confs’ to list enabled configurations). The ‘[...]’ lines are typically of the form ‘[<config>]’ where ‘<config>’ is one of the library configurations. However, the section names are interpreted as perl regular expressions, allowing for instance ‘[.*_gupc]’ to define a section that will apply to both the ‘dbg_gupc’ and ‘opt_gupc’ configurations.
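
The section-matching rule can be illustrated with a small C sketch. Two assumptions are made here that the text above does not state: the pattern is required to match the whole configuration name, and POSIX extended regular expressions stand in for Perl regular expressions (the two agree for simple patterns such as ‘.*_gupc’). The function name section_applies is hypothetical.

```c
#include <regex.h>
#include <stdio.h>

/* Sketch: does a ‘[pattern]’ section apply to a configuration name?
 * Assumptions (not taken from upcc): whole-name match, POSIX ERE syntax.
 * Returns 1 on match, 0 on no match, -1 on an invalid or overlong pattern. */
int section_applies(const char *pattern, const char *config)
{
    char anchored[256];
    regex_t re;
    int n, rc;

    n = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
    if (n < 0 || (size_t)n >= sizeof anchored)
        return -1;
    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, config, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

Under these assumptions, section_applies(".*_gupc", "dbg_gupc") and section_applies(".*_gupc", "opt_gupc") both report a match, while a plain ‘opt’ configuration does not match that pattern.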

To choose a different default network (a.k.a. conduit) for your programs:

default_network=<conduit>

Where supported values will be a site-specific subset of: mpi, udp, smp, ibv, ucx, ofi, aries.

To specify flags to pass to upcc each time it is invoked, set ‘default_options’:

default_options = -save-all-temps -v -DFOO=bar

To specify flags to pass to upcc each time it is invoked for a specific network (a.k.a. conduit), set ‘<conduit>_options’:

mpi_options = -v -DUSING_MPI=1

To override the default amount of shared memory (per UPC thread) to be used by your UPC applications:

shared_heap = 128MB # or ‘2GB’, etc.
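
As an illustration of how such size values read, the following C sketch converts a size string to megabytes. It handles only the forms that appear in this page (a bare number, ‘MB’ and ‘GB’) and assumes 1 GB = 1024 MB; the set of suffixes upcc actually accepts may be larger, and shared_heap_megabytes is a hypothetical name.

```c
#include <stdlib.h>
#include <string.h>

/* Sketch: convert "128", "128MB" or "2GB" to a number of megabytes.
 * Assumptions: only MB and GB suffixes, 1 GB = 1024 MB.
 * Returns -1 for anything else. */
long shared_heap_megabytes(const char *text)
{
    char *end;
    long value = strtol(text, &end, 10);

    if (end == text || value <= 0)
        return -1;                 /* no digits, or a non-positive size */
    if (*end == '\0' || strcmp(end, "MB") == 0)
        return value;              /* bare numbers are taken as megabytes */
    if (strcmp(end, "GB") == 0)
        return value * 1024;
    return -1;
}
```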

To use a different UPC-to-C translator:

translator = /path/to/translator # local translator
translator = http://foo.org/upcc.cgi # remote via HTTP
translator = foo.org:/path/to/translator # remote via SSH

See the Berkeley UPC User’s Guide for more information on using a remote translator.

To have upcc use the basename of the first file argument for the executable name (i.e., ‘upcc foo.upc bar.upc’ will produce ‘foo’ instead of ‘a.out’):

smart_output = yes #or put -smart-output in ‘default_options’

ENVIRONMENT VARIABLES

The UPCC_FLAGS environment variable can be set to pass any flags/arguments that you wish to use for every invocation of upcc. This is in addition to the ‘default_options’ parameter described above.

OPTION PROCESSING

Options are read from the site-wide and user-specific configuration files, the environment and the command-line. The precedence of options is equivalent to parsing the options in the following order:

default_options <conduit>_options UPCC_FLAGS command-line

For options which set a value (such as -T and -shared-heap), the last value seen is the one used. Thus values on the command-line always take precedence over any others.
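
The "last value seen" rule can be sketched in C by scanning one flat argument list assembled in the order given above (default_options, <conduit>_options, UPCC_FLAGS, command-line). This is a minimal model of the documented behavior, not upcc's code; last_value is a hypothetical helper.

```c
#include <stddef.h>
#include <string.h>

/* Sketch: args[] holds all options in precedence order, lowest first.
 * Returns the value of the last argument starting with prefix
 * (for example "-T="), or NULL if no argument has that prefix. */
const char *last_value(const char *const *args, size_t n, const char *prefix)
{
    const char *value = NULL;
    size_t len = strlen(prefix);

    for (size_t i = 0; i < n; i++) {
        if (strncmp(args[i], prefix, len) == 0)
            value = args[i] + len;   /* later occurrences replace earlier ones */
    }
    return value;
}
```

For the list { "-T=4", "-v", "-T=8", "-T=16" }, where the final entry stands for the command line, last_value(args, 4, "-T=") yields "16".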

The ‘default_options’ and ‘<conduit>_options’ are taken from your user configuration file (see CONFIGURATION FILES, above) if present there, or from the site-wide upcc.conf otherwise. If a given setting is present in both files only the settings in the user file are used; they are not additive. However, passing the option -norc on the command-line or setting the UPCC_NORC environment variable will disable reading of the user configuration file, causing ‘default_options’ and ‘<conduit>_options’ to be taken only from the site-wide upcc.conf.

Arguments in ‘default_options’, ‘<conduit>_options’ and UPCC_FLAGS are split on whitespace, but single- or double-quotes will suppress splitting. The backslash character ‘\’ does not have any special meaning.
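
The splitting rule can be modeled with a short C tokenizer. The sketch assumes that the quote characters themselves are removed from the resulting arguments (the text above only says that quotes suppress splitting); split_options and MAX_ARG_LEN are hypothetical names, and over-long arguments are silently truncated to keep the example small.

```c
#include <stddef.h>

#define MAX_ARG_LEN 128

/* Sketch: split s on whitespace. Single or double quotes suppress
 * splitting and are dropped (an assumption); a backslash is copied
 * literally, since it has no special meaning.
 * Returns the number of arguments stored in out (at most max_args). */
size_t split_options(const char *s, char out[][MAX_ARG_LEN], size_t max_args)
{
    size_t count = 0, len = 0;
    char quote = '\0';   /* active quote character, or 0 when unquoted */
    int in_arg = 0;

    for (;; s++) {
        char c = *s;
        int is_space = (c == ' ' || c == '\t' || c == '\n');

        if (c == '\0' || (is_space && quote == '\0')) {
            if (in_arg) {                       /* finish current argument */
                if (count < max_args) {
                    out[count][len] = '\0';
                    count++;
                }
                len = 0;
                in_arg = 0;
            }
            if (c == '\0')
                break;
        } else if (quote == '\0' && (c == '\'' || c == '"')) {
            quote = c;                          /* open a quoted region */
            in_arg = 1;
        } else if (c == quote) {
            quote = '\0';                       /* close the quoted region */
        } else {
            if (count < max_args && len + 1 < MAX_ARG_LEN)
                out[count][len++] = c;
            in_arg = 1;
        }
    }
    return count;
}
```

With this model, the string -v -DMSG="hello world" yields two arguments, -v and -DMSG=hello world.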

To avoid ambiguity ‘-network=FOO’ is not allowed in the ‘<conduit>_options’ settings. The options ‘-norc’ and ‘-conf=FILE’ are only permitted on the command-line. However, the effect of ‘-norc’ can also be achieved by setting UPCC_NORC.

Due to limitations in upcc and the tools it invokes, the following characters may not appear in any argument that denotes a file or directory name:


(){}[]<>"`'$|%^?:;!#&*\

Upcc is able to deal with whitespace characters in directory names, but not in file names. Additionally, some of the tools upcc relies on (e.g. some back-end MPI compilers) may not handle spaces in directory names (e.g. in arguments to -I and -L). Therefore, use of whitespace in file names is prohibited, and use in directory names is strongly discouraged.
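
A build script could pre-check names against these restrictions before invoking upcc. The C sketch below encodes the character list above and the whitespace rule; path_is_acceptable is a hypothetical helper and is not something upcc provides.

```c
#include <string.h>

/* The characters listed above as not permitted in file or directory names. */
static const char FORBIDDEN[] = "(){}[]<>\"`'$|%^?:;!#&*\\";

/* Sketch: returns 1 if the name avoids the forbidden characters and,
 * for file names (is_directory == 0), also contains no whitespace.
 * Whitespace in directory names is tolerated here, although it is
 * strongly discouraged. */
int path_is_acceptable(const char *name, int is_directory)
{
    if (strpbrk(name, FORBIDDEN) != NULL)
        return 0;
    if (!is_directory && strpbrk(name, " \t\n") != NULL)
        return 0;
    return 1;
}
```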

CONTROLLING OPTIMIZATIONS

    Optimization Related Options

The BUPC translator supports several UPC-specific optimizations. The upcc driver provides the -opt command line option to enable a default set of optimizations. In addition, optimizations can be individually enabled/disabled using the -opt-enable and -opt-disable driver options:

-opt-enable=LIST
-opt-disable=LIST

These commands take the form shown above, where LIST is a comma separated list of individual optimization names (enumerated below). Example:

-opt-disable=pre-add

The optimizations currently supported by BUPC are:

split-phase
pre-add
ptr-coalesce
ptr-locality
forall-opt
msg-vect

Invoking upcc with the -opt option will enable by default: pre-add and ptr-coalesce. Note, however, that these defaults may change in a future release.

The BUPC translator allows per-function control over optimizations using pragmas. For example,

#pragma bupc noopt
void F() {}

will disable any optimization during the compilation of F(), regardless of the upcc command line arguments. The upcc command line arguments will determine the level of optimization for all other functions present in the same file as F().

    Optimization Passes

o split-phase
o pre-add

These two optimizations enable program transformations for pointer based codes written with fine grained memory accesses. Consider this sample code:

shared int *p;
int x,i;
...
x = p[i];

Without optimizations, the dereference p[i] is performed using a blocking communication call and no overlap is exploited. The split-phase optimization enables a transformation pass that generates non-blocking communication calls and moves the initiation of each operation as far as possible from its completion. This optimization is designed to increase the amount of overlap present in the application.

Pointer arithmetic on pointers to shared (PTS) is an expensive operation. The pre-add optimization attempts to reduce the number of PTS arithmetic operations at runtime by performing a partial redundancy elimination transformation. This optimization is useful for both pointer and array based codes written in a fine grained style.

Note that the pre-add optimization performs speculative code motion and it might result in code that will fail runtime assertions when using versions of the UPC runtime library built with debug options. Versions built with optimized runtime libraries will perform correctly.

For more information on these optimizations see: ‘‘Communication Optimizations for Fine-grained UPC Applications’’ W. Chen, C. Iancu, K. Yelick. 14th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2005.

o ptr-coalesce

This optimization is beneficial for pointer based programs using aggregate data types. Given

struct S {
int x;
int y;
};
...
shared struct S *p;
int x;
int y;
...
x = p->x;
y = p->y;

The unoptimized code will perform two network transfers. The ptr-coalesce optimization detects accesses to contiguous fields within an aggregate data type and efficiently transfers the data using a single communication operation.
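
The effect can be modeled in plain C by counting simulated transfers. In the sketch below, remote_get is a hypothetical stand-in for a network read (here just a memcpy plus a counter); it is not a Berkeley UPC runtime call, and the code is a conceptual model rather than the translator's output.

```c
#include <stddef.h>
#include <string.h>

struct S {
    int x;
    int y;
};

int transfer_count = 0;   /* number of simulated network transfers */

/* Hypothetical stand-in for a remote read. */
static void remote_get(void *dst, const void *src, size_t nbytes)
{
    memcpy(dst, src, nbytes);
    transfer_count++;
}

/* Unoptimized form: one transfer per field access. */
void read_fields_separately(const struct S *p, int *x, int *y)
{
    remote_get(x, &p->x, sizeof p->x);
    remote_get(y, &p->y, sizeof p->y);
}

/* Coalesced form: fetch the contiguous fields once, unpack locally. */
void read_fields_coalesced(const struct S *p, int *x, int *y)
{
    struct S tmp;

    remote_get(&tmp, p, sizeof tmp);
    *x = tmp.x;
    *y = tmp.y;
}
```

Both functions return the same field values; the first performs two simulated transfers and the second performs one.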

o ptr-locality

This enables an intra-procedural analysis able to replace accesses using PTS with accesses using ‘C’ pointers within one function. The transformation is designed to facilitate data initialization and associates calls to upc_alloc() (memory allocation with local affinity) to the uses of the returned pointer. The transformed program will benefit from lower PTS arithmetic overhead and faster data access. This optimization has to be explicitly requested using the -opt-enable option.

o forall-opt

Given a forall loop

upc_forall(...; ...; ...; aff)

this optimization analyzes the affinity expression (aff), determines the iterations of the loop with local affinity and generates efficient serial code without affinity tests. This optimization has to be explicitly requested using the -opt-enable option.
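
As a conceptual model, take a loop equivalent to upc_forall(i = 0; i < n; i++; i), where the integer affinity expression i assigns iteration i to thread i % THREADS. The plain C sketch below passes the thread identity and thread count as ordinary parameters (mythread, threads) so that it compiles without UPC; the function names are hypothetical, and the code is not the translator's actual output.

```c
/* Unoptimized model: each thread walks every iteration and tests
 * whether the iteration has affinity to it. */
long sum_with_affinity_test(const int *a, int n, int mythread, int threads)
{
    long sum = 0;

    for (int i = 0; i < n; i++) {
        if (i % threads == mythread)   /* affinity test on every iteration */
            sum += a[i];
    }
    return sum;
}

/* Optimized model: visit only the local iterations, with no test. */
long sum_local_iterations(const int *a, int n, int mythread, int threads)
{
    long sum = 0;

    for (int i = mythread; i < n; i += threads)
        sum += a[i];
    return sum;
}
```

For any thread, both loops visit the same iterations; the second does so without evaluating the affinity test.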

o msg-vect

This option enables an alpha release of a loop optimization package designed for array based programs. This feature is EXPERIMENTAL and has to be explicitly requested using the -opt-enable option. For a more detailed description see:

‘‘Performance Portable Optimizations for Loops Containing Communication Operations’’ C. Iancu, W. Chen, K. Yelick. International Conference on Supercomputing 2008 (ICS 2008).

For a given loop nest, the optimization detects the memory regions accessed through PTS and generates efficient code for the transfer of the remote data. There are two types of code transformations: strip-mining transformations and detection of ‘scatter-gather’ operations.

The optimizations use a combination of compile time and runtime analysis. The runtime analysis allows user customization and is implemented in the upcr_trans_extra_vect.{c,h} files in the libupc directory of the BUPC translator installation. These files are recompiled and linked against the user application at any upcc invocation.

Vectorization for a given loop nest can be requested using pragmas, i.e.

#pragma bupc ivdep
for(i = 0; ...; ...) { }

The performance models in upcr_trans_extra_vect.c have been tuned for several systems, identified by their name. For unknown systems the defaults are: no strip mining and scatter-gather (VIS) code generation. Note that the GASNet implementation of VIS operations can be controlled using the GASNET_VIS_AMPIPE environment variable.

When message vectorization is requested with -opt-enable=msg-vect, the BUPC translator is able to print a report of the attempted transformations and their success. This functionality is enabled by passing -Wu,-Wb,-trace-msg-vect, as in this example:

upcc -opt-enable=msg-vect -Wu,"-Wb,-trace-msg-vect" ...

A MAKEFILE EXAMPLE

The following small Makefile shows how you might handle the .upc extension if you use ‘make’ to build your programs:

_______________________________________________________

# A simple Makefile for building UPC programs

TARGET = foobar
UPCC = upcc
UPCFLAGS = -g

UPC_OBJS = foo.o bar.o

.SUFFIXES:
.SUFFIXES: .upc .o

# suffix rule for compiling .upc files
.upc.o:
	$(UPCC) -c $(UPCFLAGS) -o $@ $<

$(TARGET): $(UPC_OBJS)
	$(UPCC) $(UPCFLAGS) -o $(TARGET) $(UPC_OBJS)
_______________________________________________________

$ make
upcc -c -g -o foo.o foo.upc
upcc -c -g -o bar.o bar.upc
upcc -g -o foobar foo.o bar.o

Alternatively, if you use the .c extension for your UPC files, you can simply set the CC variable in your Makefile (or your shell environment) to ‘upcc’, after which the regular make rules for C files will handle your UPC files (and the standard CFLAGS variable can be set to pass upcc options).

STATIC VS DYNAMIC THREADS

When invoked with the ‘-T’ option, upcc will generate an executable (or object file if ‘-c’ is also passed) that can only be run with a fixed thread count (the argument to ‘-T’). Such a compilation is defined by the UPC specification as taking place in a ‘static THREADS’ environment, as compared to a ‘dynamic THREADS’ environment when ‘-T’ is not used. As an extension, Berkeley UPC allows linking of objects compiled with dynamic THREADS with those compiled with static THREADS. The result is always an executable able to run only with the fixed thread count of the static THREADS object file(s).

Use of dynamic THREADS objects in this manner provides the flexibility to compile a single object from UPC sources for use with multiple possible THREADS values, while still allowing the remainder of one’s code to be compiled for static THREADS with its potential for more aggressive compiler optimizations. In this way dynamic THREADS objects can be used much like libraries. However, one should be aware of certain limitations:

o This practice may not be supported by other UPC compilers.
o The dynamic THREADS object(s) will not include any optimizations that might be available exclusively in a static THREADS environment.
o The rules regarding use of THREADS in variable declarations in a dynamic THREADS environment do not change when linking with static THREADS objects.
o The terms ‘static’ and ‘dynamic’ used with respect to UPC THREADS have no relation to the terms ’static library’ and ’dynamic library’.

REPORTING BUGS

We are very interested in fixing any bugs in our UPC implementation. For bug reporting instructions, please go to https://upc.lbl.gov.

SEE ALSO

upcrun(1), upc_trace(1)

The Berkeley UPC User’s Guide (available at https://upc.lbl.gov)


Berkeley UPC UPCC (1) October 2022