[DESCRIPTION]
upcc is the front-end to the Berkeley UPC compiler suite. It handles all stages of the UPC compilation process: 1) initial preprocessing, 2) UPC-to-C translation, 3) back-end C compilation, and 4) linking. upcc has the same interface as a traditional C compiler, plus some additional flags for UPC-specific options. Default options to be used for every compilation can be specified in the UPCC_FLAGS environment variable, and in the upcc configuration files (see below).

[UPC FILE EXTENSIONS]
upcc recognizes both `.c' and `.upc' as valid file name extensions for UPC code. Header files may have either a `.uph' or `.h' extension. `.trans.c' is recognized specially as a file which has already been translated (via a previous call to `upcc -trans'). upcc passes `.trans.c' files directly to the C compiler/linker.

[REGULAR C FILES/OBJECTS/LIBRARIES]
Berkeley UPC is fully interoperable with regular C source, object, and library files. You may pass regular C files to upcc, include regular .h files in your UPC code, and link C-based libraries and object files into a UPC application.

[CONFIGURATION FILES]
upcc uses a site-wide `upcc.conf' file to get some of its settings. You may override any of the settings found in the global `upcc.conf' file with a user configuration file: `.upccrc' in your $HOME directory. Alternatively, you may pass `-conf=FILE' to specify a user configuration file to be read in place of this default.

The user configuration file may look something like

    var1 = value
    var2 = value2
    [section1]
    var3 = value2
    [section2]
    var3 = value4

When the file is processed, every assignment before the first `[...]' line is processed. Later assignments are processed only if the section name matches the library configuration selected by various compiler options such as `-g', `-O', `-gupc' and `-inst' (run `upcc -show-confs' to list enabled configurations). The `[...]' lines are typically of the form `[]' where `' is one of the library configurations.
However, the section names are interpreted as Perl regular expressions, allowing for instance `[.*_gupc]' to define a section that will apply to both the `dbg_gupc' and `opt_gupc' configurations.

To choose a different default network (a.k.a. conduit) for your programs:

    default_network=

Where supported values will be a site-specific subset of: mpi, udp, smp, ibv, ucx, ofi, aries.

To specify flags to pass to upcc each time it is invoked, set `default_options':

    default_options = -save-all-temps -v -DFOO=bar

To specify flags to pass to upcc each time it is invoked for a specific network (a.k.a. conduit), set `_options':

    mpi_options = -v -DUSING_MPI=1

To override the default amount of shared memory (per UPC thread) to be used by your UPC applications:

    shared_heap = 128MB   # or `2GB', etc.

To use a different UPC-to-C translator:

    translator = /path/to/translator          # local translator
    translator = http://foo.org/upcc.cgi      # remote via HTTP
    translator = foo.org:/path/to/translator  # remote via SSH

See the Berkeley UPC User's Guide for more information on using a remote translator.

To have upcc use the basename of the first file argument for the executable name (i.e., `upcc foo.upc bar.upc' will produce `foo' instead of `a.out'):

    smart_output = yes   # or put -smart-output in `default_options'

[ENVIRONMENT VARIABLES]
The UPCC_FLAGS environment variable can be set to pass any flags/arguments that you wish to use for every invocation of upcc. This is in addition to the `default_options' parameter described above.

[OPTION PROCESSING]
Options are read from the site-wide and user-specific configuration files, the environment and the command-line. The precedence of options is equivalent to parsing the options in the following order:

    default_options
    _options
    UPCC_FLAGS
    command-line

For options which set a value (such as -T and -shared-heap), the last value seen is the one used. Thus values on the command-line always take precedence over any others.
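The "last value seen wins" rule can be pictured with a short Python sketch. This is only an illustration of the documented ordering, not upcc's actual parser: the helper name `last_value', the sample settings, and the `-shared-heap=VALUE' spelling are assumptions made for the example.

```python
def last_value(option, *sources):
    """Return the last value given for `option`, scanning sources in order."""
    value = None
    for args in sources:
        for arg in args:
            name, sep, val = arg.partition("=")
            if sep and name == option:
                value = val  # a later occurrence replaces an earlier one
    return value


# Hypothetical settings, listed in the documented parsing order.
default_options = ["-v", "-shared-heap=64MB"]   # from a configuration file
mpi_options = ["-shared-heap=128MB"]            # network-specific options
upcc_flags = ["-shared-heap=256MB"]             # contents of UPCC_FLAGS
command_line = ["-shared-heap=1GB", "foo.upc"]  # arguments typed by the user

print(last_value("-shared-heap",
                 default_options, mpi_options, upcc_flags, command_line))
```

Because the command-line is scanned last, its value replaces everything set earlier; dropping it from the call leaves the UPCC_FLAGS value in effect.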
The `default_options' and `_options' are taken from your user configuration file (see CONFIGURATION FILES, above) if present there, or from the site-wide upcc.conf otherwise. If a given setting is present in both files only the settings in the user file are used; they are not additive. However, passing the option -norc on the command-line or setting the UPCC_NORC environment variable will disable reading of the user configuration file, causing `default_options' and `_options' to be taken only from the site-wide upcc.conf.

Arguments in `default_options', `_options' and UPCC_FLAGS are split on whitespace, but single- or double-quotes will suppress splitting. The backslash character `\\' does not have any special meaning.

To avoid ambiguity `-network=FOO' is not allowed in the `_options' settings.

The options `-norc' and `-conf=FILE' are only permitted on the command-line. However, the effect of `-norc' can also be achieved by setting UPCC_NORC.

Due to limitations in upcc and the tools it invokes, the following characters may not appear in any argument that denotes a file or directory name:
.RS
(){}[]<>"\`\'$|%^?:;!#&*\\
.RE
Upcc is able to deal with whitespace characters in directory names, but not in file names. Additionally, some of the tools upcc relies on (e.g. some back-end MPI compilers) may not handle spaces in directory names (e.g. in arguments to -I and -L). Therefore, use of whitespace in file names is prohibited, and use in directory names is strongly discouraged.

[CONTROLLING OPTIMIZATIONS]
.SS "Optimization Related Options"
The BUPC translator supports several UPC-specific optimizations. The upcc driver provides the
.B -opt
command line option to enable a default set of optimizations.
In addition, optimizations can be individually enabled/disabled using the .B -opt-enable and .B -opt-disable driver options: .RS .B -opt-enable=LIST .br .B -opt-disable=LIST .RE These commands take the form shown above, where .I LIST is a comma separated list of individual optimization names (enumerated below). Example: .RS .B -opt-disable=pre-add .RE The optimizations currently supported by BUPC are: .RS split-phase .br pre-add .br ptr-coalesce .br ptr-locality .br forall-opt .br msg-vect .RE Invoking upcc with the .B -opt option will enable by default: .B pre-add and .BR ptr-coalesce . Note, however, that these defaults may change in a future release. The BUPC translator allows a per function control over optimizations using pragmas. For example .RS #pragma bupc noopt .br void F() {} .RE will disable any optimization during the compilation of .BR F() , regardless of the upcc command line arguments. The upcc command line arguments will determine the level of optimization for all other functions present in the same file as .BR F() . .SS "Optimization Passes" .B \(bu split-phase .br .B \(bu pre-add .br These two optimizations enable program transformations for pointer based codes written with fine grained memory accesses. Consider this sample code: .RS shared int *p; .br int x,i; .br \.\.\. .br x = p[i]; .RE Without optimizations, the dereference .B p[i] will be performed using a blocking communication call and no overlap is exploited. The .B split-phase optimization enables a transformation pass that generates non-blocking communication calls and moves as far apart as possible in the program the initiation of the operation from its completion. This optimization is designed to increase the amount of overlap present in the application. Pointer arithmetic on pointers to shared (PTS) is an expensive operation. The .B pre-add optimization attempts to reduce the number of PTS arithmetic operations at runtime by performing a partial redundancy elimination transformation. 
This optimization is useful for both pointer and array based codes written in a fine grained style. Note that the .B pre-add optimization performs speculative code motion and it might result in code that will fail runtime assertions when using versions of the UPC runtime library built with debug options. Versions built with optimized runtime libraries will perform correctly. For more information on these optimizations see: ``Communication Optimizations for Fine-grained UPC Applications'' W. Chen, C. Iancu, K. Yelick. .I 14th International Conference on Parallel Architectures and Compilation Techniques .IR (PACT) , 2005. .B \(bu ptr-coalesce This optimization is beneficial for pointer based programs using aggregate data types. Given .RS struct S { .br int x; .br int y; .br }; .br \.\.\. .br shared struct S *p; .br int x; .br int y; .br \.\.\. .br x = p->x; .br y = p->y; .RE The unoptimized code will perform two network transfers. The .B ptr-coalesce optimization detects accesses to contiguous fields within an aggregate data type and efficiently transfers the data using a single communication operation. .B \(bu ptr-locality This enables an intra-procedural analysis able to replace accesses using PTS with accesses using `C' pointers within one function. The transformation is designed to facilitate data initialization and associates calls to .BR upc_alloc () (memory allocation with local affinity) to the uses of the returned pointer. The transformed program will benefit from lower PTS arithmetic overhead and faster data access. This optimization has to be explicitly requested using the .B -opt-enable option. .B \(bu forall-opt Given a forall loop .RS upc_forall(...; ...; ...; .IR aff ) .RE this optimization analyzes the affinity expression .RI ( aff ), determines the iterations of the loop with local affinity and generates efficient serial code without affinity tests. This optimization has to be explicitly requested using the .B -opt-enable option. 
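The effect described for forall-opt can be pictured with a small Python simulation. It is a sketch under stated assumptions, not the translator's generated code: it assumes the common case of an integer affinity expression `i', for which iteration i belongs to thread i modulo THREADS, and the function names and the values of N and THREADS are hypothetical.

```python
def with_affinity_test(n, threads, mythread):
    """Iterations run by one thread when every iteration tests its affinity."""
    executed = []
    for i in range(n):
        if i % threads == mythread:  # per-iteration affinity test
            executed.append(i)
    return executed


def without_affinity_test(n, threads, mythread):
    """Same iterations, enumerated directly by a strided serial loop."""
    return list(range(mythread, n, threads))


N, THREADS = 10, 4  # illustrative values only
for t in range(THREADS):
    print(t, with_affinity_test(N, THREADS, t), without_affinity_test(N, THREADS, t))
```

Both functions visit the same iterations for each simulated thread; the second does so without evaluating an affinity test on every pass through the loop.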
.B \(bu msg-vect This option enables an alpha release of a loop optimization package designed for array based programs. This feature is EXPERIMENTAL and has to be explicitly requested using the .B -opt-enable option. For a more detailed description see: ``Performance Portable Optimizations for Loops Containing Communication Operations'' C. Iancu, W. Chen, K. Yelick. .I International Conference on Supercomputing 2008 (ICS 2008). For a given loop nest, the optimization detects the memory regions accessed through PTS and generates efficient code for the transfer of the remote data. There are two types of code transformations: strip-mining transformations and detection of `scatter-gather' operations. The optimizations use a combination of compile time and runtime analysis. The runtime analysis allows user customization and is implemented in the upcr_trans_extra_vect.{c,h} files in the libupc directory of the BUPC translator installation. These files are recompiled and linked against the user application on every upcc invocation. Vectorization for a given loop nest can be requested using pragmas, i.e. .RS #pragma bupc ivdep .br for(i = 0; ...; ...) { } .RE The performance models in upcr_trans_extra_vect.c have been tuned for several systems, identified by name. For unknown systems the defaults are: no strip mining and scatter-gather (VIS) code generation. Note that the GASNet implementation of VIS operations can be controlled using the GASNET_VIS_AMPIPE environment variable. When message vectorization is requested with .BR -opt-enable=msg-vect , the BUPC translator is able to print a report of the attempted transformations and their success. This functionality is enabled by passing .B -Wu,"-Wb,-trace-msg-vect" as in this example: .RS upcc -opt-enable=msg-vect -Wu,"-Wb,-trace-msg-vect" ... 
.RE

[A MAKEFILE EXAMPLE]
The following small Makefile shows how you might handle the .upc extension if you use `make' to build your programs:
_______________________________________________________
# A simple Makefile for building UPC programs

TARGET = foobar
UPCC = upcc
UPCFLAGS = -g
UPC_OBJS = foo.o bar.o

.SUFFIXES:
.SUFFIXES: .upc .o

# suffix rule for compiling .upc files
.upc.o:
	$(UPCC) -c $(UPCFLAGS) -o $@ $<

$(TARGET): $(UPC_OBJS)
	$(UPCC) $(UPCFLAGS) -o $(TARGET) $(UPC_OBJS)
_______________________________________________________
$ make
upcc -c -g -o foo.o foo.upc
upcc -c -g -o bar.o bar.upc
upcc -g -o foobar foo.o bar.o

Alternatively, if you use the .c extension for your UPC files, you can simply set the CC variable in your Makefile (or your shell environment) to `upcc', after which the regular make rules for C files will handle your UPC files (and the standard CFLAGS variable can be set to pass upcc options).

[STATIC VS DYNAMIC THREADS]
When invoked with the `-T' option, upcc will generate an executable (or object file if `-c' is also passed) that can only be run with a fixed thread count (the argument to `-T'). Such a compilation is defined by the UPC specification as taking place in a `static THREADS' environment, as compared to a `dynamic THREADS' environment when `-T' is not used.

As an extension, Berkeley UPC allows linking of objects compiled with dynamic THREADS with those compiled with static THREADS. The result is always an executable able to run only with the fixed thread count of the static THREADS object file(s). Use of dynamic THREADS objects in this manner provides the flexibility to compile a single object from UPC sources for use with multiple possible THREADS values, while still allowing the remainder of one's code to be compiled for static THREADS with its potential for more aggressive compiler optimizations. In this way dynamic THREADS objects can be used much like libraries.
However, one should be aware of certain limitations: .RS .B \(bu This practice may not be supported by other UPC compilers. .br .B \(bu The dynamic THREADS object(s) will not include any optimizations that might be available exclusively in a static THREADS environment. .br .B \(bu The rules regarding use of THREADS in variable declarations in a dynamic THREADS environment do not change when linking with static THREADS objects. .br .B \(bu The terms `static' and `dynamic' used with respect to UPC THREADS have no relation to the terms 'static library' and 'dynamic library'. .RE [REPORTING BUGS] We are very interested in fixing any bugs in our UPC implementation. For bug reporting instructions, please go to https://upc.lbl.gov. [SEE ALSO] upcrun(1), upc_trace(1) The Berkeley UPC User's Guide (available at https://upc.lbl.gov)