Re: thread safety of UPC memory copy library functions

From: Lingyuan Wang (lennyhpc_at_gmail_dot_com)
Date: Thu Mar 11 2010 - 18:02:57 PST

    Thanks Paul, that was really helpful.
    I tried -pthreads=1 but, as expected, it does not work. I have not yet had a
    chance to try the "tv" libraries. For now I will simply serialize the UPC
    runtime calls, and I will keep you updated on any progress.
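
A minimal sketch of what serializing the runtime calls could look like, assuming a single process-wide mutex around every copy. The names runtime_copy, serialized_copy, and run_serialized_demo are hypothetical, and plain memcpy() stands in for a UPC routine such as upc_memput() so the sketch builds without a UPC compiler. Whether a lock alone is enough for the Berkeley UPC runtime is not established in this thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { NWORKERS = 4, SLICE = 256 };

/* One process-wide lock guarding every call into the runtime. */
static pthread_mutex_t runtime_lock = PTHREAD_MUTEX_INITIALIZER;

static char src_buf[NWORKERS * SLICE];
static char dst_buf[NWORKERS * SLICE];

/* Hypothetical stand-in for a UPC copy routine such as upc_memput();
 * memcpy() is used so the sketch compiles with an ordinary C compiler. */
static void runtime_copy(void *dst, const void *src, size_t nbytes) {
    memcpy(dst, src, nbytes);
}

/* Worker pthreads call this wrapper instead of the runtime directly,
 * so at most one thread is inside the runtime at any time. */
static void serialized_copy(void *dst, const void *src, size_t nbytes) {
    pthread_mutex_lock(&runtime_lock);
    runtime_copy(dst, src, nbytes);
    pthread_mutex_unlock(&runtime_lock);
}

/* Each worker transfers its own locally sliced piece of the data. */
static void *worker(void *arg) {
    size_t id = (size_t)(uintptr_t)arg;
    serialized_copy(dst_buf + id * SLICE, src_buf + id * SLICE, SLICE);
    return NULL;
}

/* Returns 1 if every slice arrived intact, 0 otherwise. */
static int run_serialized_demo(void) {
    pthread_t tid[NWORKERS];
    size_t created = 0;

    for (size_t i = 0; i < sizeof src_buf; i++)
        src_buf[i] = (char)(i % 127);
    memset(dst_buf, 0, sizeof dst_buf);

    for (; created < NWORKERS; created++)
        if (pthread_create(&tid[created], NULL, worker,
                           (void *)(uintptr_t)created) != 0)
            break;
    for (size_t i = 0; i < created; i++)
        pthread_join(tid[i], NULL);

    return created == NWORKERS &&
           memcmp(dst_buf, src_buf, sizeof src_buf) == 0;
}
```

Calling run_serialized_demo() from main() exercises the wrapper with four workers. The lock removes the parallelism in the communication phase, which is the cost accepted here in exchange for never entering the runtime from two threads at once.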
    I mentioned a "single snd/rcv buffer per process" because I was concerned
    that multiple sub-threads within a single process might cause contention
    when calling communication functions in parallel, as in the case of the
    native pthread conduit in UPC.
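
The funneled model described in the quoted reply below (analogous to MPI_THREAD_FUNNELED) can be sketched as a request queue: worker pthreads only enqueue copy requests, and one representative thread performs every runtime call. This is an illustration under assumptions, not Berkeley UPC code. The copy_req type and all function names are hypothetical, and memcpy() again stands in for the UPC copy routine.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { NWORKERS = 4, SLICE = 128 };

/* Hypothetical description of one pending copy. */
typedef struct { void *dst; const void *src; size_t nbytes; } copy_req;

static copy_req queue[NWORKERS];   /* each worker submits exactly one request */
static int queued = 0;             /* number of requests submitted so far */
static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t qcond = PTHREAD_COND_INITIALIZER;

static char src_buf[NWORKERS * SLICE];
static char dst_buf[NWORKERS * SLICE];

/* Hypothetical stand-in for a UPC copy routine such as upc_memput(). */
static void runtime_copy(void *dst, const void *src, size_t nbytes) {
    memcpy(dst, src, nbytes);
}

/* Workers never touch the runtime; they only describe the transfer. */
static void *worker(void *arg) {
    size_t id = (size_t)(uintptr_t)arg;
    copy_req r = { dst_buf + id * SLICE, src_buf + id * SLICE, SLICE };
    pthread_mutex_lock(&qlock);
    queue[queued++] = r;
    pthread_cond_signal(&qcond);
    pthread_mutex_unlock(&qlock);
    return NULL;
}

/* The caller acts as the single representative thread that makes all
 * runtime calls. Returns 1 if every slice was copied intact. */
static int run_funneled_demo(void) {
    pthread_t tid[NWORKERS];
    int created = 0;

    for (size_t i = 0; i < sizeof src_buf; i++)
        src_buf[i] = (char)(i % 127);
    memset(dst_buf, 0, sizeof dst_buf);
    queued = 0;

    for (; created < NWORKERS; created++)
        if (pthread_create(&tid[created], NULL, worker,
                           (void *)(uintptr_t)created) != 0)
            break;

    /* Drain one request per successfully created worker. */
    for (int served = 0; served < created; served++) {
        pthread_mutex_lock(&qlock);
        while (queued == served)
            pthread_cond_wait(&qcond, &qlock);
        copy_req r = queue[served];
        pthread_mutex_unlock(&qlock);
        runtime_copy(r.dst, r.src, r.nbytes);  /* only this thread calls it */
    }

    for (int i = 0; i < created; i++)
        pthread_join(tid[i], NULL);

    return created == NWORKERS &&
           memcmp(dst_buf, src_buf, sizeof src_buf) == 0;
}
```

Compared with a lock around each call, this keeps every runtime call on one thread, which matters if the runtime depends on state that exists only for the thread it created.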
    On Wed, Mar 10, 2010 at 7:49 PM, Paul H. Hargrove <PHHargrove_at_lbl_dot_gov> wrote:
    > Lingyuan,
    >  The only efforts I am aware of to layer threads on top of UPC have been
    > analogous to MPI_THREAD_FUNNELED, in which one is permitted to have
    > multiple threads running but only one representative thread is permitted to
    > call (even implicitly) any code in the UPC runtime library.  If anybody
    > on this list has had success with any other way of mixing threads with
    > UPC, I'd be interested to learn about it.
    >  The Berkeley UPC runtime and GASNet communication runtime libraries are
    > built in both thread-safe ("par") and non thread-safe ("seq")
    > configurations.  If one passes -pthreads to the upcc command, then the
    > thread-safe version is linked, and otherwise the non thread-safe version is
    > used.  My initial guess is that your segmentation fault is related to use of
    > the non thread-safe libraries.  That can be resolved by passing -pthreads=1
    > to upcc.  This will ensure that the thread-safe libraries are linked, but
    > will run with one UPC thread per process.  However, that is probably not
    > enough... read on.
    >  I cannot be certain that a hybrid UPC+pthreads program as you describe
    > will work.  The reason I am uncertain is that thread-safe versions of both
    > the UPC and GASNet runtime libraries make use of thread-specific data.  For
    > instance if you try to reference the UPC built-in "MYTHREAD" from a pthread
    > that you have spawned it will almost certainly fail because in the
    > thread-safe library it is implemented via thread-specific data that has only
    > been allocated/initialized for the thread(s) that the UPC runtime has
    > spawned.  So, it appears you would require a runtime configuration that
    > provides for the thread-safe invocation of UPC built-ins but without
    > assigning each thread an individual UPC-level identity.  Unfortunately, we
    > have not implemented such a configuration.  I don't have any immediate
    > estimate of what effort would be required, but for our support of the
    > TotalView debugger we did implement a mode in which there is a single UPC
    > thread per process plus a non-UPC thread to ensure remote accesses could
    > progress even when the debugger had frozen the UPC thread.  That work means
    > that within the UPC runtime there is already a separation of thread-safety
    > into two distinct parts:
    >  UPCRI_SUPPORT_PTHREADS - the UPC runtime is thread-safe (to at least the
    > extent needed for the TotalView support) and calls thread-safe GASNet
    >  UPCRI_UPC_PTHREADS - UPC threads are implemented as multiple pthreads per
    > process.
    > I suspect, but cannot verify, that linking in the libraries intended for
    > TotalView support will get you most of the way to what you appear to need.
    >  For instance, I believe (with 90%+ certainty) that MYTHREAD will not
    > utilize thread-specific data in the "tv" version of libupcr.  If you are
    > lucky the "tv" libraries might be sufficient for what you want.  If I (and
    > my peers here in Berkeley) were not so busy right now with several
    > deadlines rapidly approaching, I'd be interested in helping you at least
    > conduct the experiment to see if the "tv" libraries work for you.  Having
    > support for the sort of hybrid programming you describe would be interesting
    > to us.  So, I hope you can keep us up-to-date on any progress you make.
    > I am not certain what you are referring to when you say the IB driver only
    > allows one snd/rcv buffer per process.  We support up to
    > GASNET_NETWORKDEPTH_PP IB-level operations in flight to a given peer
    > (default 64), or GASNET_NETWORKDEPTH_TOTAL operations outstanding total
    > (default is computed from HCA resource limits), and will stall waiting for
    > at least one outstanding operation to complete when either limit is reached.
    > -Paul
    > Lingyuan Wang wrote:
    >> Greetings,
    >> I am concerned about the thread safety of the UPC memory copy library
    >> functions: is it safe to call those routines in parallel from
    >> multiple threads?
    >> I use a multi-threading layer of Pthreads on top of each UPC thread. At
    >> some points in the program I need to do all-to-all communication among all
    >> UPC threads. I would like to call UPC memory copy functions directly from my
    >> pool of Pthreads in parallel, since the data is sliced locally (and it would
    >> be less efficient to pack the data for a bulk synchronized communication).
    >> However, I get a segmentation fault when I run more than one thread per
    >> thread pool, while the code works fine with a single thread per pool.
    >> I am using the IBV conduit of Berkeley UPC 2.10, with PSHM enabled. I am
    >> aware of the fact that the InfiniBand driver allows one send/receive buffer
    >> per process. My further questions regarding thread safety are: is it a
    >> network driver/conduit-specific issue, or a general GASNet/BUPC runtime
    >> restriction? As I cannot find any thread safety definition in the UPC
    >> spec, would it be possible to support it in the future? And how does the UPC
    >> native Pthread conduit handle it? Thanks in advance.
    >> --
    >> Regards
    > --
    > Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    > Future Technologies Group                 Tel: +1-510-495-2352
    > HPC Research Department                   Fax: +1-510-486-6900
    > Lawrence Berkeley National Laboratory
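
The point in the quoted reply about thread-specific data can be illustrated with plain POSIX threads. This is an analogy, not the actual Berkeley UPC implementation of MYTHREAD. The key id_key and the function names are hypothetical. The sketch shows that a value stored under a pthread key by one thread is simply absent (NULL) in a thread that never initialized it, which is why a built-in backed by such data could fail in a user-spawned pthread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-thread "identity" slot, standing in for the kind of
 * thread-specific data a thread-safe runtime might keep. */
static pthread_key_t id_key;
static int initializer_identity = 0;

/* Runs in a thread that was never given an identity. */
static void *foreign_thread(void *arg) {
    int *has_identity = arg;
    /* POSIX guarantees a new thread starts with NULL for every key. */
    *has_identity = (pthread_getspecific(id_key) != NULL);
    return NULL;
}

/* Returns 1 if the initializing thread sees its own identity while the
 * separately spawned thread sees none, 0 otherwise. */
static int run_tsd_demo(void) {
    pthread_t tid;
    int foreign_has_identity = -1;
    int own_ok;

    if (pthread_key_create(&id_key, NULL) != 0)
        return 0;

    /* Only this thread is "registered", much like a runtime-created thread. */
    if (pthread_setspecific(id_key, &initializer_identity) != 0) {
        pthread_key_delete(id_key);
        return 0;
    }

    if (pthread_create(&tid, NULL, foreign_thread,
                       &foreign_has_identity) != 0) {
        pthread_key_delete(id_key);
        return 0;
    }
    pthread_join(tid, NULL);

    own_ok = (pthread_getspecific(id_key) == &initializer_identity);
    pthread_key_delete(id_key);

    return own_ok && foreign_has_identity == 0;
}
```

Code that dereferences such a pointer without checking it would crash in the foreign thread, which is consistent with, though not proof of, the segmentation fault reported at the start of this thread.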
