Re: thread safety of UPC memory copy library functions

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Wed Mar 10 2010 - 16:49:47 PST

  • Next message: Lingyuan Wang: "Re: thread safety of UPC memory copy library functions"
      The only efforts I am aware of to layer threads on top of UPC have 
    been analogous to MPI_THREAD_FUNNELED, in which one is permitted to 
    have multiple threads running but only one representative thread is 
    permitted to call (even implicitly) any code in the UPC runtime 
    library.  If anybody on this list has had success with other ways of 
    mixing threads with UPC, I'd be interested to learn about it.
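    The funneled arrangement described above can be sketched in plain C 
    with pthreads.  This is only an illustration under stated assumptions: 
    the request queue, the names copy_req_t, worker, funneled_flush and 
    run_funneled_demo are all hypothetical, and memcpy() stands in for the 
    point where a UPC program would call upc_memput().  Worker threads only 
    describe the copies they want; a single representative thread issues them.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define NWORKERS 4
#define CHUNK    8

/* One pending copy request (hypothetical type, not part of any UPC API). */
typedef struct {
    const char *src;
    char       *dst;
    size_t      len;
} copy_req_t;

static copy_req_t      queue[NWORKERS];
static int             queued = 0;
static pthread_mutex_t qlock  = PTHREAD_MUTEX_INITIALIZER;

static char src_buf[NWORKERS][CHUNK];
static char dst_buf[NWORKERS][CHUNK];   /* stand-in for remote shared memory */

/* Workers fill their slice and enqueue a request; they never communicate. */
static void *worker(void *arg)
{
    int id = (int)(long)arg;
    memset(src_buf[id], 'a' + id, CHUNK);

    pthread_mutex_lock(&qlock);
    assert(queued < NWORKERS);
    queue[queued].src = src_buf[id];
    queue[queued].dst = dst_buf[id];
    queue[queued].len = CHUNK;
    queued++;
    pthread_mutex_unlock(&qlock);
    return NULL;
}

/* Called by the representative thread only; returns the number of copies. */
static int funneled_flush(void)
{
    int i, n;
    pthread_mutex_lock(&qlock);
    n = queued;
    for (i = 0; i < n; i++) {
        /* In a UPC program, upc_memput() would be called here instead. */
        memcpy(queue[i].dst, queue[i].src, queue[i].len);
    }
    queued = 0;
    pthread_mutex_unlock(&qlock);
    return n;
}

/* Runs the workers, funnels their requests through the calling thread,
 * and returns how many destination chunks match their source. */
int run_funneled_demo(void)
{
    pthread_t t[NWORKERS];
    long i;
    int ok = 0;

    memset(dst_buf, 0, sizeof dst_buf);
    for (i = 0; i < NWORKERS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (i = 0; i < NWORKERS; i++)
        pthread_join(t[i], NULL);

    funneled_flush();
    for (i = 0; i < NWORKERS; i++)
        if (memcmp(dst_buf[i], src_buf[i], CHUNK) == 0)
            ok++;
    return ok;
}
```

    The cost of this design is that the copies are serialized through one 
    thread, which is exactly the restriction the original question was 
    hoping to avoid.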
      The Berkeley UPC runtime and GASNet communication runtime libraries 
    are built in both thread-safe ("par") and non-thread-safe ("seq") 
    configurations.  If one passes -pthreads to the upcc command, then the 
    thread-safe version is linked; otherwise the non-thread-safe version 
    is used.  My initial guess is that your segmentation fault is related to 
    use of the non-thread-safe libraries.  That can be resolved by passing 
    -pthreads=1 to upcc.  This will ensure that the thread-safe libraries 
    are linked, but will run with one UPC thread per process.  However, that 
    is probably not enough... read on.
       I cannot be certain that a hybrid UPC+pthreads program as you 
    describe will work.  The reason I am uncertain is that thread-safe 
    versions of both the UPC and GASNet runtime libraries make use of 
    thread-specific data.  For instance, if you try to reference the UPC 
    built-in "MYTHREAD" from a pthread that you have spawned it will almost 
    certainly fail because in the thread-safe library it is implemented via 
    thread-specific data that has only been allocated/initialized for the 
    thread(s) that the UPC runtime has spawned.  So, it appears you would 
    require a runtime configuration that provides for the thread-safe 
    invocation of UPC built-ins but without assigning each thread an 
    individual UPC-level identity.  Unfortunately, we have not implemented 
    such a configuration.  I don't have any immediate estimate of what 
    effort would be required, but for our support of the TotalView debugger 
    we did implement a mode in which there is a single UPC thread per 
    process plus a non-UPC thread to ensure remote accesses could progress 
    even when the debugger had frozen the UPC thread.  That work means that 
    within the UPC runtime there is already a separation of thread-safety 
    into two distinct parts:
       UPCRI_SUPPORT_PTHREADS - the UPC runtime is thread-safe (to at least 
    the extent needed for the TotalView support) and calls thread-safe GASNet
       UPCRI_UPC_PTHREADS - UPC threads are implemented as multiple pthreads 
    per process.
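    To illustrate why a built-in backed by thread-specific data fails 
    from a thread the runtime did not create, here is a small pthreads 
    sketch.  It is not the Berkeley UPC implementation: rank_key, 
    runtime_register_thread and my_rank are hypothetical names, and 
    my_rank() is only an analogue of reading MYTHREAD.  The sketch returns 
    -1 for an unregistered thread; a real runtime could just as easily 
    dereference the missing pointer and crash.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_key_t  rank_key;                 /* per-thread identity slot */
static pthread_once_t rank_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    pthread_key_create(&rank_key, NULL);
}

/* What a runtime would do for each thread that it spawns itself. */
static void runtime_register_thread(int *rank)
{
    pthread_once(&rank_once, make_key);
    pthread_setspecific(rank_key, rank);
}

/* Analogue of reading MYTHREAD: -1 if this thread was never registered. */
static int my_rank(void)
{
    int *p;
    pthread_once(&rank_once, make_key);
    p = pthread_getspecific(rank_key);   /* NULL on an unregistered thread */
    return p ? *p : -1;
}

/* A thread created by the application rather than by the runtime. */
static void *foreign_thread(void *out)
{
    *(int *)out = my_rank();
    return NULL;
}

/* Registers the calling thread with rank 3, then reports what a
 * user-spawned pthread sees when it asks for its rank. */
int rank_seen_by_foreign_thread(void)
{
    static int rank = 3;
    int seen = 0;
    pthread_t t;

    runtime_register_thread(&rank);
    assert(my_rank() == 3);              /* the registered thread is fine */

    pthread_create(&t, NULL, foreign_thread, &seen);
    pthread_join(t, NULL);
    return seen;
}
```

    The registered thread sees its own value, while the user-spawned 
    thread finds the slot empty, because each thread has its own copy of 
    the slot and only the runtime's threads were ever initialized.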
    I suspect, but cannot verify, that linking in the libraries intended for 
    TotalView support will get you most of the way to what you appear to 
    need.  For instance, I believe (with 90%+ certainty) that MYTHREAD will 
    not utilize thread-specific data in the "tv" version of libupcr.    If 
    you are lucky the "tv" libraries might be sufficient for what you want.  
    If I (and my peers here in Berkeley) were not so busy right now with 
    several deadlines rapidly approaching, I'd be interested in helping you 
    at least conduct the experiment to see if the "tv" libraries work for 
    you.  Having support for the sort of hybrid programming you describe 
    would be interesting to us.  So, I hope you can keep us up-to-date on 
    any progress you make.
    I am not certain what you are referring to when you say the IB driver 
    only allows one snd/rcv buffer per process.  We support up to 
    GASNET_NETWORKDEPTH_PP IB-level operations in flight to a given peer 
    (default 64), or GASNET_NETWORKDEPTH_TOTAL operations outstanding total 
    (default is computed from HCA resource limits), and will stall waiting 
    for at least one outstanding operation to complete when either limit is 
    reached.
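    The two limits just described (a per-peer depth and a total depth) 
    amount to simple counting-based flow control.  The following is a 
    minimal sketch of that idea only, not GASNet code: flow_t, 
    flow_try_issue and flow_complete are hypothetical names, and a real 
    conduit would wait for a completion where this sketch returns 0.

```c
#include <assert.h>

#define NPEERS 4

typedef struct {
    int depth_pp;               /* max operations in flight to one peer */
    int depth_total;            /* max operations outstanding overall   */
    int inflight_pp[NPEERS];
    int inflight_total;
} flow_t;

void flow_init(flow_t *f, int depth_pp, int depth_total)
{
    int i;
    f->depth_pp = depth_pp;
    f->depth_total = depth_total;
    f->inflight_total = 0;
    for (i = 0; i < NPEERS; i++)
        f->inflight_pp[i] = 0;
}

/* Returns 1 and records the operation if both limits allow it;
 * returns 0 if the caller must first wait for a completion. */
int flow_try_issue(flow_t *f, int peer)
{
    assert(peer >= 0 && peer < NPEERS);
    if (f->inflight_pp[peer] >= f->depth_pp)
        return 0;
    if (f->inflight_total >= f->depth_total)
        return 0;
    f->inflight_pp[peer]++;
    f->inflight_total++;
    return 1;
}

/* Records that one operation to the given peer has completed. */
void flow_complete(flow_t *f, int peer)
{
    assert(peer >= 0 && peer < NPEERS);
    assert(f->inflight_pp[peer] > 0);
    f->inflight_pp[peer]--;
    f->inflight_total--;
}
```

    With a per-peer depth of 2 and a total depth of 3, a third operation 
    to the same peer is refused, and so is a fourth operation to any peer, 
    until something completes.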
    Lingyuan Wang wrote:
    > Greetings, 
    > I am concerned about the thread safety of UPC memory copy library 
    > functions, whether it is safe to call those routines in parallel from 
    > multiple threads?
    > I use a multi-threading layer of Pthread on top of each UPC thread, at 
    > some points of the program I need to do all-to-all communications 
    > among all UPC threads. I am looking to call UPC memory copy functions 
    > directly from my pool of Pthreads in parallel, since the data is 
    > sliced locally (and it would be less efficient to pack the data for a 
    > bulk synchronized communication). However, I got segmentation fault 
    > when run more than one thread per thread pool, and the code works fine 
    > for single thread per pool cases. 
    > I am using the IBV conduit of Berkeley UPC 2.10, with PSHM enabled. I 
    > am aware the fact that InfiniBand driver allows one send/receive 
    > buffer per process. My further questions regarding the thread safety 
    > are, is it a network driver/conduit specific issue, or a general 
    > gasnet/BUPC runtime restriction? As I can not find any thread safety 
    > definition from the UPC spec, would it be possible to support it 
    > potentially? And how does the UPC native Pthread conduit handle it? 
    > Thanks in advance.
    > -- 
    > Regards
    Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    Future Technologies Group                 Tel: +1-510-495-2352
    HPC Research Department                   Fax: +1-510-486-6900
    Lawrence Berkeley National Laboratory     
