From: Lingyuan Wang (lennyhpc_at_gmail_dot_com)
Date: Thu Mar 11 2010 - 18:02:57 PST
Thanks Paul, that was really helpful.
I tried -pthreads=1, but as expected it does not work. I have not yet had a
chance to try the "tv" libraries. For now I will simply serialize the UPC
runtime calls, and I will keep you updated if anything comes up.
I referred to a "single snd/rcv buffer per process" because I was concerned
that multiple sub-threads within a single process may cause contention when
they call communication functions in parallel, as in the case of the native
pthread conduit in UPC.
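Serializing the UPC runtime calls, as mentioned above, could look roughly
like the following. This is only a sketch of the funneled model under stated
assumptions: worker pthreads never touch the runtime themselves but queue
their copy requests, and the one thread that UPC started performs them. All
names (copy_req_t, copy_queue_t, drain_requests, run_demo) are hypothetical,
and memcpy() stands in for upc_memput() so that it builds with a plain C
compiler; it has not been tested against Berkeley UPC.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* One pending copy. In a real UPC program dst would be a pointer-to-shared
 * and the copy a upc_memput(); memcpy() is a stand-in (assumption). */
typedef struct { void *dst; const void *src; size_t nbytes; } copy_req_t;

#define QUEUE_CAP 64

typedef struct {
    copy_req_t items[QUEUE_CAP];
    size_t head, tail, count;
    int producers;                 /* workers that may still push */
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
} copy_queue_t;

static void queue_init(copy_queue_t *q, int producers) {
    q->head = q->tail = q->count = 0;
    q->producers = producers;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Called by worker pthreads: hand a copy to the representative thread. */
static void queue_push(copy_queue_t *q, copy_req_t r) {
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[q->tail] = r;
    q->tail = (q->tail + 1) % QUEUE_CAP;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Called once by each worker when it has no more requests. */
static void queue_producer_done(copy_queue_t *q) {
    pthread_mutex_lock(&q->lock);
    assert(q->producers > 0);
    if (--q->producers == 0)
        pthread_cond_broadcast(&q->not_empty);  /* let the drainer exit */
    pthread_mutex_unlock(&q->lock);
}

/* Runs only on the representative thread. Returns copies performed. */
static size_t drain_requests(copy_queue_t *q) {
    size_t done = 0;
    for (;;) {
        pthread_mutex_lock(&q->lock);
        while (q->count == 0 && q->producers > 0)
            pthread_cond_wait(&q->not_empty, &q->lock);
        if (q->count == 0) { pthread_mutex_unlock(&q->lock); return done; }
        copy_req_t r = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        pthread_cond_signal(&q->not_full);
        pthread_mutex_unlock(&q->lock);
        memcpy(r.dst, r.src, r.nbytes);   /* stand-in for upc_memput() */
        done++;
    }
}

enum { NWORKERS = 4, SLICES = 8, SLICE_BYTES = 16,
       TOTAL = NWORKERS * SLICES * SLICE_BYTES };

typedef struct { copy_queue_t *q; const char *src; char *dst; int id; } worker_arg_t;

/* Each worker owns SLICES slices and only enqueues them. */
static void *worker(void *p) {
    worker_arg_t *a = p;
    for (int s = 0; s < SLICES; s++) {
        size_t off = ((size_t)a->id * SLICES + (size_t)s) * SLICE_BYTES;
        copy_req_t r = { a->dst + off, a->src + off, SLICE_BYTES };
        queue_push(a->q, r);
    }
    queue_producer_done(a->q);
    return NULL;
}

/* Returns 1 if every slice was copied by the single draining thread. */
int run_demo(void) {
    static char src[TOTAL], dst[TOTAL];
    copy_queue_t q;
    pthread_t tid[NWORKERS];
    worker_arg_t args[NWORKERS];

    for (int i = 0; i < TOTAL; i++) src[i] = (char)(i % 127 + 1);
    memset(dst, 0, sizeof dst);
    queue_init(&q, NWORKERS);
    for (int i = 0; i < NWORKERS; i++) {
        args[i] = (worker_arg_t){ &q, src, dst, i };
        pthread_create(&tid[i], NULL, worker, &args[i]);
    }
    size_t done = drain_requests(&q);   /* the only thread that copies */
    for (int i = 0; i < NWORKERS; i++) pthread_join(tid[i], NULL);
    return done == (size_t)NWORKERS * SLICES && memcmp(src, dst, sizeof src) == 0;
}
```

In an actual UPC program, drain_requests() would run on the thread the UPC
runtime created, with the memcpy() replaced by the real copy call. This gives
up issuing copies in parallel in exchange for keeping every runtime call on
one thread.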
On Wed, Mar 10, 2010 at 7:49 PM, Paul H. Hargrove <PHHargrove_at_lbl_dot_gov> wrote:
> Lingyuan,
>
> The only efforts I am aware of to layer threads on top of UPC have been
> analogous to MPI_THREAD_FUNNELED, in which one is permitted to have
> multiple threads running but only one representative thread is permitted to
> call (even implicitly) any code in the UPC runtime library. If there is
> anybody on this list with success with any other ways to mix threads with
> UPC, I'd be interested to learn about it.
>
> The Berkeley UPC runtime and GASNet communication runtime libraries are
> built in both thread-safe ("par") and non thread-safe ("seq")
> configurations. If one passes -pthreads to the upcc command, then the
> thread-safe version is linked, and otherwise the non thread-safe version is
> used. My initial guess is that your segmentation fault is related to use of
> the non thread-safe libraries. That can be resolved by passing -pthreads=1
> to upcc. This will ensure that the thread-safe libraries are linked, but
> will run with one UPC thread per process. However, that is probably not
> enough... read on.
>
> I cannot be certain that a hybrid UPC+pthreads program as you describe
> will work. The reason I am uncertain is that thread-safe versions of both
> the UPC and GASNet runtime libraries make use of thread-specific data. For
> instance if you try to reference the UPC built-in "MYTHREAD" from a pthread
> that you have spawned it will almost certainly fail because in the
> thread-safe library it is implemented via thread-specific data that has only
> been allocated/initialized for the thread(s) that the UPC runtime has
> spawned. So, it appears you would require a runtime configuration that
> provides for the thread-safe invocation of UPC built-ins but without
> assigning each thread an individual UPC-level identity. Unfortunately, we
> have not implemented such a configuration. I don't have any immediate
> estimate of what effort would be required, but for our support of the
> TotalView debugger we did implement a mode in which there is a single UPC
> thread per process plus a non-UPC thread to ensure remote accesses could
> progress even when the debugger had frozen the UPC thread. That work means
> that within the UPC runtime there is already a separation of thread-safety
> into two distinct parts:
>   UPCRI_SUPPORT_PTHREADS - the UPC runtime is thread-safe (to at least the
>     extent needed for the TotalView support) and calls thread-safe GASNet.
>   UPCRI_UPC_PTHREADS - UPC threads are implemented as multiple pthreads per
>     process.
>
> I suspect, but cannot verify, that linking in the libraries intended for
> TotalView support will get you most of the way to what you appear to need.
> For instance, I believe (with 90%+ certainty) that MYTHREAD will not
> utilize thread-specific data in the "tv" version of libupcr. If you are
> lucky the "tv" libraries might be sufficient for what you want. If I (and
> my peers here in Berkeley) were not so busy right now with several
> deadlines rapidly approaching, I'd be interested in helping you at least
> conduct the experiment to see if the "tv" libraries work for you. Having
> support for the sort of hybrid programming you describe would be interesting
> to us. So, I hope you can keep us up-to-date on any progress you make.
>
> I am not certain what you are referring to when you say the IB driver only
> allows one snd/rcv buffer per process. We support up to
> GASNET_NETWORKDEPTH_PP IB-level operations in flight to a given peer
> (default 64), or GASNET_NETWORKDEPTH_TOTAL operations outstanding total
> (default is computed from HCA resource limits), and will stall waiting for
> at least one outstanding operation to complete when either limit is reached.
>
> -Paul
>
>
> Lingyuan Wang wrote:
>
>> Greetings,
>> I am concerned about the thread safety of the UPC memory copy library
>> functions: is it safe to call those routines in parallel from multiple
>> threads?
>>
>> I use a multi-threading layer of Pthreads on top of each UPC thread. At
>> some points in the program I need to do all-to-all communication among all
>> UPC threads. I would like to call the UPC memory copy functions directly
>> from my pool of Pthreads in parallel, since the data is sliced locally (and
>> it would be less efficient to pack the data for a bulk synchronized
>> communication). However, I get a segmentation fault when I run more than
>> one thread per thread pool, while the code works fine with a single thread
>> per pool.
>> I am using the IBV conduit of Berkeley UPC 2.10, with PSHM enabled. I am
>> aware of the fact that the InfiniBand driver allows one send/receive buffer
>> per process. My further questions regarding thread safety are: is it a
>> network driver/conduit-specific issue, or a general GASNet/BUPC runtime
>> restriction? As I cannot find any definition of thread safety in the UPC
>> spec, would it be possible to support it in the future? And how does the
>> UPC native Pthread conduit handle it? Thanks in advance.
>>
>> --
>> Regards
>>
>>
>
> --
> Paul H. Hargrove PHHargrove_at_lbl_dot_gov
> Future Technologies Group Tel: +1-510-495-2352
> HPC Research Department Fax: +1-510-486-6900
> Lawrence Berkeley National Laboratory
>
--
Regards