From: Lingyuan Wang (lennyhpc_at_gmail_dot_com)
Date: Thu Mar 11 2010 - 18:02:57 PST
Thanks Paul, that was really helpful. I tried -pthreads=1, but as expected
it does not work, and I have not yet had a chance to try the "tv" libraries.
For now I will simply serialize my UPC runtime calls, and I will keep you
updated on any progress.

I referred to a "single snd/rcv buffer per process" because I was concerned
that multiple sub-threads within a single process might contend with each
other when calling communication functions in parallel, as in the case of
the native pthread conduit in UPC.

On Wed, Mar 10, 2010 at 7:49 PM, Paul H. Hargrove <PHHargrove_at_lbl_dot_gov> wrote:

> Lingyuan,
>
> The only efforts I am aware of to layer threads on top of UPC have been
> analogous to MPI_THREAD_FUNNELED, in which one is permitted to have
> multiple threads running but only one representative thread is permitted
> to call (even implicitly) any code in the UPC runtime library. If there is
> anybody on this list who has had success with any other way of mixing
> threads with UPC, I'd be interested to learn about it.
>
> The Berkeley UPC runtime and GASNet communication runtime libraries are
> built in both thread-safe ("par") and non-thread-safe ("seq")
> configurations. If one passes -pthreads to the upcc command, then the
> thread-safe version is linked, and otherwise the non-thread-safe version
> is used. My initial guess is that your segmentation fault is related to
> use of the non-thread-safe libraries. That can be resolved by passing
> -pthreads=1 to upcc. This will ensure that the thread-safe libraries are
> linked, but will run with one UPC thread per process. However, that is
> probably not enough... read on.
>
> I cannot be certain that a hybrid UPC+pthreads program as you describe
> will work. The reason I am uncertain is that the thread-safe versions of
> both the UPC and GASNet runtime libraries make use of thread-specific data.
> For instance, if you try to reference the UPC built-in "MYTHREAD" from a
> pthread that you have spawned, it will almost certainly fail, because in
> the thread-safe library it is implemented via thread-specific data that
> has only been allocated/initialized for the thread(s) that the UPC runtime
> has spawned. So, it appears you would require a runtime configuration that
> provides for thread-safe invocation of UPC built-ins but without assigning
> each thread an individual UPC-level identity. Unfortunately, we have not
> implemented such a configuration. I don't have any immediate estimate of
> what effort would be required, but for our support of the TotalView
> debugger we did implement a mode in which there is a single UPC thread per
> process plus a non-UPC thread to ensure remote accesses could progress
> even when the debugger had frozen the UPC thread. That work means that
> within the UPC runtime there is already a separation of thread-safety into
> two distinct parts:
>   UPCRI_SUPPORT_PTHREADS - the UPC runtime is thread-safe (to at least the
>     extent needed for the TotalView support) and calls thread-safe GASNet
>   UPCRI_UPC_PTHREADS - UPC threads are implemented as multiple pthreads
>     per process.
>
> I suspect, but cannot verify, that linking in the libraries intended for
> TotalView support will get you most of the way to what you appear to need.
> For instance, I believe (with 90%+ certainty) that MYTHREAD will not
> utilize thread-specific data in the "tv" version of libupcr. If you are
> lucky, the "tv" libraries might be sufficient for what you want. If I (and
> my peers here in Berkeley) were not so busy right now with several
> deadlines rapidly approaching, I'd be interested in helping you at least
> conduct the experiment to see if the "tv" libraries work for you. Having
> support for the sort of hybrid programming you describe would be
> interesting to us. So, I hope you can keep us up-to-date on any progress
> you make.
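
The MYTHREAD failure mode described above can be reproduced in miniature
with plain POSIX thread-specific data. The sketch below is only an
illustration under stated assumptions: id_key, probe() and
tsd_visible_only_to_initializer() are hypothetical names, and nothing here
is taken from the Berkeley UPC runtime. It shows that a value stored under a
key by one thread is simply absent (NULL) in a thread that never initialized
it, which is the situation a user-spawned pthread would be in.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a runtime's per-thread identity slot. */
static pthread_key_t id_key;

/* Runs in a thread that never called pthread_setspecific() for id_key. */
static void *probe(void *out) {
    *(void **)out = pthread_getspecific(id_key);  /* POSIX: NULL here */
    return NULL;
}

/* Returns 1 if the initializing thread sees its own value while a freshly
 * spawned thread sees NULL, 0 otherwise, and -1 if setup fails. */
int tsd_visible_only_to_initializer(void) {
    static int my_id = 7;            /* pretend thread identity */
    void *seen_by_child = &my_id;    /* non-NULL sentinel, overwritten by probe */
    pthread_t t;

    if (pthread_key_create(&id_key, NULL) != 0) return -1;
    if (pthread_setspecific(id_key, &my_id) != 0) return -1;
    if (pthread_create(&t, NULL, probe, &seen_by_child) != 0) return -1;
    pthread_join(t, NULL);

    int parent_ok  = (pthread_getspecific(id_key) == &my_id);
    int child_null = (seen_by_child == NULL);

    pthread_key_delete(id_key);
    return parent_ok && child_null;
}
```

Calling tsd_visible_only_to_initializer() from a small main() should return
1 on a conforming POSIX system. A runtime that dereferences such a slot
without checking it would be expected to crash in the spawned thread.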
>
> I am not certain what you are referring to when you say the IB driver only
> allows one snd/rcv buffer per process. We support up to
> GASNET_NETWORKDEPTH_PP IB-level operations in flight to a given peer
> (default 64), or GASNET_NETWORKDEPTH_TOTAL operations outstanding in total
> (default is computed from HCA resource limits), and will stall waiting for
> at least one outstanding operation to complete when either limit is
> reached.
>
> -Paul
>
> Lingyuan Wang wrote:
>
>> Greetings,
>> I am concerned about the thread safety of the UPC memory copy library
>> functions: is it safe to call those routines in parallel from multiple
>> threads?
>>
>> I use a multi-threading layer of Pthreads on top of each UPC thread. At
>> some points in the program I need to do all-to-all communication among
>> all UPC threads. I would like to call the UPC memory copy functions
>> directly from my pool of Pthreads in parallel, since the data is sliced
>> locally (and it would be less efficient to pack the data for a bulk
>> synchronized communication). However, I get a segmentation fault when I
>> run more than one thread per thread pool, while the code works fine with
>> a single thread per pool.
>> I am using the IBV conduit of Berkeley UPC 2.10, with PSHM enabled. I am
>> aware of the fact that the InfiniBand driver allows one send/receive
>> buffer per process. My further questions regarding thread safety are: is
>> it a network driver/conduit-specific issue, or a general GASNet/BUPC
>> runtime restriction? As I cannot find any definition of thread safety in
>> the UPC spec, would it be possible to support it in the future? And how
>> does the UPC native Pthread conduit handle it? Thanks in advance.
>>
>> --
>> Regards
>
> --
> Paul H. Hargrove  PHHargrove_at_lbl_dot_gov
> Future Technologies Group  Tel: +1-510-495-2352
> HPC Research Department  Fax: +1-510-486-6900
> Lawrence Berkeley National Laboratory

--
Regards
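
The funneled arrangement discussed in this thread, in which worker pthreads
never call into the UPC runtime and one representative thread issues every
copy, might be sketched roughly as follows. This is plain C with POSIX
threads, not UPC: memcpy() is only a placeholder for whatever runtime copy
call the representative thread would make, and copy_req, req_queue,
queue_push(), queue_pop() and run_funneled_demo() are hypothetical names
invented for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* One queued copy; stands in for the arguments of a runtime copy call. */
typedef struct { void *dst; const void *src; size_t nbytes; } copy_req;

enum { QCAP = 16, SLICE = 32 };

typedef struct {
    copy_req items[QCAP];
    int head, count, producers_left;
    pthread_mutex_t mu;
    pthread_cond_t not_empty, not_full;
} req_queue;

typedef struct { req_queue *q; char *dst; const char *src; int nslices; } worker_arg;

/* Workers only enqueue requests; they never perform the copy themselves. */
static void queue_push(req_queue *q, copy_req r) {
    pthread_mutex_lock(&q->mu);
    while (q->count == QCAP) pthread_cond_wait(&q->not_full, &q->mu);
    q->items[(q->head + q->count) % QCAP] = r;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->mu);
}

/* Returns 1 with a request in *out, or 0 once every producer has finished. */
static int queue_pop(req_queue *q, copy_req *out) {
    pthread_mutex_lock(&q->mu);
    while (q->count == 0 && q->producers_left > 0)
        pthread_cond_wait(&q->not_empty, &q->mu);
    int got = q->count > 0;
    if (got) {
        *out = q->items[q->head];
        q->head = (q->head + 1) % QCAP;
        q->count--;
        pthread_cond_signal(&q->not_full);
    }
    pthread_mutex_unlock(&q->mu);
    return got;
}

static void *worker(void *p) {
    worker_arg *a = p;
    for (int i = 0; i < a->nslices; i++) {
        copy_req r = { a->dst + i * SLICE, a->src + i * SLICE, SLICE };
        queue_push(a->q, r);
    }
    pthread_mutex_lock(&a->q->mu);
    a->q->producers_left--;
    pthread_cond_broadcast(&a->q->not_empty);  /* let the drainer re-check */
    pthread_mutex_unlock(&a->q->mu);
    return NULL;
}

/* Returns the number of copies issued by the calling thread, or -1 if the
 * destination does not match the source afterwards. */
int run_funneled_demo(int nworkers, int nslices) {
    size_t total = (size_t)nworkers * (size_t)nslices * SLICE;
    char *src = malloc(total + 1), *dst = calloc(total + 1, 1);
    pthread_t *tids = malloc(((size_t)nworkers + 1) * sizeof *tids);
    worker_arg *args = malloc(((size_t)nworkers + 1) * sizeof *args);
    assert(src && dst && tids && args);
    for (size_t i = 0; i < total; i++) src[i] = (char)(i % 251);

    req_queue q = { .producers_left = nworkers };
    pthread_mutex_init(&q.mu, NULL);
    pthread_cond_init(&q.not_empty, NULL);
    pthread_cond_init(&q.not_full, NULL);

    for (int w = 0; w < nworkers; w++) {
        size_t off = (size_t)w * (size_t)nslices * SLICE;
        args[w] = (worker_arg){ &q, dst + off, src + off, nslices };
        pthread_create(&tids[w], NULL, worker, &args[w]);
    }

    /* Only the calling (representative) thread performs the copies. */
    int issued = 0;
    copy_req r;
    while (queue_pop(&q, &r)) { memcpy(r.dst, r.src, r.nbytes); issued++; }

    for (int w = 0; w < nworkers; w++) pthread_join(tids[w], NULL);
    int ok = (memcmp(src, dst, total) == 0);

    pthread_cond_destroy(&q.not_full);
    pthread_cond_destroy(&q.not_empty);
    pthread_mutex_destroy(&q.mu);
    free(args); free(tids); free(dst); free(src);
    return ok ? issued : -1;
}
```

For example, run_funneled_demo(4, 8) has four workers enqueue eight slices
each while the caller drains the queue. The cost of this design is that all
communication is serialized through one thread, which is the trade-off
accepted above in exchange for not needing a thread-safe runtime.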