upc_all_reduce behaviour

From: Nikita Andreev (lestat_at_kemsu.ru)
Date: Sun Apr 11 2010 - 04:28:52 PDT

  • Next message: Paul H. Hargrove: "Re: upc_all_reduce behaviour"
    Hi Paul,
    Sorry for spamming the list, but I've got another question. I'm reading the UPC Collective Operations Specifications 1.0 at the moment, and the upc_all_reduce section with its example confuses me a bit.
    Questions that immediately come to mind:
    1. What is the point of the 'result' variable if it's not used anywhere?
    2. Why is B a pointer? It has no memory allocated to it, so dereferencing it will certainly end in a segmentation fault.
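    For what it's worth, here is how I would expect the example's declarations to read. This is only a sketch: the names A, B and result are the ones from the spec's example, but the block size and array length are placeholders of mine, not the spec's values.

    ```c
    /* Sketch only -- BLK_SIZE and NELEMS are my placeholders. */
    #define BLK_SIZE 3
    #define NELEMS 8
    shared [BLK_SIZE] long A[NELEMS*THREADS];
    shared long B;   /* a shared scalar, not an unallocated pointer */
    long result;     /* only meaningful if B is later copied into it */
    ```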
    I assume these are just typos. The more interesting questions are:
    1. Why does the distribution of array 'D' in figure 7 start from thread T1? I always thought that all distributions start from thread 0.
    2. If nelems/blk_size/THREADS > 1.0 (meaning that one or more threads receive more than one block of the array), how many one-sided communications will the reduction involve? One root<-thread communication per thread (so all of a thread's blocks are packed into a single get), or one get per block?
    3. Does upc_all_reduce always end up with one value on one thread (the thread that dst has affinity to), or may it result in one value on each thread? I believe it is one value on one thread. But I took a look at the book "UPC: Distributed Shared Memory Programming" and found an example (attached) where it works as in the second case. I suppose they just confused everything in that example.
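    On question 3, my reading of the spec is that the single result lands in the one location dst points to, which has affinity to exactly one thread. A hedged sketch of how I'd expect the call to look, using the signed-long variant (upc_all_reduceL, UPC_ADD and the sync flags are from the collectives spec; the variable names follow its example):

    ```c
    /* Sketch: sum all elements of A into the single shared scalar B.
       A scalar shared object has affinity to thread 0, so only that
       thread holds the result; other threads would have to read it
       through the shared reference after the collective completes. */
    upc_all_reduceL(&B, A, UPC_ADD, NELEMS*THREADS, BLK_SIZE,
                    NULL, UPC_IN_ALLSYNC | UPC_OUT_ALLSYNC);
    ```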
    Could you clarify this, Paul?
    Thank you for your time,