how to synchronise a subrange of all the threads

From: Venelin Mitov (vmitov_at_gmail_dot_com)
Date: Thu Jan 06 2005 - 11:37:30 PST

  • Next message: Marc Gonzalez-Sigler: "Re: Bleeding edge"
    I need to write quicksort in upc. Here is the pseudocode of the parallelised 
    quicksort algorithm:
    
    shared[N/THREADS] int P[N]; //the array to sort
    
    //sorts P[begin..end-1] using threads[t0..tn)
    void quicksort(int begin, int end, int t0, int tn){
    
    1: if only one thread (tn - t0 == 1) call the serial qsort on
    P[begin..end]; return;
    //more than one thread
    2: t0 selects one of the elements P[begin..end] as pivot
    3: each thread t0..tn-1 rearranges it's local part of P[begin..end]
    storing the elements smaller than pivot to the left and the larger or
    equal elements to the right
    4: global rearrangement (divide P[begin..end] on two parts - smallers
    than pivot to the left , larger than pivot to the right:
    4.1:once the local rearrangement is done by all threads the positions
    of the smaller and larger parts in the "global" array (P[begin..end])
    are calculated.
    4.2: each thread places its smaller and larger parts approprietely in
    the global array
    
    5. let P[begin..send) are the elements smaller than pivot and
    P[send..end) are the elements larger or equal to pivot.
    let roughly said t0,.., tn0-1 are the threads in affinity with the
    smaller (left) part
    P[begin..send), and tn0...tn-1 are the trheads in affinity with the
    larger(right) part -
    P[send..end].
    if(MYTHREAD <= tn0)
       quicksort(begin, send, t0, tn0);
    else
       quicksort(send, end, tn0, tn);
    }
    
    There are synchronisation points for threads t0..tn-1 between the steps. 
    For example all threads t1..tn-1 must wait until t0 calculates the
    pivot (step 2);
    the global rearrangement (step 4) cannot start until all threads have finished
    the local rearrangement;
    The recursive call cannot be made until the global rearrangement is done.
    
    The problem is that t0..tn-1 is a subrange of 0..THREADS-1, so
    upc_barrier cannot be used.
    Can you propose me an effective solution (maybe something with
    upc_locks which works like "upc_barrier for threads t0..tn-1")?
    
    Regards.
    Venelin Mitov
    

  • Next message: Marc Gonzalez-Sigler: "Re: Bleeding edge"