Hanging during upc_alloc()

From: Benjamin Byington (bbyingto_at_soe_dot_ucsc_dot_edu)
Date: Fri May 08 2009 - 22:23:06 PDT

  • Next message: Paul H. Hargrove: "Re: Hanging during upc_alloc()"
    Hello,
    
    My question comes in two parts.  First, what is wrong with the toy code below 
    (besides the obvious infinite loop)?  When I run this code with two threads 
    on two separate nodes, the tight loop on thread 0 somehow prevents 
    thread 1 from completing its memory allocation.  The first print statement is reached,
    but never the second.  If I either remove the loop, or swap the roles 
    so that thread 1 spins in the loop and thread 0 performs the allocation, the 
    allocation completes as expected.  
    
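    #include <upc.h>
    #include <stdio.h>
    
    int main( int argc, char** argv )
    {
        if(MYTHREAD == 0)
        {
            /* Thread 0 spins forever, standing in for the scheduler's polling loop. */
            while(1);
        }
        else if(MYTHREAD == 1)
        {
            fprintf(stderr, "Beginning memory allocation\n");  /* this is printed */
            shared void * t = upc_alloc(1000000);
            fprintf(stderr, "Finished memory allocation\n");   /* this is never printed */
        }
    
        /* Never reached by thread 0 because of the loop above. */
        upc_barrier;
    
        return 0;
    }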
    
    The second part of my question is: how should one approach event-driven 
    programming in UPC?  The situation above arose while I was writing a program 
    that uses dynamic scheduling to control when various tasks are performed.  Thread 0 
    sits in a tight loop monitoring a set of flags for each of the worker threads, 
    and gives a worker new directions whenever it detects that one is available.  The 
    workers also sit in a tight loop whenever they are idle, monitoring another flag to 
    see whether more work is available.  I took care to ensure that each of these 
    rapidly accessed flags is local to the thread polling it, so as to avoid a 
    million tiny unnecessary messages, but as my first example demonstrates, that doesn't 
    seem to be enough.  
    
    All the threads go through some setup code that allocates various 
    shared data structures without a problem, but almost as soon as execution reaches the 
    meat of the program, it hangs.  Thread 0 hands the first job to a worker, 
    and since at this stage there are no other concurrent tasks until the first one
    finishes, thread 0 just ends up repeatedly checking all the flags while waiting for 
    that job to complete.  The worker, however, never completes the task.  It always 
    manages to perform a malloc(), a upc_memget(), and a upc_free() without a problem, but 
    the first time it hits a upc_alloc() the program freezes.  (The freeze 
    goes away if I tell thread 0 to exit the loop and wait at a barrier, but 
    that is of course useless, since it then cannot detect or do anything once the first task 
    is done.)  Is there a better way than my flags to take event-driven action?  Is there 
    a reason why thread 0 sitting in a tight loop affects the execution of other threads?  
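
To make the scheduling scheme described above more concrete, here is a rough sketch of what one pass of thread 0's monitoring loop might look like. It is a simplified illustration, not the actual program: the names `find_idle_worker`, `dispatch_pass`, `WORKER_IDLE`, and `WORKER_BUSY` are hypothetical, and the flags are ordinary arrays so that the sketch stays in the plain-C subset of UPC. In the real program the flags would presumably be shared data with the affinity described above, and the sketch does not address the hang itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values; the original program's encoding is not shown. */
enum { WORKER_IDLE = 0, WORKER_BUSY = 1 };

/* Return the index of the first idle worker, or -1 if every worker is busy. */
int find_idle_worker(const volatile int *status, int nworkers)
{
    assert(status != NULL);
    for (int w = 0; w < nworkers; ++w) {
        if (status[w] == WORKER_IDLE)
            return w;
    }
    return -1;
}

/*
 * One pass of the scheduler loop: hand pending tasks to idle workers until
 * either the tasks or the idle workers run out.  Returns the number of tasks
 * handed out during this pass.
 */
int dispatch_pass(volatile int *status, int *assigned_task, int nworkers,
                  int *next_task, int ntasks)
{
    int handed_out = 0;
    assert(assigned_task != NULL && next_task != NULL);
    while (*next_task < ntasks) {
        int w = find_idle_worker(status, nworkers);
        if (w < 0)
            break;                      /* nobody is free; try again next pass */
        assigned_task[w] = *next_task;  /* record the task id for this worker */
        status[w] = WORKER_BUSY;        /* then mark the worker as busy */
        ++*next_task;
        ++handed_out;
    }
    return handed_out;
}
```

Thread 0 would call something like `dispatch_pass()` over and over inside its tight loop, and a worker would reset its entry to `WORKER_IDLE` when its task is finished.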
    
    I just realized that this code works fine on my multicore laptop.  I presume 
    the problem has to do with distributed memory versus shared memory, but I figured I should 
    provide what details I can about the hardware this program is failing on, in case there 
    is a key there...
    
    Thanks in advance!
    Ben
    
    Processor:  Dual-core Opterons, 2.2GHz (though I am using only one core per node)
    Network:  Infiniband (using vapi protocol)
    Output from upcc -version:
    This is upcc (the Berkeley Unified Parallel C compiler), v. 2.8.0
    ----------------------+---------------------------------------------------------
     UPC Runtime          | v. 2.8.0, built on Feb  3 2009 at 14:21:28
    ----------------------+---------------------------------------------------------
     UPC-to-C translator  | v. 2.8.0, built on Feb  3 2009 at 14:08:02
                          | host jacin04 linux-x86_64/64
    ----------------------+---------------------------------------------------------
     Translator location  | /usr/common/ftg/upc/builds/stable/translator/install/ta
                          | rg
    ----------------------+---------------------------------------------------------
     networks supported   | udp mpi smp vapi
    ----------------------+---------------------------------------------------------
     default network      | vapi
    ----------------------+---------------------------------------------------------
     pthreads support     | available (if used, default is 2 pthreads per process)
    ----------------------+---------------------------------------------------------
     Configured with      | '--with-translator=/usr/common/ftg/upc/builds/stable/tr
                          | anslator/install/targ' '--enable-mpi' '--enable-vapi'
                          | '--with-multiconf=+opt,+dbg,+opt_inst,+dbg_gccupc,
                          | +opt_gccupc' '--with-vapi-spawner=mpi'
                          | '--prefix=/usr/common/ftg/upc/builds/stable/runtime/ins
                          | t/opt' '--with-multiconf-magic=opt'
                          | 'CC=/usr/common/usg/pathscale/3.2/bin/pathcc'
    ----------------------+---------------------------------------------------------
     Configure features   | berkeleyupc,upcr,gasnet,upc_collective,upc_io,
                          | upc_memcpy_async,upc_ptradd,upc_thread_distance,
                          | upc_tick,upc_sem,upc_dump_shared,upc_trace_printf,
                          | upc_trace_mask,upc_local_to_shared,upc_atomics,pupc,
                          | upc_memcpy_vis,nodebug,notrace,nostats,nogasp,
                          | segment_fast,os_linux,cpu_x86_64,cpu_64,cc_pathscale,
                          | packedsptr
    ----------------------+---------------------------------------------------------
     Configure id         | jacin04 Tue Feb  3 14:05:53 PST 2009 hargrove
    ----------------------+---------------------------------------------------------
     Binary interface     | 64-bit x86_64-unknown-linux-gnu
    ----------------------+---------------------------------------------------------
     Runtime interface #  | Runtime supports 3.0 -> 3.10: Translator uses 3.6
    ----------------------+---------------------------------------------------------
                          |  --- BACKEND SETTINGS (for vapi network) ---
    ----------------------+---------------------------------------------------------
     C compiler           | /usr/common/usg/pathscale/3.2/bin/pathcc
                          |   PATHSCALE/3.2/3.3.3 (SuSE Linux)
                          |   PathScale(TM) Compiler Suite: Version 3.2 Built on:
                          |   2008-06-16 16:41:38 -0700
                          |   GNU gcc version 3.3.1 (PathScale 3.2 driver)
    ----------------------+---------------------------------------------------------
     C compiler flags     | -O3 -Winline
    ----------------------+---------------------------------------------------------
     linker               | /usr/common/nsg/mvapich/pathscale/mvapich-0.9.5-mlx1.0.
                          | 3/bin/mpicc
                          |   PATHSCALE/3.2/3.3.3 (SuSE Linux)
                          |   PathScale(TM) Compiler Suite: Version 3.2 Built on:
                          |   2008-06-16 16:41:38 -0700
                          |   GNU gcc version 3.3.1 (PathScale 3.2 driver)
                          |   mpicc for 1.2.6 (release) of : 2004/08/04 11:10:38
    ----------------------+---------------------------------------------------------
     linker flags         | -O3 -Winline
                          | -L/usr/common/ftg/upc/builds/stable/runtime/inst/opt/li
                          | b -lupcr-vapi-seq -lumalloc
                          | -L/usr/common/ftg/upc/builds/stable/runtime/inst/opt/li
                          | b -L/usr/local/ibgd/driver/infinihost/lib64
                          | -lgasnet-vapi-seq -lvapi -lmtl_common -lmosal -lmpga
                          | -lpthread -lm
    ----------------------+---------------------------------------------------------
    
