Problem running on a cluster & performance...

From: Konstantin Kleisouris (kkonst_at_cs_dot_rutgers_dot_edu)
Date: Tue Sep 19 2006 - 18:06:48 PDT

  • Next message: Samy Bahra: "Re: Problem running on a cluster & performance..."
    Hi everyone,
    
       I have two concerns with Berkeley UPC.
       I am trying to run a UPC program on a cluster of Linux machines.
    However, when I type the command:
    > upcrun -n 8 a.out 
    
    I get the message you see below (see after the dashed line). I have
    compiled my program with -pthreads=4. I really cannot figure out what
    the problem is. I have generated ssh keys so that when you ssh from one
    machine to another you don't have to type your password. Also, I have
    set the UPC_NODES variable to a list of machines in the cluster. I
    believe that the program (a.out) does not even start executing, because
    I am supposed to give some arguments to it, but it does not ask me for
    them (as it should). 
    
        Also, if I compile my program with -pthreads=1 and then without
    -pthreads at all (so in both cases there is only one thread), I see a
    difference in performance. In the first case (-pthreads=1) the program
    is slower than if I don't use -pthreads at all. I noticed that even
    portions of the UPC program where threads access only private data (for
    instance, arrays that have been allocated with malloc) take longer to
    execute. I am measuring time with bupc_ticks_now() and
    bupc_ticks_to_us(). Does anyone know why? Even if I use -pthreads=1 and
    -T=1, this is still slower than if I don't use -pthreads at all.
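
    A minimal sketch of the kind of measurement described above, for anyone
    who wants to reproduce the comparison. It is not the original program.
    The names demo_tick_t, demo_now, demo_to_us, touch_private and
    time_private_us are hypothetical. The sketch assumes that Berkeley UPC
    defines the macro __BERKELEY_UPC__ and makes bupc_tick_t,
    bupc_ticks_now() and bupc_ticks_to_us() available without an extra
    header; outside Berkeley UPC it falls back to POSIX clock_gettime() so
    the same file also builds as plain C.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

#ifdef __BERKELEY_UPC__
/* Assumption: the Berkeley UPC timer calls named in the message are
   visible here without any additional include. */
typedef bupc_tick_t demo_tick_t;
static demo_tick_t demo_now(void) { return bupc_ticks_now(); }
static uint64_t demo_to_us(demo_tick_t t) { return bupc_ticks_to_us(t); }
#else
/* Stand-in for plain C builds: nanosecond ticks from a monotonic clock. */
typedef uint64_t demo_tick_t;
static demo_tick_t demo_now(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
static uint64_t demo_to_us(demo_tick_t t) { return t / 1000u; }
#endif

/* Fill and sum a private array. The sum is returned so the compiler
   cannot discard the loop. */
static double touch_private(double *a, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        a[i] = (double)i;
        sum += a[i];
    }
    return sum;
}

/* Time the private-data loop `reps` times and return the smallest
   elapsed time in microseconds; the last sum is stored in *sum_out. */
static uint64_t time_private_us(size_t n, int reps, double *sum_out) {
    double *a = malloc(n * sizeof *a);
    assert(a != NULL);
    uint64_t best = UINT64_MAX;
    for (int r = 0; r < reps; r++) {
        demo_tick_t t0 = demo_now();
        *sum_out = touch_private(a, n);
        demo_tick_t t1 = demo_now();
        uint64_t us = demo_to_us(t1 - t0);
        if (us < best) best = us;
    }
    free(a);
    return best;
}
```

    Calling time_private_us(n, reps, &sum) from main() in a build with
    -pthreads=1 and in a build without -pthreads, with the same n, gives
    two numbers that can be compared directly. Taking the minimum over
    several repetitions is a design choice here to reduce noise from other
    activity on the node.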
    
    Sincerely,
    Kosta
    
    
    
    ----------------------------------------------------------------
    	
    AMUDP sendPacket returning an error code: AM_ERR_RESOURCE (Problem with
    requested resource)
      from function sendPacket
      at amudp_reqrep.cpp:93
      reason: Invalid argument
    AMUDP AMUDP_RequestGeneric returning an error code: AM_ERR_RESOURCE
    (Problem with requested resource)
      at amudp_reqrep.cpp:1200
    
    GASNet gasnetc_AMRequestShortM encountered an AM Error:
    AM_ERR_RESOURCE(3)
      at
    /home/kkonst/UPC/berkeley_upc-2.2.2/gasnet/udp-conduit/gasnet_core.c:564
    GASNet gasnetc_AMRequestShortM returning an error code:
    GASNET_ERR_RESOURCE (Problem with requested resource)
      at
    /home/kkonst/UPC/berkeley_upc-2.2.2/gasnet/udp-conduit/gasnet_core.c:568
    *** FATAL ERROR:
    GASNet encountered an error: GASNET_ERR_RESOURCE(3)
      while calling: gasnet_AMRequestShort4(peer,
    gasneti_handleridx(gasnete_ambarrier_notify_reqh), phase, 0, id, flags)
      at gasnete_barrier_notify() at
    /home/kkonst/UPC/berkeley_upc-2.2.2/gasnet/extended-ref/gasnet_extended_refbarrier.c:197
    *** Caught a fatal signal: SIGABRT(6) on node 0/2
    
