Re: Problem running on a cluster & performance...

From: Samy Bahra (sbahra_at_gwu_dot_edu)
Date: Tue Sep 19 2006 - 20:42:42 PDT

  • Next message: Jason Duell: "Re: Problem running on a cluster & performance..."
    Hi Konstantin,
    
    First, you ask, "If I compile my program with -pthreads=1 and then without -pthreads at all (so, in both cases there is only one thread) I see difference in performance." Understand that the threads only run locally, on the same node. Depending on the scheduler of your Linux machines, the threads may be running in the same time-slice. So, for example, if your benchmark is CPU-bound, it could be that the threads are saturating the time-slice. For things that are really CPU-bound you might as well be spawning new processes, unless you are getting a fair spread across CPUs (which is the default behavior of the run-time as far as I understand).
    
    Also, note that if the threads are running locally in the same process, malloc() itself will have a lot of lock contention. dlmalloc is not known for scaling well with threads. If you want to avoid too much contention as you scale with threads, please take a look at "jemalloc", a new malloc implementation that FreeBSD is using, optimized for threaded applications. A paper describing it is available at http://www.bsdcan.org/2006/papers/jemalloc.pdf
    
    Trying jemalloc is something I would like to look into in the future for UPC's memory allocation stubs (LD_PRELOAD the run-time?).
    
    Not too sure of the GASNet issue.
    
    Regards.
    --
    Samy Al Bahra
      `------ http://samy.kerneled.org/
    
    
    ----- Original Message -----
    From: Konstantin Kleisouris <kkonst_at_cs_dot_rutgers_dot_edu>
    Date: Tuesday, September 19, 2006 9:07 pm
    Subject: Problem running on a cluster & performance...
    To: upc-users_at_lbl_dot_gov
    
    
    > Hi everyone,
    > 
    >    I have two concerns with Berkeley UPC.
    >    I am trying to run a UPC program on a cluster of linux machines.
    > However, when I type the command:
    > > upcrun -n 8 a.out 
    > 
    > I get the message you see below (see after the dashed line). I have
    > compiled my program with -pthreads=4. I really cannot figure out what
    > the problem is. I have generated ssh keys so that when you ssh from one
    > machine to another you don't have to type your password. Also, I have
    > set the UPC_NODES variable to a list of machines in the cluster. I
    > believe that the program (a.out) does not even start executing, because
    > I am supposed to give some arguments to it, but it does not ask me for
    > them (as it should). 
    > 
    >     Also, if I compile my program with -pthreads=1 and then without
    > -pthreads at all (so, in both cases there is only one thread) I see
    > difference in performance. In the first case (-pthreads=1) the program
    > is slower than if I don't use -pthreads at all. I noticed that even
    > portions of the UPC program where threads access only private data (for
    > instance arrays that have been generated with malloc) take longer to
    > execute. I am measuring time with bupc_ticks_now() and
    > bupc_ticks_to_us(). Does anyone know why? Even if I do -pthreads=1 and
    > -T=1 this is still slower than if I don't use -pthreads at all.
    > 
    > Sincerely,
    > Kosta
    > 
    > 
    > 
    > ----------------------------------------------------------------
    > 	
    > AMUDP sendPacket returning an error code: AM_ERR_RESOURCE (Problem with
    > requested resource)
    >   from function sendPacket
    >   at amudp_reqrep.cpp:93
    >   reason: Invalid argument
    > AMUDP AMUDP_RequestGeneric returning an error code: AM_ERR_RESOURCE
    > (Problem with requested resource)
    >   at amudp_reqrep.cpp:1200
    > 
    > GASNet gasnetc_AMRequestShortM encountered an AM Error:
    > AM_ERR_RESOURCE(3)
    >   at
    > /home/kkonst/UPC/berkeley_upc-2.2.2/gasnet/udp-conduit/gasnet_core.c:564
    > GASNet gasnetc_AMRequestShortM returning an error code:
    > GASNET_ERR_RESOURCE (Problem with requested resource)
    >   at
    > /home/kkonst/UPC/berkeley_upc-2.2.2/gasnet/udp-conduit/gasnet_core.c:568
    > *** FATAL ERROR:
    > GASNet encountered an error: GASNET_ERR_RESOURCE(3)
    >   while calling: gasnet_AMRequestShort4(peer,
    > gasneti_handleridx(gasnete_ambarrier_notify_reqh), phase, 0, id, flags)
    >   at gasnete_barrier_notify() at
    > /home/kkonst/UPC/berkeley_upc-2.2.2/gasnet/extended-ref/gasnet_extended_refbarrier.c:197
    > *** Caught a fatal signal: SIGABRT(6) on node 0/2
    
