Re: problems running UPC programs

From: Dan Bonachea (bonachea_at_cs_dot_berkeley_dot_edu)
Date: Mon Nov 21 2005 - 23:55:52 PST

  • Next message: Eric Frederich: "Re: problems running UPC programs"
    Hi Eric - I'm the udp-conduit expert.
    
    I'm not sure why you're seeing that particular error message, although based 
    on your message below I suspect you have inconsistent copies of the executable 
    on the two nodes - the penguin27 output is "Hello World from thread 1 of 2" 
    but the myth output is "Hello World" - which probably means the programs are 
    not the same.
    
    Berkeley UPC requires all nodes to be running the *exact* same binary 
    executable - if you lack a shared file system then exact copies are fine 
    (although error-prone), but it's not OK to recompile one copy and not the 
    others. Also, udp-conduit requires all copies of the executable to reside at 
    the same absolute pathname on all clients - so make sure the copies are all 
    mounted or mirrored to the same absolute path. Also, if the nodes may differ 
    in things like shared libraries, you should probably link statically (upcc 
    -Wl,-static) just to be safe.
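
One way to check the "exact same binary" requirement is to compare a checksum of the executable from every node. The following is only a rough sketch, not part of Berkeley UPC: the helper names (sha256_of, group_hosts_by_digest, binaries_match) are made up for illustration, and it assumes you have some way to run the script on each node and collect the digests yourself.

```python
import hashlib
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large executables are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_hosts_by_digest(digests):
    """Map each distinct digest to the sorted list of hosts that reported it."""
    groups = defaultdict(list)
    for host, value in digests.items():
        groups[value].append(host)
    return {value: sorted(hosts) for value, hosts in groups.items()}


def binaries_match(digests):
    """True only if at least one host reported and all digests are identical."""
    return len(set(digests.values())) == 1
```

With the digests gathered into a dict keyed by hostname (for example penguin27 and myth from this thread), binaries_match() says whether the copies agree, and group_hosts_by_digest() shows which nodes hold the odd copy out. Since the executable must live at the same absolute path everywhere, the same path argument should work for sha256_of() on every node.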
    
    Give it another try once you're certain the same binary is present and working 
    on all nodes. If it still fails, try appending "-v" to the upcrun line to see 
    more details about the startup procedure and send us the complete output. 
    Please also send the output of "uname -a" and "cat /proc/cpuinfo" on each 
    node.
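
If it is easier than typing the commands by hand on each node, the per-node details requested above could be gathered with a small script along these lines. This is just a sketch under assumptions: collect_node_info and format_report are hypothetical helper names, and it only covers "uname -a" and /proc/cpuinfo, not the upcrun output.

```python
import socket
import subprocess
from pathlib import Path


def collect_node_info(cpuinfo_path="/proc/cpuinfo"):
    """Gather the output of `uname -a` and the contents of /proc/cpuinfo."""
    try:
        uname = subprocess.run(
            ["uname", "-a"], capture_output=True, text=True, check=False
        ).stdout.strip()
    except FileNotFoundError:
        # uname is missing on this system; record that instead of failing.
        uname = "(uname not available)"
    path = Path(cpuinfo_path)
    if path.is_file():
        cpuinfo = path.read_text()
    else:
        cpuinfo = f"({cpuinfo_path} not found)"
    return {"uname": uname, "cpuinfo": cpuinfo}


def format_report(hostname, info):
    """Lay out one node's details under a header naming the node."""
    return (
        f"=== {hostname} ===\n"
        f"$ uname -a\n{info['uname']}\n"
        f"$ cat /proc/cpuinfo\n{info['cpuinfo']}\n"
    )


if __name__ == "__main__":
    print(format_report(socket.gethostname(), collect_node_info()))
```

Run once per node, this prints one labelled block per machine that can be pasted into a reply.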
    
    Hope this helps...
    Dan
    
    At 02:35 PM 11/21/2005, Eric Frederich wrote:
    > > It is interesting to note that when the upchostsfile looks like
    > >
    > > 192.168.1.207
    > > 192.168.1.207
    > > 192.168.1.208
    > >
    > > and I run it with -n 2 it works fine and I see the following
    > >
    > > UPCR: UPC thread 0 of 2 on penguin27 (process 0 of 2, pid=12356)
    > > UPCR: UPC thread 1 of 2 on penguin27 (process 1 of 2, pid=12357)
    > > Hello World from thread 1 of 2
    > > Hello World from thread 0 of 2
    > >
    > > Also when I have the file say
    > >
    > > 192.168.1.208
    > > 192.168.1.208
    > > 192.168.1.207
    > >
    > > it works fine too and I see the following
    > >
    > > UPCR: UPC thread 0 of 2 on myth (process 0 of 2, pid=10447)
    > > UPCR: UPC thread 1 of 2 on myth (process 1 of 2, pid=10446)
    > > Hello World
    > > Hello World
    
