From: Dan Bonachea (bonachea_at_cs_dot_berkeley_dot_edu)
Date: Mon Nov 21 2005 - 23:55:52 PST
Hi Eric - I'm the udp-conduit expert. I'm not sure why you're seeing that particular error message, although based on your message below I suspect you have inconsistent copies of the executable on the two nodes - the penguin27 output is "Hello World from thread 1 of 2" but the myth output is "Hello World" - which probably means the programs are not the same.

Berkeley UPC requires all nodes to be running the *exact* same binary executable - if you lack a shared file system then exact copies are fine (although error-prone), but it's not OK to recompile one copy and not the others. Also, udp-conduit requires all copies of the executable to reside at the same absolute pathname on all clients - so make sure the copies are all mounted or mirrored to the same absolute path. Also, if the nodes may differ in things like shared libraries, you should probably link statically (upcc -Wl,-static) just to be safe.

Give it another try once you're certain the same binary is present and working on all nodes. If it still fails, try appending "-v" to the upcrun line to see more details about the startup procedure and send us the complete output. Please also send the output of "uname -a" and "cat /proc/cpuinfo" on each node.

Hope this helps...
Dan

At 02:35 PM 11/21/2005, Eric Frederich wrote:
> > It is interesting to note that when the upchostsfile looks like
> >
> > 192.168.1.207
> > 192.168.1.207
> > 192.168.1.208
> >
> > and I run it with -n 2 it works fine and I see the following
> >
> > UPCR: UPC thread 0 of 2 on penguin27 (process 0 of 2, pid=12356)
> > UPCR: UPC thread 1 of 2 on penguin27 (process 1 of 2, pid=12357)
> > Hello World from thread 1 of 2
> > Hello World from thread 0 of 2
> >
> > Also when I have the file say
> >
> > 192.168.1.208
> > 192.168.1.208
> > 192.168.1.207
> >
> > it works fine too and I see the following
> >
> > UPCR: UPC thread 0 of 2 on myth (process 0 of 2, pid=10447)
> > UPCR: UPC thread 1 of 2 on myth (process 1 of 2, pid=10446)
> > Hello World
> > Hello World
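
One way to check the "exact same binary" requirement from the reply above is to compare a checksum of the executable on each node. The following is only a minimal sketch in Python, not part of Berkeley UPC: the function names `sha256_of_file` and `mismatched_hosts` are hypothetical helpers, and it assumes you run the hashing step on each node yourself and collect the digests by hand.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so a large executable is never
    loaded into memory all at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_hosts(digests_by_host, reference_host):
    """Return a sorted list of hosts whose digest differs from the reference.

    `digests_by_host` maps a host name to the digest computed on that host.
    An empty list means every copy matches the reference copy.
    """
    reference = digests_by_host[reference_host]
    return sorted(
        host
        for host, digest in digests_by_host.items()
        if digest != reference
    )
```

For example, after running `sha256_of_file` on the executable's absolute path on both penguin27 and myth, passing the two digests to `mismatched_hosts({"penguin27": ..., "myth": ...}, "penguin27")` should return an empty list if the copies are identical. Matching digests only show the files are byte-for-byte equal; they say nothing about differing shared libraries, which is what the static-linking suggestion addresses.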