From: Steven D. Vormwald (sdvormwa_at_mtu_dot_edu)
Date: Wed Feb 13 2008 - 12:24:55 PST
Paul H. Hargrove wrote:
> Steven,
>
> Whatever problem you are encountering is beyond my own experience.
> Attached is a very small program that is intended to list the names of
> your HCAs as queried from the OpenIB verbs interface. Compile as
> follows, substituting your correct paths for "/opt/ofed" (and possibly
> adding -ldl):
> $ cc -o ibvls ibvls.c -I/opt/ofed/include -L/opt/ofed/lib64 -libverbs
> and run with no arguments:
> $ ./ibvls
> ibv_get_device_list: list=0x501eb0 num_hcas=1
> HCA[0] = 'mthca0'
>
> If this "ibvls" returns a non-empty list of HCAs, then the OpenIB verbs
> support from QLogic is working and GASNet is somehow at fault. However,
> if this simple test program doesn't find any HCAs, then I suggest you
> contact QLogic (or whichever vendor provides support for your cluster)
> for help in getting this small test program working. Once this small
> test program works, I believe GASNet should probably work as well.
>
> -Paul

Paul,

After working with QLogic, we've gotten the ibvls program working.

$ ./ibvls
ibv_get_device_list: list=0x501ea0 num_hcas=1
HCA[0] = 'ipath0'

However, GASNet is still not working with the cards. Running the tests
that you mentioned earlier is now giving the error "Probe failed to open
HCA 'ipath0'", so it is at least finding the card now.
$ env GASNET_TRACEMASK=C GASNET_TRACEFILE=stdout ./contrib/gasnetrun_ibv -n1 ./testgasnet | grep HCA
GASNet reporting enabled - tracing and statistical output directed to stdout
0 0.000895s> (C) Probing HCAs for active ports
0 0.002365s> (C) Probe failed to open HCA 'ipath0'
GASNet gasnetc_init returning an error code: GASNET_ERR_RESOURCE (Problem with requested resource)
  at /usr/local/src/berkeley_upc-2.6.0/gasnet/vapi-conduit/gasnet_core.c:986
  reason: unable to open any HCA ports
GASNet gasnet_init_GASNET_SEQFASTdebugtracestatssrclines returning an error code: GASNET_ERR_RESOURCE (Problem with requested resource)
  at /usr/local/src/berkeley_upc-2.6.0/gasnet/vapi-conduit/gasnet_core.c:1546
ERROR calling: gasnet_init(&argc, &argv)
 at: /usr/local/src/berkeley_upc-2.6.0/gasnet/tests/testgasnet.c:185
 error: GASNET_ERR_RESOURCE (Problem with requested resource)
0 0.002413s> (C) Probe found 0 active port(s) on 0 HCA(s)
gasnet_exit(): ERROR: signal 11 received during exit... goodbye.
[initiating collective exit]
Abort on node n1 due to MPI_Abort (type 2)
$

Steven Vormwald