From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Mon Jan 14 2008 - 14:01:20 PST
Steven,

Whatever problem you are encountering is beyond my own experience.
Attached is a very small program that is intended to list the names of your HCAs as queried from the OpenIB verbs interface. Compile as follows, substituting your correct paths for "/opt/ofed" (and possibly adding -ldl):

$ cc -o ibvls ibvls.c -I/opt/ofed/include -L/opt/ofed/lib64 -libverbs

and run with no arguments:

$ ./ibvls
ibv_get_device_list: list=0x501eb0 num_hcas=1
HCA[0] = 'mthca0'

If this "ibvls" returns a non-empty list of HCAs, then the OpenIB verbs support from QLogic is working and GASNet is somehow at fault. However, if this simple test program doesn't find any HCAs, then I suggest you contact QLogic (or whichever vendor provides support for your cluster) for help in getting this small test program working. Once this small test program works, I believe GASNet should probably work as well.

-Paul

Steven D. Vormwald wrote:
> Paul H. Hargrove wrote:
>> Steven,
>>
>> I know that the QLogic HCAs don't support the Mellanox VAPI interface.
>> My instructions should have asked you to perform those steps in the
>> "ibv-conduit" directory, as you have done, not "vapi-conduit". I
>> apologize for any confusion.
>>
> I expected as much, but it never hurts to make sure.
>
>> You noted that when trying to troubleshoot the message "libibverbs:
>> Warning: no userspace device-specific driver found for
>> /sys/class/infiniband_verbs/uverbs0" you verified that "mthca.so"
>> existed and was executable. However, it now occurs to me that this is
>> the filename for Mellanox HCAs, and that for the QLogic HCAs you should
>> be verifying that "ipathverbs.so" is present and executable. At least
>> in OFED 1.0, this file is part of the "libipathverbs" RPM.
>>
> [sdvormwa@n1 ~]$ file /usr/lib/libipathverbs.so
> /usr/lib64/libipathverbs.so
> /usr/lib/libipathverbs.so: symbolic link to `libipathverbs-rdmav2.so'
> /usr/lib64/libipathverbs.so: symbolic link to `libipathverbs-rdmav2.so'
> [sdvormwa@n1 ~]$ file /usr/lib/libipathverbs-rdmav2.so
> /usr/lib64/libipathverbs-rdmav2.so
> /usr/lib/libipathverbs-rdmav2.so: ELF 32-bit LSB shared object,
> Intel 80386, version 1 (SYSV), not stripped
> /usr/lib64/libipathverbs-rdmav2.so: ELF 64-bit LSB shared object, AMD
> x86-64, version 1 (SYSV), not stripped
> [sdvormwa@n1 ~]$ ls -l /usr/lib/libipathverbs-rdmav2.so
> /usr/lib64/libipathverbs-rdmav2.so
> -rwxr-xr-x 1 root root 67843 Jun 18 2007
> /usr/lib64/libipathverbs-rdmav2.so
> -rwxr-xr-x 1 root root 57606 Jun 18 2007
> /usr/lib/libipathverbs-rdmav2.so
>
> One thing that seems odd to me is that all the other infiniband
> libraries are 64-bit only, and installed in /usr/ofed/lib64, while
> libipathverbs has both 32 and 64-bit versions installed in /lib[64].
>
>> I'd also like to see the output from "ls -l
>> /sys/class/infiniband_verbs/uverbs0/"
>>
> [sdvormwa@gilbert UPC]$ ls -l /sys/class/infiniband_verbs/uverbs0/
> total 0
> -r--r--r-- 1 root root 4096 Jan 11 20:58 abi_version
> -r--r--r-- 1 root root 4096 Jan 11 20:58 dev
> lrwxrwxrwx 1 root root 0 Jan 11 20:58 device ->
> ../../../devices/pci0000:00/0000:00:06.0/0000:0a:00.0
> lrwxrwxrwx 1 root root 0 Jan 11 20:58 driver ->
> ../../../bus/pci/drivers/ib_ipath
> -r--r--r-- 1 root root 4096 Jan 11 20:58 ibdev
>
> Steven Vormwald

--
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900

#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    struct ibv_device **hca_list;
    int num_hcas;
    int i;

    num_hcas = 0; /* call will overwrite with actual count */
    hca_list = ibv_get_device_list(&num_hcas);
    /* %p expects a void pointer, hence the cast */
    printf("ibv_get_device_list: list=%p num_hcas=%d\n", (void *)hca_list, num_hcas);

    if ((hca_list == NULL) || (num_hcas == 0)) {
        /* no HCAs visible through the verbs interface */
        if (hca_list != NULL) {
            ibv_free_device_list(hca_list);
        }
        return 1;
    }

    for (i = 0; i < num_hcas; ++i) {
        const char *hca_name = ibv_get_device_name(hca_list[i]);
        printf("HCA[%d] = '%s'\n", i, hca_name);
    }

    /* release the array returned by ibv_get_device_list() */
    ibv_free_device_list(hca_list);
    return 0;
}
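
The quoted "file" output shows libipathverbs-rdmav2.so installed in both a 32-bit and a 64-bit build. As a rough illustration of how that distinction can be checked from C, the sketch below reads the ELF identification bytes of a file and reports its word size. It is only a sketch under stated assumptions: elf_class is a hypothetical helper name (not part of libibverbs or GASNet), it assumes a Linux system that provides <elf.h>, and it reproduces only the "32-bit"/"64-bit" part of what "file" prints. It does not by itself explain the libibverbs warning.

```c
#include <elf.h>    /* EI_NIDENT, EI_CLASS, ELFMAG, SELFMAG, ELFCLASS32/64 */
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration, not a libibverbs call):
 * returns 32 or 64 for an ELF file of that class, or -1 if the file cannot
 * be read or does not start with the ELF magic bytes.
 */
int elf_class(const char *path)
{
    unsigned char ident[EI_NIDENT];
    size_t got;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL) {
        return -1;
    }
    got = fread(ident, 1, sizeof(ident), fp);
    fclose(fp);

    /* Every ELF file begins with the 4-byte magic "\177ELF". */
    if (got != sizeof(ident) || memcmp(ident, ELFMAG, SELFMAG) != 0) {
        return -1;
    }

    /* The EI_CLASS byte distinguishes 32-bit from 64-bit objects. */
    switch (ident[EI_CLASS]) {
    case ELFCLASS32: return 32;
    case ELFCLASS64: return 64;
    default:         return -1;
    }
}
```

Going by the "file" output quoted above, elf_class("/usr/lib64/libipathverbs-rdmav2.so") should give 64 and the /usr/lib copy 32. A process can only load shared objects of its own class, so comparing the result against 8 * sizeof(void *) in the calling program is one way to confirm that the copy found on the library path matches the build of the test program.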