From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Tue Jan 08 2008 - 18:58:06 PST
Steven D. Vormwald wrote:
> Paul H. Hargrove wrote:
>> Steven,
>>
>> I am bcc:ing this reply to a former member of the Berkeley UPC team
>> who is now at QLogic. He may respond with additional information.
>>
>> I am the author of the ibv-conduit code in Berkeley UPC/GASNet. I am
>> afraid that I have not encountered the specific error you see.
>> However, I see that you are using the InfiniPath adapters, which I am
>> not certain support a full implementation of the OpenIB verbs.
>> Certainly all the MPI implementations that support QLogic's adapters
>> have been modified to use their "PSM" interface rather than IB Verbs.
>>
>> The only thing that I can suggest is to ensure that running
>> "ibv_devinfo" produces output something like the following:
>>
>> $ /opt/ofed/bin/ibv_devinfo
>> hca_id: mthca0
>> fw_ver: 1.2.0
>> node_guid: 0005:ad00:0005:85a4
>> sys_image_guid: 0005:ad00:0005:85a7
>> vendor_id: 0x02c9
>> vendor_part_id: 25204
>> hw_ver: 0xA0
>> board_id: MT_0230000002
>> phys_port_cnt: 1
>> port: 1
>> state: PORT_ACTIVE (4)
>> max_mtu: 2048 (4)
>> active_mtu: 2048 (4)
>> sm_lid: 1
>> port_lid: 1010
>> port_lmc: 0x00
>>
>> If ibv_devinfo fails, then that means that the OpenIB verbs support
>> is not present. If you do get output like the above, but don't see
>> "PORT_ACTIVE" then there is probably some configuration problem. If
>> you do get output that indicates at least one ACTIVE port, then we
>> can start looking at GASNet details to figure where the problem lies.
>>
>> -Paul
>>
> Paul,
>
> Thank you for the prompt response. It looks like there are active
> ports on all the nodes.
>
> n1 $ ibv_devinfo
> hca_id: ipath0
> fw_ver: 0.0.0
> node_guid: 0011:7500:00ff:e309
> sys_image_guid: 0011:7500:00ff:e309
> vendor_id: 0x1fc1
> vendor_part_id: 16
> hw_ver: 0x2
> board_id: InfiniPath_QLE7140
> phys_port_cnt: 1
> port: 1
> state: PORT_ACTIVE (4)
> max_mtu: 2048 (4)
> active_mtu: 2048 (4)
> sm_lid: 2
> port_lid: 3
> port_lmc: 0x00
>
> Steven Vormwald
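The port-state check above can be scripted when there are many nodes to inspect. The following is only a minimal sketch: it assumes ibv_devinfo is on the PATH and prints "state: PORT_ACTIVE" lines as in the listings above. The function name count_active_ports is hypothetical and is not part of OFED or GASNet.

```shell
#!/bin/sh
# Hypothetical helper: count the ports reported as PORT_ACTIVE in
# ibv_devinfo-style text read from standard input.
count_active_ports() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so mask the exit status to keep this usable under "set -e".
    grep -c 'state:[[:space:]]*PORT_ACTIVE' || true
}

# Usage (assumes ibv_devinfo is installed on the node):
#   ibv_devinfo | count_active_ports
```

A result of 0 corresponds to the "probably some configuration problem" case described above; any non-zero count means at least one port is active.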
Let us see what GASNet sees when it probes the hardware. Please run
the following steps and send the output of the final command:
$ cd [YOUR_BERKELEY_UPC_BUILD_DIR]
$ cd dbg/gasnet/vapi-conduit
$ make testgasnet-seq
[...output omitted...]
$ env GASNET_TRACEMASK=C GASNET_TRACEFILE=stdout \
    ./contrib/gasnetrun_vapi -n1 ./testgasnet | grep HCA
When things are working correctly, you should expect output roughly like
the following:
GASNet reporting enabled - tracing and statistical output directed to stdout
0 0.001157s> (C) Probing HCAs for active ports
0 0.001887s> (C) Probe found HCA 'mthca0'
0 0.001976s> (C) Probe found HCA 'mthca0', port 1
0 0.001985s> (C) Probe found 1 active port(s) on 1 HCA(s)
0 0.001997s> (C) vapi-conduit HCA properties (1 of 1) = {
0 0.002004s> (C) HCA id = 'mthca0'
0 0.002006s> (C) HCA vendor id = 0x2c9
0 0.002008s> (C) HCA vendor part id = 0x6274
0 0.002010s> (C) HCA hardware version = 0xa0
0 0.002012s> (C) HCA firmware version =
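
The line of most interest in that output is the "Probe found N active port(s)" summary. As a rough sketch, it can be pulled out of a saved trace as shown below. The pattern is taken only from the sample output above and may not match other GASNet versions; the function name probe_active_ports is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the active-port count from the GASNet
# "Probe found N active port(s) on M HCA(s)" trace line on stdin.
# Prints nothing if no such summary line is present.
probe_active_ports() {
    sed -n 's/.*Probe found \([0-9][0-9]*\) active port(s) on [0-9][0-9]* HCA(s).*/\1/p'
}

# Usage: save the output of the final command above to a file
# (trace.txt is a placeholder name), then run:
#   probe_active_ports < trace.txt
```

No output at all would suggest that the probe did not report a summary line, which is itself worth including in a reply.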
-Paul
--
Paul H. Hargrove PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900