Re: trouble using ibv-conduit

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Tue Jan 08 2008 - 18:58:06 PST

  • Next message: Mike Kucera: "UPC support in Eclipse CDT"
    Steven D. Vormwald wrote:
    > Paul H. Hargrove wrote:
    >> Steven,
    >>
    >> I am bcc:ing this reply to a former member of the Berkeley UPC team 
    >> who is now at QLogic.  He may respond with additional information.
    >>
    >> I am the author of the ibv-conduit code in Berkeley UPC/GASNet.  I am 
    >> afraid that I have not encountered the specific error you see.  
    >> However, I see that you are using the InfiniPath adapters, which I am 
    >> not certain support a full implementation of the OpenIB verbs.  
    >> Certainly all the MPI implementations that support QLogic's adapters 
    >> have been modified to use their "PSM" interface rather than IB Verbs.
    >>
    >> The only thing that I can suggest is to ensure that running 
    >> "ibv_devinfo" produces output something like the following:
    >>
    >> $ /opt/ofed/bin/ibv_devinfo
    >> hca_id: mthca0
    >>        fw_ver:                         1.2.0
    >>        node_guid:                      0005:ad00:0005:85a4
    >>        sys_image_guid:                 0005:ad00:0005:85a7
    >>        vendor_id:                      0x02c9
    >>        vendor_part_id:                 25204
    >>        hw_ver:                         0xA0
    >>        board_id:                       MT_0230000002
    >>        phys_port_cnt:                  1
    >>                port:   1
    >>                        state:                  PORT_ACTIVE (4)
    >>                        max_mtu:                2048 (4)
    >>                        active_mtu:             2048 (4)
    >>                        sm_lid:                 1
    >>                        port_lid:               1010
    >>                        port_lmc:               0x00
    >>
    >> If ibv_devinfo fails, then that means that the OpenIB verbs support 
    >> is not present.  If you do get output like the above, but don't see 
    >> "PORT_ACTIVE" then there is probably some configuration problem.  If 
    >> you do get output that indicates at least one ACTIVE port, then we 
    >> can start looking at GASNet details to figure where the problem lies.
    >>
    >> -Paul
    >>
    > Paul,
    >
    > Thank you for the prompt response.  It looks like there are active 
    > ports on all the nodes.
    >
    > n1 $ ibv_devinfo
    > hca_id: ipath0
    >        fw_ver:                         0.0.0
    >        node_guid:                      0011:7500:00ff:e309
    >        sys_image_guid:                 0011:7500:00ff:e309
    >        vendor_id:                      0x1fc1
    >        vendor_part_id:                 16
    >        hw_ver:                         0x2
    >        board_id:                       InfiniPath_QLE7140
    >        phys_port_cnt:                  1
    >                port:   1
    >                        state:                  PORT_ACTIVE (4)
    >                        max_mtu:                2048 (4)
    >                        active_mtu:             2048 (4)
    >                        sm_lid:                 2
    >                        port_lid:               3
    >                        port_lmc:               0x00
    >
    > Steven Vormwald
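
The three cases described above for ibv_devinfo (no verbs support, no
active port, at least one active port) can be sketched as a small shell
filter.  This is only an illustration: the function name classify_devinfo
and the three result strings are made up here, and the matching assumes
output shaped like the listings quoted in this thread.

```shell
# Classify ibv_devinfo output read from stdin.
# Prints one of: no-verbs, no-active-port, active
# (hypothetical helper; the labels are illustrative, not from any tool)
classify_devinfo() {
    awk '
        /hca_id:/                 { hcas++ }    # one line per adapter found
        /state:/ && /PORT_ACTIVE/ { active++ }  # one line per active port
        END {
            if (hcas == 0)        print "no-verbs"
            else if (active == 0) print "no-active-port"
            else                  print "active"
        }
    '
}
```

Running "ibv_devinfo 2>/dev/null | classify_devinfo" on a node should print
"active" for output like the listing above.  If ibv_devinfo is missing or
fails, the filter sees no adapter lines and prints "no-verbs".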
    
    Let us see what GASNet detects when it probes the hardware.  Please 
    run the following steps and send the output of the final command:
    
    $ cd [YOUR_BERKELEY_UPC_BUILD_DIR]
    $ cd dbg/gasnet/vapi-conduit
    $ make testgasnet-seq
    [...output omitted...]
    $ env GASNET_TRACEMASK=C GASNET_TRACEFILE=stdout \
        ./contrib/gasnetrun_vapi -n1 ./testgasnet | grep HCA
    
    When things are working correctly, you should expect output roughly like 
    the following:
    
    GASNet reporting enabled - tracing and statistical output directed to stdout
    0 0.001157s> (C) Probing HCAs for active ports
    0 0.001887s> (C) Probe found HCA 'mthca0'
    0 0.001976s> (C) Probe found HCA 'mthca0', port 1
    0 0.001985s> (C) Probe found 1 active port(s) on 1 HCA(s)
    0 0.001997s> (C) vapi-conduit HCA properties (1 of 1) = {
    0 0.002004s> (C)   HCA id                   = 'mthca0'
    0 0.002006s> (C)   HCA vendor id            = 0x2c9
    0 0.002008s> (C)   HCA vendor part id       = 0x6274
    0 0.002010s> (C)   HCA hardware version     = 0xa0
    0 0.002012s> (C)   HCA firmware version     =        
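
To check the probe result quickly, the active-port count can be pulled out
of the trace with a small filter.  This is a rough sketch: the function
name active_ports_from_trace is made up for illustration, and the pattern
assumes the summary line is worded exactly as in the sample output above,
which may differ between GASNet versions.

```shell
# Print the active-port count from the first "Probe found N active port(s)"
# summary line in GASNet trace output read from stdin.
# Prints nothing if no such line is present.
# (hypothetical helper; pattern copied from the sample trace in this message)
active_ports_from_trace() {
    sed -n 's/.*Probe found \([0-9][0-9]*\) active port(s) on [0-9][0-9]* HCA(s).*/\1/p' |
        head -n 1
}
```

Piping the traced testgasnet output through active_ports_from_trace in
place of "grep HCA" should print 1 for a trace like the sample.  Empty
output means the summary line never appeared.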
    
    
    -Paul
    
    -- 
    Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    Future Technologies Group
    HPC Research Department                   Tel: +1-510-495-2352
    Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
    
