RE: a problem of "unable to open any HCA ports" when upcrun a program

From: 卢兴敬 (xingjinglu_at_gmail_dot_com)
Date: Wed Oct 29 2008 - 15:51:39 PST

  • Next message: luxingjing: "RE: a problem of "unable to open any HCA ports" when upcrun a program"
    Paul,
      I ran "vstat" on my UPC_NODES, and it reports the following:
    1 HCA found:
            hca_id=InfiniHost0
            pci_location={BUS=0x04,DEV/FUNC=0x00}
            vendor_id=0x02C9
            vendor_part_id=0x5A44
            hw_ver=0xA1
            fw_ver=3.5.0
            PSID=MT_0030000001
            num_phys_ports=2
                    port=1
                    port_state=PORT_ACTIVE
                    sm_lid=0x0009
                    port_lid=0x0041
                    port_lmc=0x00
                    max_mtu=2048
    
                    port=2
                    port_state=PORT_DOWN
                    sm_lid=0x0000
                    port_lid=0x0042
                    port_lmc=0x00
                    max_mtu=2048
    I am not familiar with the network, but I think only port 1 is active.
    So I tried: ssh 12.11.11.7 -D 1  , and the result below seems to mean that
    I do not have permission to do so.
    -----------------------------------------------------
    autopar@gnode8:~> ssh 12.11.11.7 -D 1
    Privileged ports can only be forwarded by root.
    ------------------------------------------------
    I am now looking for more information about InfiniBand. Do you think the
    lack of a properly active port is what causes the problem?
    
    Thank you, I look forward to your reply!
    -----Original Message-----
    From: Paul H. Hargrove [mailto:hargrove_at_hpcrd_dot_lbl_dot_gov] 
    Sent: Thursday, October 30, 2008 5:44 AM
    To: 卢兴敬
    Subject: a problem of "unable to open any HCA ports" when upcrun a program
    
    Email to [email protected] failed, so I am resending to your gmail 
    address:
    
    This error indicates that one or more nodes failed to locate an InfiniBand
    adapter.
    
    It is possible that the InfiniBand hardware and/or software has not been
    set up on one or more nodes. You should see the message you quote once
    for each failing attempt. If you see fewer than 32 instances, then it is
    possible that at least one node does have a working InfiniBand
    configuration.
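    
    A minimal sketch of that count, assuming the upcrun output has been
    captured as text and that each failing attempt prints the quoted phrase
    on a line of its own. The function name and the sample text are
    hypothetical stand-ins, not real upcrun output:

```python
def count_hca_failures(output, marker="unable to open any HCA ports"):
    """Count the lines of captured output that contain the error phrase."""
    return sum(1 for line in output.splitlines() if marker in line)


# Hypothetical captured output: three failing attempts plus one unrelated line.
sample_output = "\n".join(
    ["unable to open any HCA ports"] * 3 + ["some unrelated line"]
)

failures = count_hca_failures(sample_output)
expected = 32  # number of attempts mentioned in the message above
if failures < expected:
    print(f"{failures} of {expected} attempts failed; some may have a working setup")
else:
    print(f"all {expected} attempts failed")
```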
    
    On each of the machines in your UPC_NODES, try running the "vstat"
    utility to see if it reports at least one "PORT_ACTIVE" line on each.
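    
    A rough sketch of that check, assuming the vstat output from a node has
    been saved as text in the "port=N" / "port_state=..." layout shown
    earlier in this thread. The function name is hypothetical and the
    sample is an abbreviated copy of that output:

```python
import re


def active_ports(vstat_text):
    """Return the port numbers whose port_state is PORT_ACTIVE."""
    active = []
    current_port = None
    for raw_line in vstat_text.splitlines():
        line = raw_line.strip()
        match = re.fullmatch(r"port=(\d+)", line)
        if match:
            # Remember which port the following port_state line belongs to.
            current_port = int(match.group(1))
        elif line == "port_state=PORT_ACTIVE" and current_port is not None:
            active.append(current_port)
    return active


# Abbreviated copy of the vstat output quoted in this thread.
sample_vstat = """\
1 HCA found:
        hca_id=InfiniHost0
        num_phys_ports=2
                port=1
                port_state=PORT_ACTIVE
                port=2
                port_state=PORT_DOWN
"""

ports = active_ports(sample_vstat)
print("active ports:", ports)
```

    An empty result for a node would suggest that it has no active port.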
    
    It is also possible that the InfiniBand libraries are present but there
    is no hardware at all.
    
    -Paul
    
