Re: upc with mpich2

From: David J. Biesack (David.Biesack_at_sas_dot_com)
Date: Wed May 07 2008 - 13:16:46 PDT

    > Date: Wed, 07 May 2008 12:23:47 -0700
    > From: "Paul H. Hargrove" <PHHargrove_at_lbl_dot_gov>
    > CC: upc-users_at_lbl_dot_gov
    > 
    > David,
    > 
    >   Based on your description of the problem, it looks like the correct
    > "mpicc" *is* getting used, but the incorrect "mpirun". 
    
    I still believe it is using the wrong mpicc.
    The upcc -v output pretty clearly runs /usr/bin/mpicc :
    
    > >   /usr/bin/mpicc   -o 'cpi' ,,,,
    
    and not /acl/usr/local/mpich2/bin/mpicc as configured (or as found on my PATH or set in MPICC).
    
    > I believe you simply need to reconfigure adding
    > MPIRUN_CMD=/acl/usr/local/mpich2/bin/mpirun to the configure command.
    >   Please let us know if that does not solve your problem.
    > 
    > -Paul
    
    Thanks for the tip. I did not see any information about setting
    MPIRUN_CMD in http://upc.lbl.gov/download/dist/INSTALL
    and it is not mentioned in http://upc.lbl.gov/docs/user/upcrun.html either.
    Perhaps someone could add a mention of it?
    
    I tried setting 
    
      MPIRUN_CMD=/acl/usr/local/mpich2/bin/mpirun
    
    as an environment variable. I got a diagnostic telling me that I must
    include %P and %A and %N in the command, so I tried:
    
      $ MPIRUN_CMD="/acl/usr/local/mpich2/bin/mpirun -np %N '%P' '%A'"
      $ export MPIRUN_CMD
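
To make the template above concrete, here is a minimal sketch of how the placeholders could be filled in. It assumes %N is the process count, %P the program, and %A the program's arguments; expand_mpirun_cmd is a hypothetical helper for illustration only and is not part of upcrun, which performs its own substitution.

```shell
# Hypothetical helper (not part of upcrun): expand an MPIRUN_CMD-style template.
# Assumed meanings: %N = process count, %P = program, %A = program arguments.
# Values containing '|', '&' or backslashes would need escaping for sed.
expand_mpirun_cmd() {
    template=$1
    nprocs=$2
    program=$3
    args=$4
    printf '%s\n' "$template" |
        sed -e "s|%N|$nprocs|g" -e "s|%P|$program|g" -e "s|%A|$args|g"
}

# Show what "upcrun -np 4 cpi" might hand to the shell with the template above.
expand_mpirun_cmd "/acl/usr/local/mpich2/bin/mpirun -np %N '%P' '%A'" 4 ./cpi ""
```

Printing the expanded command this way is only a sanity check on quoting; it says nothing about which MPI library the program itself was linked against.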
    
    I'm closer: this now tries to run on four nodes, but I get four
    diagnostics about LAM/MPI not running. I think this still indicates
    that the executable was linked by /usr/bin/mpicc instead
    of /acl/usr/local/mpich2/bin/mpicc:
    
      $ upcrun -np 4 cpi
      -----------------------------------------------------------------------------
      -----------------------------------------------------------------------------
    
      It seems that there is no lamd running on the host acl211.unx.sas.com.
    
      This indicates that the LAM/MPI runtime environment is not operating.
      The LAM/MPI runtime environment is necessary for MPI programs to run
      (the MPI program tired to invoke the "MPI_Init" function).
    
      Please run the "lamboot" command the start the LAM/MPI runtime
      environment.  See the LAM/MPI documentation for how to invoke
      "lamboot" across multiple machines.
      -----------------------------------------------------------------------------
    
      It seems that there is no lamd running on the host acl211.unx.sas.com.
    
      This indicates that the LAM/MPI runtime environment is not operating.
      The LAM/MPI runtime environment is necessary for MPI programs to run
      (the MPI program tired to invoke the "MPI_Init" function).
    
      Please run the "lamboot" command the start the LAM/MPI runtime
      environment.  See the LAM/MPI documentation for how to invoke
      "lamboot" across multiple machines.
      -----------------------------------------------------------------------------
      -----------------------------------------------------------------------------
    
      It seems that there is no lamd running on the host acl210.
    
      This indicates that the LAM/MPI runtime environment is not operating.
      The LAM/MPI runtime environment is necessary for MPI programs to run
      -----------------------------------------------------------------------------
    
      It seems that there is no lamd running on the host acl210.
    
      This indicates that the LAM/MPI runtime environment is not operating.
      The LAM/MPI runtime environment is necessary for MPI programs to run
      (the MPI program tired to invoke the "MPI_Init" function).
    
      Please run the "lamboot" command the start the LAM/MPI runtime
      environment.  See the LAM/MPI documentation for how to invoke
      "lamboot" across multiple machines.
      -----------------------------------------------------------------------------
      (the MPI program tired to invoke the "MPI_Init" function).
    
      Please run the "lamboot" command the start the LAM/MPI runtime
      environment.  See the LAM/MPI documentation for how to invoke
      "lamboot" across multiple machines.
      -----------------------------------------------------------------------------
      $ 
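
One way to test the suspicion that the executable was linked against LAM/MPI is to look at its shared-library dependencies. The sketch below assumes dynamic linking and assumes that LAM/MPI appears in ldd output as "liblam" and MPICH2 as "libmpich"; classify_mpi_libs is a hypothetical helper, and a statically linked binary would show neither name.

```shell
# Hypothetical helper: classify ldd-style output read from stdin.
# Assumption: LAM/MPI shows up as "liblam", MPICH2 as "libmpich".
classify_mpi_libs() {
    libs=$(cat)
    case $libs in
        *liblam*)   echo "LAM/MPI" ;;
        *libmpich*) echo "MPICH" ;;
        *)          echo "unknown" ;;
    esac
}

# Run against the executable from this thread, if it exists here.
if [ -x ./cpi ]; then
    ldd ./cpi | classify_mpi_libs
fi
```

If this reports LAM/MPI, changing mpirun alone cannot help, because the MPI library is fixed at link time.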
    
    I also changed the ./configure options to specify the mpich2 mpirun:
    
      configure CC=cc CXX=c++ MPI_CC=/acl/usr/local/mpich2/bin/mpicc MPIRUN_CMD="/acl/usr/local/mpich2/bin/mpirun -np %N '%P' '%A'"
    
    and rebuilt, but I think mpicc is still incorrect; I get the same errors as above.
    
    (Note that on my first attempt, I ran
    
      configure CC=cc CXX=c++ MPI_CC=mpicc
    
    and that appeared to be ignored as well; instead it ran /usr/bin/mpicc.)
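
To compare the two wrappers directly, the following sketch prints which mpicc the shell resolves first and asks each wrapper what it would run. It assumes MPICH-style wrappers accept -show and LAM-style wrappers accept -showme; describe_mpicc is a hypothetical helper, and a wrapper that supports neither option just gets a note.

```shell
# Hypothetical helper: report what an MPI compiler wrapper would invoke.
# Assumption: MPICH-style wrappers accept -show, LAM-style ones -showme.
describe_mpicc() {
    wrapper=$1
    if [ ! -x "$wrapper" ]; then
        echo "$wrapper: not found"
        return 1
    fi
    echo "== $wrapper"
    "$wrapper" -show 2>/dev/null || "$wrapper" -showme 2>/dev/null ||
        echo "(wrapper does not accept -show or -showme)"
}

# The wrapper an unqualified "mpicc" resolves to in this shell.
echo "first mpicc on PATH: $(command -v mpicc || echo none)"

# The two wrappers discussed in this thread.
describe_mpicc /usr/bin/mpicc || true
describe_mpicc /acl/usr/local/mpich2/bin/mpicc || true
```

The include and library paths in the output should make it clear which MPI installation each wrapper belongs to.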
    
    > -- 
    > Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    > Future Technologies Group
    > HPC Research Department                   Tel: +1-510-495-2352
    > Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
    
    -- 
    David J. Biesack     SAS Institute Inc.
    (919) 531-7771       SAS Campus Drive
    http://www.sas.com   Cary, NC 27513
    
