Re: Odd problem with shared array

From: Kathy Yelick (yelick_at_EECS_dot_Berkeley_dot_EDU)
Date: Wed Jan 28 2009 - 17:03:18 PST

  • Next message: Paul H. Hargrove: "Re: Odd problem with shared array"
    Just a correction: Costin Iancu is still working on compiler bug
    fixes, and we are also adding a new compiler person to the team later
    this spring.  The rest of Paul's mail was correct, so please go ahead
    with the bug report.
    
        Kathy
    
    ------------------------------------------------
    Katherine Yelick
    Lawrence Berkeley National Laboratory
    One Cyclotron Rd., MS 50B-4230
    Berkeley, CA  94720
    Email: kayelick_at_lbl_dot_gov
    Phone: 510-495-2431
    
    On Jan 26, 2009, at 11:18 PM, Paul H. Hargrove wrote:
    
    > Steven,
    >
    > Thanks for the bug report.  I am pretty sure you have found a bug in  
    > our UPC compiler.  However, there is nobody actively working on  
    > compiler bugs right now.  So, if you could please go to
    > http://upc-bugs.lbl.gov/bugzilla/ and enter your information in a bug
    > report, that will ensure we
    > don't lose track of it.  The initial bug entry page does not allow
    > attachments, but once the initial report has been entered, you will  
    > have the option to attach files.
    >
    > Thanks,
    > -Paul
    >
    > Steven Vormwald wrote:
    >> Hello,
    >>
    >> I've come across an odd problem that seems to only come up with  
    >> structs with 2-dimensional arrays of size 1x1.  The attached code  
    >> provides an example of this.  When run with N=1 (and 4 threads),  
    >> the output is unexpectedly:
    >>       0       1
    >> 0       0       1
    >> 1       2       3
    >>
    >>       0       1
    >> 0       0       1
    >> 1       2       3
    >>
    >>       0       1
    >> 0       0       704643072
    >> 1       2752512 10752
    >>
    >> instead of
    >>
    >>       0       1
    >> 0       0       1
    >> 1       2       3
    >>
    >>       0       1
    >> 0       0       1
    >> 1       2       3
    >>
    >>       0       1
    >> 0       2       3
    >> 1       6       11
    >>
    >> Using any value of N other than 1 generates the correct output for
    >> the number of threads.  Even odder is the output I get when I enable
    >> the debugging output in the code:
    >>       0       1
    >> 0       0       1
    >> 1       2       3
    >>
    >>       0       1
    >> 0       0       1
    >> 1       2       3
    >>
    >> C[00].local_block[00][00] = 0 + 0 * 0
    >>                         = 0 + 0
    >>                         = 0
    >> C[00].local_block[00][00] = 0 + 1 * 2
    >>                         = 0 + 0
    >>                         = 0
    >> C[01].local_block[00][00] = 0 + 0 * 1
    >>                         = 0 + 0
    >>                         = 704643072
    >> C[01].local_block[00][00] = 704643072 + 1 * 3
    >>                         = 704643072 + 0
    >>                         = 704643072
    >> C[02].local_block[00][00] = 0 + 2 * 0
    >>                         = 0 + 0
    >>                         = 2752512
    >> C[02].local_block[00][00] = 2752512 + 3 * 2
    >>                         = 2752512 + 0
    >>                         = 2752512
    >> C[03].local_block[00][00] = 0 + 2 * 1
    >>                         = 0 + 0
    >>                         = 10752
    >> C[03].local_block[00][00] = 10752 + 3 * 3
    >>                         = 10752 + 0
    >>                         = 10752
    >>
    >>       0       1
    >> 0       0       704643072
    >> 1       2752512 10752
    >>
    >> Note that the values of A[] and B[] are printed correctly on the  
    >> first line, but the results of the multiplication and store in C[]  
    >> are incorrect.
    >>
    >> Changing the code to use floats or doubles instead of ints  
    >> generates similar problems.  However, if the arrays are allocated  
    >> dynamically with upc_all_alloc(), the program works correctly.  I  
    >> tested the code on versions 2.4 (mpi), 2.6 (smp,ibv), and 2.8  
    >> (smp,ibv) of the Berkeley UPC compiler, all of which produce the  
    >> same problem.  I haven't been able to test it on another machine,  
    >> so it might be a configuration issue, or a problem with the local C  
    >> compiler (gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-3)).
    >>
    >> Attached is the source code that was used, the output of
    >> 'upcc -version' for each of the versions of the compiler used, as well as
    >> the output run on 4 threads with N=1,2.  I fixed the order of the  
    >> output lines so they lined up properly, but otherwise did not  
    >> change the output.
    >>
    >> Steven Vormwald
    >
    >
    > -- 
    > Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    > Future Technologies Group                 Tel: +1-510-495-2352
    > HPC Research Department                   Fax: +1-510-486-6900
    > Lawrence Berkeley National Laboratory
    
