From: Jason Beech-Brandt (jason_at_ahpcrc_dot_org)
Date: Fri Apr 07 2006 - 14:18:49 PDT
Hi,

I'm having some trouble with Berkeley upc-2.2.2 on an SGI Altix/IA64.
I'm using the remote translator and Intel 9.0 compilers for the
backend.  I've attached a short UPC program, upc_test.upc, which
contains an array of pointers-to-shared that reference "Node" data
structures.  When each thread reads its own Node data through a private
pointer, I get some unexpected results.
Executing...
upcrun -n 4 ./upc_test
Thread 0 has nodes[0].x[0] equal to 0.000
Thread 0 has nodes[1].x[0] equal to 0.000
Thread 0 has nodes[2].x[0] equal to 0.000
Thread 0 has nodes[3].x[0] equal to 0.000
Thread 0 has nodes[4].x[0] equal to 0.000
Thread 1 has nodes[0].x[0] equal to 1.000
Thread 1 has nodes[1].x[0] equal to 0.000
Thread 1 has nodes[2].x[0] equal to 0.000
Thread 1 has nodes[3].x[0] equal to 0.000
Thread 1 has nodes[4].x[0] equal to 0.000
Thread 2 has nodes[0].x[0] equal to 2.000
Thread 2 has nodes[1].x[0] equal to 0.000
Thread 2 has nodes[2].x[0] equal to 0.000
Thread 2 has nodes[3].x[0] equal to 0.000
Thread 2 has nodes[4].x[0] equal to 0.000
Thread 3 has nodes[0].x[0] equal to 3.000
Thread 3 has nodes[1].x[0] equal to 0.000
Thread 3 has nodes[2].x[0] equal to 0.000
Thread 3 has nodes[3].x[0] equal to 0.000
Thread 3 has nodes[4].x[0] equal to 0.000
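which I believe is incorrect.  Thread 0 writes 1.0 * ip into x[0] of every
node owned by thread ip, so each thread should see its own thread number in
all five entries.  I believe it should output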
Thread 0 has nodes[0].x[0] equal to 0.000
Thread 0 has nodes[1].x[0] equal to 0.000
Thread 0 has nodes[2].x[0] equal to 0.000
Thread 0 has nodes[3].x[0] equal to 0.000
Thread 0 has nodes[4].x[0] equal to 0.000
Thread 1 has nodes[0].x[0] equal to 1.000
Thread 1 has nodes[1].x[0] equal to 1.000
Thread 1 has nodes[2].x[0] equal to 1.000
Thread 1 has nodes[3].x[0] equal to 1.000
Thread 1 has nodes[4].x[0] equal to 1.000
Thread 2 has nodes[0].x[0] equal to 2.000
Thread 2 has nodes[1].x[0] equal to 2.000
Thread 2 has nodes[2].x[0] equal to 2.000
Thread 2 has nodes[3].x[0] equal to 2.000
Thread 2 has nodes[4].x[0] equal to 2.000
Thread 3 has nodes[0].x[0] equal to 3.000
Thread 3 has nodes[1].x[0] equal to 3.000
Thread 3 has nodes[2].x[0] equal to 3.000
Thread 3 has nodes[3].x[0] equal to 3.000
Thread 3 has nodes[4].x[0] equal to 3.000
which is what the gcc-upc-3.4.4 compiler gives on the same machine
(cray-cc on the X1 gives the same result).  If I remove the structure
from the code and simply use an array, everything is fine.  Likewise, if I
remove the array member from the structure, leaving a structure
containing only a double, I get what I expect.
I'm not sure what's going on here, or even exactly where the problem is, 
but any input is appreciated.
Thanks,
Jason

upc_test.upc (attached):
#include <stdio.h>
#include <upc.h>
#define MAXP 10
#define NUME 5
typedef struct Node_rec {
   double x[3];
} Node;
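/* Private table, one copy per thread: pointers-to-shared to every thread's block (needs THREADS <= MAXP). */
shared [] Node *nodesSH[MAXP];
/* Private pointer to this thread's own block. */
Node *nodes;
/* Shared directory used to exchange the block pointers between threads. */
shared [] Node *shared nodesTMP[THREADS];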
int main(void){
   int ip,i;
   
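   /* Allocate NUME nodes in shared space with affinity to the calling thread. */
   nodesSH[MYTHREAD] = (shared [] Node *) upc_alloc(NUME * sizeof(Node));
   
   /* The block is local to this thread, so it can also be reached through a private pointer. */
   nodes = (Node *) nodesSH[MYTHREAD];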
   
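   upc_barrier;
   /* Publish this thread's block pointer in the shared directory. */
   nodesTMP[MYTHREAD] = nodesSH[MYTHREAD];
   upc_barrier;
   /* Copy every other thread's block pointer into the private table. */
   for (ip = 0; ip < THREADS; ip++)
      if (ip != MYTHREAD)
         nodesSH[ip] = nodesTMP[ip];
   upc_barrier;  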
   /* Thread 0 fills every thread's block through the pointers-to-shared. */
   if (MYTHREAD == 0){
      for (ip = 0; ip < THREADS; ip++)
         for (i = 0; i < NUME; i++){
            nodesSH[ip][i].x[0] = 1.0 * ip;
            nodesSH[ip][i].x[1] = 2.0 * ip;
            nodesSH[ip][i].x[2] = 3.0 * ip;
         }
   }
   upc_barrier;
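   /* Threads take turns, one per iteration, each printing its own block through the private pointer. */
   for (ip = 0; ip < THREADS; ip++){
      upc_barrier;
      if ( ip == MYTHREAD)
         for (i = 0; i < NUME; i++)
            printf("Thread %d has nodes[%d].x[0] equal to %4.3f\n", MYTHREAD,i,nodes[i].x[0]);
   }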
   upc_barrier;
   upc_free(nodesSH[MYTHREAD]);
   upc_global_exit(0);
}
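
One way to narrow this down might be to read each local node twice, once
through the pointer-to-shared and once through the private pointer, and
report any element where the two disagree.  The program below is only a
sketch of that check, not part of the original attachment: the names dir,
mine and local are hypothetical, and it assumes the same Node layout and
NUME as upc_test.upc.  It also checks with upc_threadof() that the block
really has affinity to the calling thread, since the cast to a private
pointer is only meaningful in that case.

```upc
#include <stdio.h>
#include <upc.h>

#define NUME 5

typedef struct Node_rec {
   double x[3];
} Node;

/* Hypothetical shared directory: one pointer-to-shared per thread. */
shared [] Node *shared dir[THREADS];

int main(void){
   int ip, i;
   shared [] Node *mine;   /* this thread's block, seen as shared */
   Node *local;            /* the same block, seen as private */

   mine = (shared [] Node *) upc_alloc(NUME * sizeof(Node));
   dir[MYTHREAD] = mine;
   upc_barrier;

   /* Thread 0 writes x[0] of every node in every thread's block. */
   if (MYTHREAD == 0){
      for (ip = 0; ip < THREADS; ip++){
         shared [] Node *p = dir[ip];
         for (i = 0; i < NUME; i++)
            p[i].x[0] = 1.0 * ip;
      }
   }
   upc_barrier;

   /* Casting to a private pointer assumes local affinity, so verify it. */
   if ((int) upc_threadof(mine) != MYTHREAD)
      printf("Thread %d: block does not have local affinity\n", MYTHREAD);

   /* Compare the two access paths element by element. */
   local = (Node *) mine;
   for (i = 0; i < NUME; i++){
      double via_shared = mine[i].x[0];
      double via_private = local[i].x[0];
      if (via_shared != via_private)
         printf("Thread %d: node %d differs, shared %4.3f vs private %4.3f\n",
                MYTHREAD, i, via_shared, via_private);
   }

   upc_barrier;
   upc_free(mine);
   return 0;
}
```

If the check prints nothing, both paths see the same data.  A mismatch
reported only for i > 0 would suggest that indexing the structure array
differs between the shared and the private access path, which could help
locate the problem.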