From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Tue Apr 13 2010 - 11:30:23 PDT
To expand slightly on what Costin said...

In UPC, blocksize is a property of types, not a property of memory. A
type has one well-defined blocksize, while the memory it references may
be "compatible" with many possible blocksizes. For example, consider
the following declaration:

    shared [2] int A[4*THREADS];

This array holds 4 ints per thread. Elements A[0] and A[1] are the
first two ints on thread 0, then A[2] and A[3] are the first two on
thread 1. That continues through all the threads until element
A[2*THREADS], which accesses the third int stored on thread 0.

However, if I do

    shared int *p = (shared int*)A;

then p has the default cyclic layout. So, p[0] is the same as A[0], but
p[1] is going to reference the first element on thread 1: the same
memory as A[2] rather than A[1]. This casting to a pointer of a
different blocksize is perfectly legal and potentially useful, though
potentially confusing as well.

Your code has done something similar to my example above, except
replacing the statically allocated array A with dynamically allocated
memory. The memory allocated in your code *is* "compatible" with the
desired declaration, but is equally compatible with any other
declaration that uses the same memory per thread. As Costin points out,
your code has cast the pointer to "shared int *" (and has produced
exactly the results expected of that type).

-Paul

Costin Iancu wrote:
> Your program allocates memory and casts the result to (shared int *),
> which is correctly initialized and printed.
> The bupc_alloc seems to have allocated the right number of bits in the
> right places.
>
> Seems to me you should cast the result of the alloc to
> (shared [ASIZE] int *) ...
>
> Since alloc returns (shared void*) I can see how the wording in the
> spec is a little misleading.
>
> Costin
> On Apr 13, 2010, at 5:45 AM, Reinhold Bader wrote:
>
>> Hello,
>>
>> trying out upc_all_alloc() to obtain a distributed object with the
>> following program:
>>
>> #include <upc.h>
>> #include <stdlib.h>
>> #include <stdio.h>
>>
>> #define ASIZE 4
>>
>> int main(void) {
>>   shared int *a;
>>   int i, j, q;
>>
>>   a = (shared int *) upc_all_alloc(THREADS,ASIZE*sizeof(int));
>>   for (i=0; i<ASIZE; i++) {
>>     a[MYTHREAD*ASIZE+i] = MYTHREAD;
>>   }
>>   upc_barrier;
>>   if (MYTHREAD == 0) {
>>     for (q=0; q<THREADS; q++) {
>>       for (i=0; i<ASIZE; i++) {
>>         j = upc_threadof(&a[q*ASIZE+i]);
>>         printf("a[%d][%d] is on thread %d with value %d\n",q,i,\
>>                j,a[q*ASIZE+i]);
>>       }
>>     }
>>   }
>>   upc_barrier;
>>   upc_free(a);
>>   return 0;
>> }
>>
>> I get the following output when running with 12 Threads:
>> a[0][0] is on thread 0 with value 0
>> a[0][1] is on thread 1 with value 0
>> a[0][2] is on thread 2 with value 0
>> a[0][3] is on thread 3 with value 0
>> a[1][0] is on thread 4 with value 1
>> a[1][1] is on thread 5 with value 1
>> a[1][2] is on thread 6 with value 1
>> a[1][3] is on thread 7 with value 1
>> ...
>>
>> This seems to contradict the specification which says
>> "The upc_all_alloc function allocates shared space compatible with
>> the following
>> declaration:
>>    shared [nbytes] char[nblocks * nbytes]."
>>
>> It appears the BUPC implementation performs
>>    shared [1] char[nblocks * nbytes]
>> instead.
>>
>> Regards
>> Reinhold
>>
>
> --
> Costin C. Iancu                          Phone: 510-495-2122
> Future Technologies Group                Fax:   510-486-6900
> Lawrence Berkeley National Laboratory
>

--
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group                 Tel: +1-510-495-2352
HPC Research Department                   Fax: +1-510-486-6900
Lawrence Berkeley National Laboratory
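
P.S. For anyone who wants to check the layout arithmetic by hand, here
is a rough sketch in plain C (not UPC) of how an element index maps to
a thread for a given blocksize. The helper names owner_thread and
local_index are made up for illustration and are not part of UPC or
Berkeley UPC; in real UPC code you would ask upc_threadof(), as the
original program does. It assumes the usual block-cyclic rule: blocks
of "block" consecutive elements are dealt out to the threads round
robin.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a UPC library call): the thread that has
 * affinity to element i when it is accessed through a type with
 * blocksize `block`, running with `threads` threads. */
size_t owner_thread(size_t i, size_t block, size_t threads)
{
    assert(block > 0 && threads > 0);
    return (i / block) % threads;
}

/* Hypothetical helper: position of element i within the owning
 * thread's local portion (0 is that thread's first element). */
size_t local_index(size_t i, size_t block, size_t threads)
{
    assert(block > 0 && threads > 0);
    return (i / (block * threads)) * block + i % block;
}
```

With blocksize 2 and, say, 4 threads, owner_thread(2, 2, 4) is 1 and
local_index(2, 2, 4) is 0, which is A[2] in the example above. With
blocksize 1, owner_thread(1, 1, 4) is also 1 and local_index(1, 1, 4)
is also 0, so p[1] and A[2] name the same memory. For the original
program with 12 threads, index 1*ASIZE+0 = 4 gives thread 4 under
blocksize 1 (the reported output) and thread 1 under blocksize ASIZE
(what the cast Costin suggests would give).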