From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Tue Apr 13 2010 - 11:30:23 PDT
To expand slightly on what Costin said...
In UPC, block size is a property of a type, not a property of memory.
A type has exactly one well-defined block size, while the memory it
references may be "compatible" with many possible block sizes.
For example, consider the following declaration:
shared [2] int A[4*THREADS];
This array holds 4 ints per thread. Elements A[0] and A[1] are the
first two ints on thread 0, and A[2] and A[3] are the first two on
thread 1. That continues through all the threads, so element
A[2*THREADS] is the third int stored on thread 0. However, if I do
shared int *p = (shared int*)A;
then p has the default cyclic layout (block size 1). So p[0] is the
same as A[0], but p[1] references the first element on thread 1: the
same memory as A[2] rather than A[1]. Casting to a pointer with a
different block size is perfectly legal and potentially useful, though
potentially confusing as well.
Your code has done something similar to my example above, except
replacing the statically allocated array A with dynamically allocated
memory.
The memory allocated in your code *is* "compatible" with the desired
declaration, but is equally compatible with any other declaration that
uses the same memory per-thread.
As Costin points out, your code has cast the pointer to "shared int *"
(and has produced exactly the results expected of that type).
-Paul
Costin Iancu wrote:
> Your program allocates memory and casts the result to (shared int *),
> which is correctly initialized and printed.
> The bupc_alloc seems to have allocated the right number of bits in the
> right places.
>
> Seems to me you should cast the result of the alloc to
> (shared [ASIZE] int *) ...
>
> Since alloc returns (shared void*) I can see how the wording in the
> spec is a little misleading.
>
> Costin
> On Apr 13, 2010, at 5:45 AM, Reinhold Bader wrote:
>
>> Hello,
>>
>> trying out upc_all_alloc() to obtain a distributed object with the
>> following program:
>>
>> #include <upc.h>
>> #include <stdlib.h>
>> #include <stdio.h>
>>
>> #define ASIZE 4
>>
>> int main(void) {
>>     shared int *a;
>>     int i, j, q;
>>
>>     a = (shared int *) upc_all_alloc(THREADS,ASIZE*sizeof(int));
>>     for (i=0; i<ASIZE; i++) {
>>         a[MYTHREAD*ASIZE+i] = MYTHREAD;
>>     }
>>     upc_barrier;
>>     if (MYTHREAD == 0) {
>>         for (q=0; q<THREADS; q++) {
>>             for (i=0; i<ASIZE; i++) {
>>                 j = upc_threadof(&a[q*ASIZE+i]);
>>                 printf("a[%d][%d] is on thread %d with value %d\n",q,i,\
>>                        j,a[q*ASIZE+i]);
>>             }
>>         }
>>     }
>>     upc_barrier;
>>     upc_free(a);
>>     return 0;
>> }
>>
>> I get the following output when running with 12 Threads:
>> a[0][0] is on thread 0 with value 0
>> a[0][1] is on thread 1 with value 0
>> a[0][2] is on thread 2 with value 0
>> a[0][3] is on thread 3 with value 0
>> a[1][0] is on thread 4 with value 1
>> a[1][1] is on thread 5 with value 1
>> a[1][2] is on thread 6 with value 1
>> a[1][3] is on thread 7 with value 1
>> ...
>>
>> This seems to contradict the specification which says
>> "The upc_all_alloc function allocates shared space compatible with
>> the following
>> declaration:
>> shared [nbytes] char[nblocks * nbytes]."
>>
>> It appears the BUPC implementation performs
>> shared [1] char[nblocks * nbytes]
>> instead.
>>
>> Regards
>> Reinhold
>>
>>
>>
>
> --
> Costin C. Iancu
> Phone: 510-495-2122
>
> Future Technologies Group
> Fax: 510-486-6900
> Lawrence Berkeley National Laboratory
>
>
>
--
Paul H. Hargrove PHHargrove_at_lbl_dot_gov
Future Technologies Group Tel: +1-510-495-2352
HPC Research Department Fax: +1-510-486-6900
Lawrence Berkeley National Laboratory