Re: Dynamic 2D Allocation


From: María J. Martín (maria.martin.santamaria_at_udc.es)
Date: Thu Dec 03 2009 - 05:44:33 PST

Next message: Andreev Nikita: "Clock synchronization among threads in BUPC"

Previous message: Oliver Perks: "Re: Dynamic 2D Allocation"
In reply to: Oliver Perks: "Re: Dynamic 2D Allocation"

"[]" specifies an indefinite block size: all the array elements
have affinity to the same thread. Space allocated with upc_alloc
always has the indefinite block size.
If a shared array is declared with indefinite block size, pointer-to-shared
arithmetic on it gives the same results as arithmetic on normal C pointers.
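
To make the difference concrete, here is a small plain-C model (not UPC, and
not anything taken from the compiler) of which thread upc_threadof(&p[j])
would report. It assumes the usual UPC layout rule, that p points at the start
of a block, and that block size 0 stands for "[]". The name model_threadof is
made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Plain-C model of the thread that element p[j] has affinity to, given
 * that p points at phase 0 of a block owned by thread `start`.
 * block_size == 0 stands for the indefinite block size "[]".
 * Hypothetical helper for illustration; it is not part of UPC.
 */
static int model_threadof(int start, size_t j, size_t block_size, int threads)
{
    assert(threads > 0 && start >= 0 && start < threads);
    if (block_size == 0)
        return start;   /* indefinite: indexing never leaves the thread */
    /* blocked or cyclic: move to the next thread every block_size elements */
    return (int)(((size_t)start + j / block_size) % (size_t)threads);
}
```

With 2 threads, and a[i] allocated by thread i % 2, a "shared int *" (block
size 1) gives model_threadof(i % 2, j, 1, 2) == (i + j) % 2, which is the
checkerboard pattern Oliver saw. With "shared [] int *" the result is i % 2
for every j, so the whole of a[i] stays on the thread that allocated it.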

As for the performance optimization: a generic pointer-to-shared
contains three fields: thread, block address and phase. When
performing pointer arithmetic on a pointer-to-shared, all three fields
have to be updated, which makes the operation slower than private pointer
arithmetic. The Berkeley UPC Compiler implements an optimization
called "phaseless" pointers for the common special cases of cyclic and
indefinite pointers. Cyclic pointers have a block size of one, so
their phase is always zero; indefinite pointers have a block size
of zero, and their phase is also defined to be zero since all elements
belong to the same UPC thread. Cyclic and indefinite pointers are thus
"phaseless", and the compiler exploits this knowledge to schedule more
efficient operations for them (see
http://www.gwu.edu/~upc/publications/performance.pdf for more details).
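
A rough plain-C sketch of why the phaseless cases are cheaper. The struct
and function names below are invented for illustration, and the address is
simplified to an element offset in each thread's local storage; this is not
Berkeley UPC's actual pointer representation, only a model of the bookkeeping
described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-field pointer-to-shared, for illustration only. */
typedef struct {
    int    thread;  /* thread the referenced element has affinity to */
    size_t phase;   /* position inside the current block */
    size_t addr;    /* element offset within that thread's local storage */
} model_sptr;

/* General case (block size >= 1): all three fields may change. */
static model_sptr inc_blocked(model_sptr p, size_t block_size, int threads)
{
    assert(block_size >= 1 && threads > 0);
    p.phase++;
    p.addr++;
    if (p.phase == block_size) {      /* stepped past the end of the block */
        p.phase = 0;
        p.addr -= block_size;         /* same block row on the next thread */
        p.thread++;
        if (p.thread == threads) {    /* wrapped around all threads */
            p.thread = 0;
            p.addr += block_size;     /* next block row, back on thread 0 */
        }
    }
    return p;
}

/* Cyclic (block size 1): phase is always 0, so it is never touched. */
static model_sptr inc_cyclic(model_sptr p, int threads)
{
    assert(threads > 0);
    p.thread++;
    if (p.thread == threads) {
        p.thread = 0;
        p.addr++;
    }
    return p;
}

/* Indefinite: thread and phase never change, as with a normal C pointer. */
static model_sptr inc_indefinite(model_sptr p)
{
    p.addr++;
    return p;
}
```

In this model, inc_blocked with a block size of 1 produces the same result as
inc_cyclic, only with extra work on the phase field, and inc_indefinite is a
single addition.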

Regards,

María

On 03/12/2009, at 11:37, Oliver Perks wrote:

> Thank you for your reply.
> This works much better. I had actually "fixed" the problem by using:
>
> shared [UPC_MAX_BLOCK_SIZE] int *shared * a;
>
> Your solution provides much better performance so thank you, but I  
> am still confused as to what this then uses as the block size?
>
> Regards
> Oliver
>
> María J. Martín wrote:
>> The pointer a is declared incorrectly.
>>
>> Try:
>>
>> shared[] int *shared * a;
>>
>> a = (shared[] int *shared *)upc_all_alloc(10,sizeof(shared int*));
>>
>> Regards,
>>
>> María
>>
>>
>>
>> On 02/12/2009, at 11:25, Oliver Perks wrote:
>>
>>> I have been trying to get a simple example working whereby a 2D
>>> array is striped across multiple processors, with each column
>>> placed on a different processor in a round-robin fashion.
>>> I assumed that this would be achieved by the code provided by Ben,
>>> but the results suggest otherwise. Can anyone shed some light on
>>> what I would have considered a rather simple problem?
>>>
>>>
>>> shared int *shared * a;
>>>
>>> a = (shared int *shared *)upc_all_alloc(10,sizeof(shared int*));
>>> upc_forall(int i = 0; i < 10; i++; i)
>>> {
>>>  a[i] = upc_alloc(10*sizeof(shared int));
>>>  for(int j = 0; j < 10; j++)
>>>  {
>>>     a[i][j] = i * j;
>>>     printf("Owner of %d - %d is %d\n", i, j, upc_threadof(&a[i][j]));
>>>  }
>>> }
>>> return 0;
>>>
>>>
>>> When run on 2 threads:
>>> I would expect this to put even columns on thread 0 and odd
>>> columns on thread 1, with each column entirely contained
>>> within that thread.
>>>
>>> 0   1   0   1   0   1   .....
>>> 0   1   0   1   0   1   .....
>>> 0   1   0   1   0   1   .....
>>> 0   1   0   1   0   1   .....
>>> 0   1   0   1   0   1   .....
>>> .     .   .   .   .   .   .
>>>
>>> But what I actually get is that it is striping each column over the
>>> processors.
>>>
>>> 0   1   0   1   0   1   .....
>>> 1   0   1   0   1   0   .....
>>> 0   1   0   1   0   1   .....
>>> 1   0   1   0   1   0   .....
>>> 0   1   0   1   0   1   .....
>>> .   .   .   .   .   .   .
>>>
>>> Any ideas?
>>> Regards Oliver
>>>
>>>
>>>
>>>
>>
>
