Re: Defining block size during runtime


From: sainath l (ls.sainath_at_gmail_dot_com)
Date: Thu Jul 23 2009 - 23:24:29 PDT

Next message: Paul H. Hargrove: "Re: Defining block size during runtime"

Previous message: Paul H. Hargrove: "Re: Defining block size during runtime"
In reply to: Paul H. Hargrove: "Re: Defining block size during runtime"
Next in thread: Paul H. Hargrove: "Re: Defining block size during runtime"
Reply: Paul H. Hargrove: "Re: Defining block size during runtime"
Reply: Gary Funck: "Re: Defining block size during runtime"

Hi Paul,

I have attached my code. The first iteration runs until the deallocation
step, and then the code crashes with:

*** Caught a fatal signal: SIGSEGV(11) on node 0/16
_pmii_daemon(SIGCHLD): PE 0 exit signal Segmentation fault
[NID 26]Apid 315852: initiated application termination


Also, I would like to know: if I write micro-benchmarks for the collectives,
will using this data structure cause any problems (for example, overhead
incurred by the data structure itself)? Or should I just declare the arrays
statically and use those?

In practice, are the source and destination variables of collective
operations usually dynamically allocated? If so, does that degrade
performance?

Thank you very much.

Cheers,
Sainath

On Fri, Jul 24, 2009 at 3:54 AM, Paul H. Hargrove <PHHargrove_at_lbl_dot_gov> wrote:

> sainath,
>
>  There is no obvious reason why upc_free() would not work for Gary's
> datastructure.  Are you sure you are calling upc_free(a) from exactly one
> thread, and only after all threads have finished accessing the array?  Could
> you provide more information on how "it breaks"?
>
> -Paul
>
> sainath l wrote:
>
>> sorry. that was runtime
>>
>> Cheers
>> sainath
>>
>>
>> On Fri, Jul 24, 2009 at 12:45 AM, sainath l <ls.sainath_at_gmail_dot_com> wrote:
>>
>>    It's not possible to deallocate memory using upc_free for the
>>    derived data type that is given in Gary's example.
>>    Although the code compiles without any noise during compile time
>>    it breaks.
>>    Could someone tell me why this is the case? Also, is
>>    there a way to deallocate the memory for the array of structures
>>    in Gary's example?
>>
>>    cheers,
>>    sainath
>>
>>
>>    On Fri, Jul 24, 2009 at 12:23 AM, sainath l <ls.sainath_at_gmail_dot_com> wrote:
>>
>>        Hello ,
>>
>>        Thanks again Gary.
>>
>>
>>        Cheers,
>>        sainath
>>
>>
>>
>>
>>
>>
>>        On Thu, Jul 23, 2009 at 6:19 AM, Gary Funck <gary_at_intrepid_dot_com> wrote:
>>
>>            On 07/23/09 02:00:02, sainath l wrote:
>>            >    I am very much interested in knowing any workaround, if possible, for
>>            >    dynamically allocating an array with variable block size at runtime.
>>            >
>>            >    Let's say I want to know if it is possible to create the following array
>>            >    dynamically, where N and M are some variables. If yes, then how can we do
>>            >    it?
>>            >
>>            >    shared [M] int A[N][M];
>>
>>            Sainath,
>>            I'm not sure if this is what you're asking about, but attached is
>>            a program that uses a "trick" to ensure that each row of the array
>>            has affinity to a single thread, in a thread-cyclic fashion.
>>
>>            The trick is that by placing the row vector 'y' inside a
>>            struct, we ensure that y is allocated contiguously on a
>>            given thread.  And for each a[i+1] (based upon UPC's
>>            indexing rules) we know that it will be allocated on
>>            the next thread (in cyclic order) after the thread that holds a[i].
>>
>>            $ upc alloc_row_struct.upc -o alloc_row_struct
>>            $ alloc_row_struct -n 4 4 5
>>            threadof a[0].y[0] = 0
>>            threadof a[0].y[1] = 0
>>            threadof a[0].y[2] = 0
>>            threadof a[0].y[3] = 0
>>            threadof a[0].y[4] = 0
>>            threadof a[1].y[0] = 1
>>            threadof a[1].y[1] = 1
>>            threadof a[1].y[2] = 1
>>            threadof a[1].y[3] = 1
>>            threadof a[1].y[4] = 1
>>            threadof a[2].y[0] = 2
>>            threadof a[2].y[1] = 2
>>            threadof a[2].y[2] = 2
>>            threadof a[2].y[3] = 2
>>            threadof a[2].y[4] = 2
>>            threadof a[3].y[0] = 3
>>            threadof a[3].y[1] = 3
>>            threadof a[3].y[2] = 3
>>            threadof a[3].y[3] = 3
>>            threadof a[3].y[4] = 3
>>
>>            Above '-n 4' indicates that the program will run on 4 threads.
>>            That number was chosen to agree with the value of N (also 4)
>>            given above, but in fact could be any number.
>>
>>            Whether this is the best method, or even a recommended practice,
>>            for accomplishing your objective, I'm not sure.  Perhaps others
>>            on the list can offer some comment or suggest alternative methods?
>>
>>            - Gary
>>
>>
>>
>>
>>
>
> --
> Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
> Future Technologies Group                 Tel: +1-510-495-2352
> HPC Research Department                   Fax: +1-510-486-6900
> Lawrence Berkeley National Laboratory
>

application/octet-stream attachment: collectives.upc
