Re: Defining block size during runtime

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Fri Jul 24 2009 - 15:00:58 PDT

Next message: sainath l: "Re: Defining block size during runtime"

Previous message: sainath l: "Re: Defining block size during runtime"
In reply to: sainath l: "Re: Defining block size during runtime"
Next in thread: sainath l: "Re: Defining block size during runtime"
Reply: sainath l: "Re: Defining block size during runtime"

I have run your code (I needed to provide a gettime.h) and did not see 
any errors.  I tried it on both an x86 cluster with Myrinet and on a Cray XT.

To answer your question: I don't believe that use of this data structure 
will cause any performance penalty for the collectives, since the 
structure is just a "trick" for indexing the block of data.  
Additionally, static-vs-dynamic allocation of memory should not have an 
effect on the collectives performance either.

--Paul

sainath l wrote:
> Hi Paul,
>
> I have attached my code. The first iteration runs till the 
> deallocation part and then the code breaks.
>
> *** Caught a fatal signal: SIGSEGV(11) on node 0/16
> _pmii_daemon(SIGCHLD): PE 0 exit signal Segmentation fault
> [NID 26]Apid 315852: initiated application termination
>
>
> Also, I would like to know: if I write micro-benchmarks for the 
> collectives, will using this data structure cause any problem (overhead 
> incurred by the data structure)?  Or should I just declare the arrays 
> statically and use them?
>
> In practice, are the source and destination variables of collective 
> operations generally allocated dynamically?  If so, will that degrade 
> performance?
>
> Thank you very much.
>
> Cheers,
> Sainath
>
> On Fri, Jul 24, 2009 at 3:54 AM, Paul H. Hargrove <PHHargrove_at_lbl_dot_gov> wrote:
>
>     sainath,
>
>     There is no obvious reason why upc_free() would not work for
>     Gary's data structure.  Are you sure you are calling upc_free(a)
>     from exactly one thread, and only after all threads have finished
>     accessing the array?  Could you provide more information on how
>     "it breaks"?
>
>     -Paul
>
>     sainath l wrote:
>
>         Sorry, that should have said it breaks at runtime.
>
>         Cheers
>         sainath
>
>
>         On Fri, Jul 24, 2009 at 12:45 AM, sainath l
>         <ls_dot_sainath_at_gmail_dot_com> wrote:
>
>            It's not possible to deallocate memory using upc_free for the
>            derived data type that is given in Gary's example.
>            Although the code compiles without any noise at compile time,
>            it breaks.
>            Could someone tell me why this is the case?  Also, is
>            there a way to deallocate the memory for the array of
>            structures in Gary's example?
>
>            cheers,
>            sainath
>
>
>            On Fri, Jul 24, 2009 at 12:23 AM, sainath l
>            <ls_dot_sainath_at_gmail_dot_com> wrote:
>
>                Hello ,
>
>                Thanks again Gary.
>
>
>                Cheers,
>                sainath
>
>                On Thu, Jul 23, 2009 at 6:19 AM, Gary Funck
>                <gary_at_intrepid_dot_com> wrote:
>
>                    On 07/23/09 02:00:02, sainath l wrote:
>                    >    I am very much interested in knowing any workaround,
>                    >    if possible, for dynamically allocating an array with
>                    >    variable block size at runtime.
>                    >
>                    >    Let's say I want to know if it is possible to create
>                    >    the following array dynamically, where N and M are
>                    >    some variables.  If yes, then how can we do it?
>                    >
>                    >    shared [M] int A[N][M];
>
>                    Sainath,
>                    I'm not sure if this is what you're asking about, but
>                    attached is a program that uses a "trick" to ensure
>                    that each row of the array has affinity to a single
>                    thread, in a thread-cyclic fashion.
>
>                    The trick is that by placing the row vector 'y' inside
>                    of a struct, we ensure that y is allocated contiguously
>                    on a given thread.  And for each a[i+1] (based upon
>                    UPC's indexing rules) we know that it will be allocated
>                    on the next thread (in cyclic order) after the thread
>                    that holds a[i].
>
>                    $ upc alloc_row_struct.upc -o alloc_row_struct
>                    $ alloc_row_struct -n 4 4 5
>                    threadof a[0].y[0] = 0
>                    threadof a[0].y[1] = 0
>                    threadof a[0].y[2] = 0
>                    threadof a[0].y[3] = 0
>                    threadof a[0].y[4] = 0
>                    threadof a[1].y[0] = 1
>                    threadof a[1].y[1] = 1
>                    threadof a[1].y[2] = 1
>                    threadof a[1].y[3] = 1
>                    threadof a[1].y[4] = 1
>                    threadof a[2].y[0] = 2
>                    threadof a[2].y[1] = 2
>                    threadof a[2].y[2] = 2
>                    threadof a[2].y[3] = 2
>                    threadof a[2].y[4] = 2
>                    threadof a[3].y[0] = 3
>                    threadof a[3].y[1] = 3
>                    threadof a[3].y[2] = 3
>                    threadof a[3].y[3] = 3
>                    threadof a[3].y[4] = 3
>
>                    Above, '-n 4' indicates that the program will run on
>                    4 threads.  That number was chosen to agree with the
>                    value of N (also 4) given above, but in fact could be
>                    any number.
>
>                    Whether this is the best method, or even a recommended
>                    practice, for accomplishing your objective, I'm not
>                    sure.  Perhaps others on the list can offer some
>                    comment or suggest alternative methods?
>
>                    - Gary
>
>     -- 
>     Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
>     Future Technologies Group                 Tel: +1-510-495-2352
>     HPC Research Department                   Fax: +1-510-486-6900
>     Lawrence Berkeley National Laboratory    
>
>


-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group                 Tel: +1-510-495-2352
HPC Research Department                   Fax: +1-510-486-6900
Lawrence Berkeley National Laboratory
