Re: Defining block size during runtime

From: sainath l (ls.sainath_at_gmail_dot_com)
Date: Thu Jul 23 2009 - 23:24:29 PDT

  • Next message: Paul H. Hargrove: "Re: Defining block size during runtime"
    Hi Paul,
    
    I have attached my code. The first iteration runs as far as the
    deallocation step, and then the code crashes:
    
    *** Caught a fatal signal: SIGSEGV(11) on node 0/16
    _pmii_daemon(SIGCHLD): PE 0 exit signal Segmentation fault
    [NID 26]Apid 315852: initiated application termination
    
    
    Also, I would like to know: if I want to write micro-benchmarks for
    the collectives, will using this data structure cause any problems (for
    example, overhead incurred by the data structure)? Or should I just
    declare the arrays statically and use them?

    In practice, are the source and destination variables of collective
    operations generally allocated dynamically? If so, will that degrade
    performance?
    
    Thank you very much.
    
    Cheers,
    Sainath
    
    On Fri, Jul 24, 2009 at 3:54 AM, Paul H. Hargrove <PHHargrove_at_lbl_dot_gov> wrote:
    
    > sainath,
    >
    >  There is no obvious reason why upc_free() would not work for Gary's
    > data structure.  Are you sure you are calling upc_free(a) from exactly one
    > thread, and only after all threads have finished accessing the array?  Could
    > you provide more information on how "it breaks"?
    >
    > -Paul
    >
    > sainath l wrote:
    >
    >> Sorry, I meant that it breaks at runtime, not at compile time.
    >>
    >> Cheers
    >> sainath
    >>
    >>
    >> On Fri, Jul 24, 2009 at 12:45 AM, sainath l <ls.sainath_at_gmail_dot_com> wrote:
    >>
    >>    I am not able to deallocate memory using upc_free for the
    >>    derived data type that is given in Gary's example.
    >>    Although the code compiles without any noise, it breaks
    >>    during compile time.
    >>    Could someone tell me why this is the case? Also, is
    >>    there a way to deallocate the memory for the array of structures
    >>    in Gary's example?
    >>
    >>    cheers,
    >>    sainath
    >>
    >>
    >>    On Fri, Jul 24, 2009 at 12:23 AM, sainath l <ls.sainath_at_gmail_dot_com> wrote:
    >>
    >>        Hello ,
    >>
    >>        Thanks again Gary.
    >>
    >>
    >>        Cheers,
    >>        sainath
    >>
    >>        On Thu, Jul 23, 2009 at 6:19 AM, Gary Funck <gary_at_intrepid_dot_com> wrote:
    >>
    >>            On 07/23/09 02:00:02, sainath l wrote:
    >>            >    I am very much interested in knowing any workaround, if possible, for
    >>            >    dynamically allocating an array with variable block size at runtime.
    >>            >
    >>            >    Let's say I want to know if it is possible to create the following array
    >>            >    dynamically, where N and M are some variables. If yes, then how can we do
    >>            >    it?
    >>            >
    >>            >    shared [M] int A[N][M];
    >>
    >>            Sainath,
    >>            I'm not sure if this is what you're asking about, but attached is
    >>            a program that uses a "trick" to ensure that each row of the array
    >>            has affinity to a single thread, in a thread-cyclic fashion.
    >>
    >>            The trick is that by placing the row vector 'y' inside of a
    >>            struct, we ensure that y is allocated contiguously on a
    >>            given thread.  And for each a[i+1] (based upon UPC's
    >>            indexing rules) we know that it will be allocated on
    >>            the next thread (in cyclic order) after the thread that holds a[i].
    >>
    >>            $ upc alloc_row_struct.upc -o alloc_row_struct
    >>            $ alloc_row_struct -n 4 4 5
    >>            threadof a[0].y[0] = 0
    >>            threadof a[0].y[1] = 0
    >>            threadof a[0].y[2] = 0
    >>            threadof a[0].y[3] = 0
    >>            threadof a[0].y[4] = 0
    >>            threadof a[1].y[0] = 1
    >>            threadof a[1].y[1] = 1
    >>            threadof a[1].y[2] = 1
    >>            threadof a[1].y[3] = 1
    >>            threadof a[1].y[4] = 1
    >>            threadof a[2].y[0] = 2
    >>            threadof a[2].y[1] = 2
    >>            threadof a[2].y[2] = 2
    >>            threadof a[2].y[3] = 2
    >>            threadof a[2].y[4] = 2
    >>            threadof a[3].y[0] = 3
    >>            threadof a[3].y[1] = 3
    >>            threadof a[3].y[2] = 3
    >>            threadof a[3].y[3] = 3
    >>            threadof a[3].y[4] = 3
    >>
    >>            Above, '-n 4' indicates that the program will run on 4 threads.
    >>            That number was chosen to agree with the value of N (also 4)
    >>            given on the command line, but in fact could be any number.
    >>
    >>            Whether this is the best method, or even a recommended practice,
    >>            for accomplishing your objective, I'm not sure.  Perhaps others
    >>            on the list can offer some comment or suggest alternative methods?
    >>
    >>            - Gary
    >>
    >
    > --
    > Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    > Future Technologies Group                 Tel: +1-510-495-2352
    > HPC Research Department                   Fax: +1-510-486-6900
    > Lawrence Berkeley National Laboratory
    >
    
    

