Re: Question regarding blocksize

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Tue Mar 23 2010 - 12:57:03 PDT

Next message: Paul H. Hargrove: "Re: Expense of BUPC timer functions"

Previous message: Yaakoub El Khamra: "Question regarding blocksize"
In reply to: Yaakoub El Khamra: "Question regarding blocksize"
Next in thread: Yaakoub El Khamra: "Re: Question regarding blocksize"
Reply: Yaakoub El Khamra: "Re: Question regarding blocksize"

Yaakoub,

The UPC spec permits an implementation to determine its own maximum 
block size, which is available as the preprocess-time constant 
UPC_MAX_BLOCK_SIZE.
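
Because the limit is visible to the preprocessor, a program can adapt to it at compile time instead of failing with the error quoted below. The following is only a minimal sketch: WANTED_BLOCK_SIZE, BLOCK_SIZE and blocks_needed() are hypothetical names, and the #ifndef fallback (using the 1048576 value from the error message in this thread) is an assumption added so the fragment also builds with a plain C compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A UPC compiler predefines UPC_MAX_BLOCK_SIZE. This fallback is an
 * assumption for building outside UPC; 1048576 is the value reported in
 * the error message quoted in this thread. */
#ifndef UPC_MAX_BLOCK_SIZE
#define UPC_MAX_BLOCK_SIZE 1048576
#endif

/* Hypothetical block size the application would like to use. */
#define WANTED_BLOCK_SIZE 4194304

/* Clamp at preprocessing time to what the implementation supports. */
#if WANTED_BLOCK_SIZE > UPC_MAX_BLOCK_SIZE
#define BLOCK_SIZE UPC_MAX_BLOCK_SIZE
#else
#define BLOCK_SIZE WANTED_BLOCK_SIZE
#endif

/* Number of blocks needed to hold n elements at the chosen block size. */
static size_t blocks_needed(size_t n)
{
    /* Guard the round-up addition against overflow. */
    assert(n <= SIZE_MAX - ((size_t)BLOCK_SIZE - 1));
    return (n + (size_t)BLOCK_SIZE - 1) / (size_t)BLOCK_SIZE;
}
```

In a UPC source file BLOCK_SIZE could then appear in a layout qualifier, so the same code builds against runtimes configured with different limits.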

In the case of Berkeley UPC we have the ability to trade off blocksize 
limitations for thread-count limitations, by adjusting at configure time 
how many bits in the 64-bit "packed" representation of a shared pointer 
are used for each field.  By default on 64-bit systems we devote 20 
bits to "phase", which yields the 2^20 limit on blocksize, and 10 bits 
to "thread", which limits runs to 1024 UPC threads.  That leaves 34 
bits for "addressing", which limits one to 16GB of shared heap per 
UPC thread.  OR, one can choose to use a 128-bit struct representation 
which is, for most practical purposes, unlimited (2^32 threads, 2^32 max 
blocksize and 2^64 shared heap per thread).  Unfortunately, the struct 
representation results in slightly lower performance.
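
The arithmetic behind the default split (20 + 10 + 34 = 64 bits) can be sketched in plain C as below. This is an illustration only, not the Berkeley UPC runtime's code: the type name packed_sptr, the helper names, and in particular the ORDER of the fields within the word are assumptions; only the field widths come from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths of the default 64-bit packed representation (from the text). */
enum { PHASE_BITS = 20, THREAD_BITS = 10, ADDR_BITS = 34 };

/* Field order within the word is an assumption made for illustration. */
#define ADDR_SHIFT   0
#define THREAD_SHIFT ADDR_BITS
#define PHASE_SHIFT  (ADDR_BITS + THREAD_BITS)

/* Largest value representable in a field of the given width. */
#define FIELD_MASK(bits) ((UINT64_C(1) << (bits)) - 1)

typedef uint64_t packed_sptr; /* hypothetical name */

/* Pack the three fields; each must fit in its allotted number of bits. */
static packed_sptr sptr_pack(uint64_t phase, uint64_t thread, uint64_t addr)
{
    assert(phase  <= FIELD_MASK(PHASE_BITS));  /* phase < 2^20 */
    assert(thread <= FIELD_MASK(THREAD_BITS)); /* thread < 1024 */
    assert(addr   <= FIELD_MASK(ADDR_BITS));   /* offset < 16GB */
    return (phase << PHASE_SHIFT) | (thread << THREAD_SHIFT)
         | (addr << ADDR_SHIFT);
}

static uint64_t sptr_phase(packed_sptr p)
{
    return (p >> PHASE_SHIFT) & FIELD_MASK(PHASE_BITS);
}

static uint64_t sptr_thread(packed_sptr p)
{
    return (p >> THREAD_SHIFT) & FIELD_MASK(THREAD_BITS);
}

static uint64_t sptr_addr(packed_sptr p)
{
    return (p >> ADDR_SHIFT) & FIELD_MASK(ADDR_BITS);
}
```

Moving a bit from one field to another doubles one limit and halves the other, which is the configure-time trade-off described above.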

Given the relatively large node count and memory per node, I cannot see 
a "good" trade-off being selected for Ranger - either one has too few 
thread-bits to come close to spanning the core count of Ranger, or one 
has too small a max blocksize to utilize the large per-core memory via 
large blocksized arrays.  Not every UPC code/user needs all of the 
cores, nor large blocksize arrays, but we cannot assume that no users 
will ever need either.  I think that the next time we build BUPC for 
Ranger we should consider lifting some of these limits (see below).

On the subject of "the next time we build BUPC for Ranger", the version 
available on Ranger via "module load beta upc" is 2.8.0, while 2.10.0 
was released in Nov 2009.  I think it is time (after we pass some 
proposal deadlines on April 6) that we look at building BUPC 2.10.0 on 
Ranger.  AND we can see about addressing the 
max-blocksize-vs-max-threads trade-off.  What I would propose is that 
we use our "multiconf" capability to build multiple versions of the 
runtime (as 
we do for -g vs -O) that are selected based on command line options to 
upcc.  With just some minor config file additions one could have 4 types 
of builds (8 total due to debug-vs-opt) that would be selected from 
based on the presence or absence of two flags made up for this purpose: 
--large-block-size and --large-thread-count (as suggestions).  With 
neither we'd use the defaults, with only --large-block-size we'd 
increase the max block size at the expense of max thread count, with 
only --large-thread-count we'd trade off in the opposite direction, and 
with both passed we'd use the 128-bit struct representation to eliminate 
the trade-off at the expense of some performance.
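
The selection logic proposed above amounts to a two-flag decision table. The sketch below spells it out in C purely as an illustration: it is not how upcc or multiconf is implemented, the enum and function names are hypothetical, and --large-block-size and --large-thread-count are only the flag names suggested in this message, not existing upcc options.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The four proposed runtime variants (each would exist in a debug and an
 * optimized build, giving 8 in total). Names are hypothetical. */
typedef enum {
    BUILD_DEFAULT,       /* packed 64-bit, default field widths */
    BUILD_LARGE_BLOCK,   /* packed 64-bit, larger max block size,
                            smaller max thread count */
    BUILD_LARGE_THREADS, /* packed 64-bit, larger max thread count,
                            smaller max block size */
    BUILD_STRUCT         /* 128-bit struct representation */
} build_variant;

/* Map the presence or absence of the two flags to a variant. */
static build_variant select_variant(bool large_block, bool large_threads)
{
    if (large_block && large_threads) return BUILD_STRUCT;
    if (large_block)                  return BUILD_LARGE_BLOCK;
    if (large_threads)                return BUILD_LARGE_THREADS;
    return BUILD_DEFAULT;
}

/* Scan an argv-style array for the two suggested flags. */
static build_variant variant_from_args(int argc, const char *const argv[])
{
    bool large_block = false, large_threads = false;

    assert(argc == 0 || argv != NULL);
    for (int i = 0; i < argc; i++) {
        if (strcmp(argv[i], "--large-block-size") == 0)
            large_block = true;
        else if (strcmp(argv[i], "--large-thread-count") == 0)
            large_threads = true;
    }
    return select_variant(large_block, large_threads);
}
```

With this table a user who needs neither extreme keeps the faster default representation, and only those who pass both flags pay the cost of the struct representation.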

Not sure who is responsible for what at TACC, so feel free to forward 
the suggested build idea to Victor, Bill or Jim as appropriate.

-Paul

PS
I've set the Reply-To to upc-devel_at_lbl_dot_gov with the expectation that 
we'll be discussing the TACC Ranger builds.
If instead you want to discuss UPC_MAX_BLOCK_SIZE some more, feel free 
to reply to upc-users_at_lbl_dot_gov instead.

Yaakoub El Khamra wrote:
>
> Greetings
> I am starting to prototype a causal sets code with UPC and I am 
> running into the following error: "Maximum block size in this 
> implementation is 1048576". I am using the -O -opt options and this is 
> on ranger.
>
> The message is obvious and decreasing the size of the block does get 
> things working again. However I am wondering if there are any 
> references I can read about the block size limit. Any recommendations 
> or suggestions?
>
> Regards
> Yaakoub El Khamra
>

-- 
Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
Future Technologies Group                 Tel: +1-510-495-2352
HPC Research Department                   Fax: +1-510-486-6900
Lawrence Berkeley National Laboratory
