Re: Expense of BUPC timer functions

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Tue Mar 23 2010 - 00:48:26 PDT

  • Next message: Nikita Andreev: "RE: Expense of BUPC timer functions"
    Hello, Nikita.
    The fact that you are using a virtual machine is probably not perturbing 
    the cost of bupc_ticks_now(), at least not to the extent you report 
    seeing. Assuming a modern AMD or Intel CPU, this function is using the 
    RDTSC instruction on most OSes, which should be quite cheap.
    The observation that the first call to bupc_ticks_to_ns() is more 
    expensive than later calls is to be expected. The first instance parses 
    /proc/cpuinfo to get the clock rate and stores the value for reuse in 
    subsequent calls.
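
A minimal sketch of that compute-once-then-cache pattern, in plain C. This is
not the Berkeley UPC source: example_ticks_to_ns(), read_cpu_mhz(), init_calls
and the FALLBACK_MHZ value are all hypothetical names made up for illustration,
and the sketch is not thread-safe.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical fallback used when /proc/cpuinfo has no "cpu MHz" line. */
#define FALLBACK_MHZ 1000.0

static int    init_calls  = 0;    /* how many times the slow path has run */
static double ns_per_tick = 0.0;  /* 0.0 means "not initialized yet" */

/* Slow path: scan /proc/cpuinfo for the first "cpu MHz" entry. */
static double read_cpu_mhz(void)
{
    double mhz = FALLBACK_MHZ;
    char line[256];
    FILE *f = fopen("/proc/cpuinfo", "r");

    if (f != NULL) {
        while (fgets(line, sizeof line, f) != NULL) {
            double v;
            if (sscanf(line, "cpu MHz : %lf", &v) == 1 && v > 0.0) {
                mhz = v;
                break;
            }
        }
        fclose(f);
    }
    return mhz;
}

/* Convert a tick count to nanoseconds; only the first call pays for the
 * file parse, later calls reuse the cached conversion factor. */
uint64_t example_ticks_to_ns(uint64_t ticks)
{
    if (ns_per_tick == 0.0) {
        init_calls++;
        ns_per_tick = 1000.0 / read_cpu_mhz();  /* 1e9 ns/s / (MHz * 1e6) */
        assert(ns_per_tick > 0.0);
    }
    return (uint64_t)((double)ticks * ns_per_tick);
}
```

With this structure the first conversion includes a file open and parse, while
every later conversion is a single multiply, which would match the pattern of
one slow call followed by fast ones.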
    Running on a 2.3GHz Intel Xeon E5410 from a Xen Dom0 kernel, the output 
    from your attached program is
    Get 1 tick: 33ns. 1 convert: 2ns
    I tried on a Xen HVM DomU running on the same machine:
    Get 1 tick: 34ns. 1 convert: 2ns
    And on a Xen PV DomU on the same machine:
    Get 1 tick: 33ns. 1 convert: 2ns
    So virtualization is probably not a significant factor, at least under Xen.
    On an older 2.2GHz Opteron which is not running Xen I see
    Get 1 tick: 6ns. 1 convert: 2ns
    And an old 2.8GHz Pentium-4 yields
    Get 1 tick: 82ns. 1 convert: 4ns
    So, there can be significant variation among platforms, and I don't know 
    if the 33-vs-6 difference is Xen related or not.
    So, I will agree with you that the relative costs of query and conversion 
    are not ordered as the documentation suggests.
    However, I can't reproduce the 2384ns query overhead.
    If you can tell me more about the platform you are running on perhaps I 
    could understand the extraordinarily high query cost you report.
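
For comparing platforms, a rough sketch of the kind of loop that produces a
per-call figure like the ones above. It is not the attached test program and
does not use the BUPC timers: now_ns() is a hypothetical stand-in built on
POSIX clock_gettime(), and avg_query_cost_ns() is likewise an illustrative
name. The loop overhead is included in the result.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime() under strict -std modes */
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Stand-in for a tick query: monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Average cost, in ns, of one now_ns() call over `iters` back-to-back calls. */
double avg_query_cost_ns(long iters)
{
    volatile uint64_t sink = 0;  /* keeps the compiler from dropping the calls */
    uint64_t start, end;
    long i;

    assert(iters > 0);
    start = now_ns();
    for (i = 0; i < iters; i++) {
        sink = now_ns();
    }
    end = now_ns();
    (void)sink;
    return (double)(end - start) / (double)iters;
}
```

Calling avg_query_cost_ns(1000000) gives an average over one million queries.
Timing the very first call separately from the rest would also show whether a
one-time setup cost is inflating the average.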
    Nikita Andreev wrote:
    > Hello Paul and all,
    > I'm measuring the overhead of bupc_ticks_now() and bupc_ticks_to_ns() 
    > and results doesn't look like I expected. Find the test attached.
    > I've made 1 million iterations and have got the following:
    > bupc_ticks_now: 2383ns
    > bupc_ticks_to_ns: 4ns
    > So the conversion is made lot faster than query. But documentation 
    > says: The bupc_ticks_to_{us,ns}() conversion calls can be 
    > significantly more expensive than the bupc_ticks_now() tick query.
    > What I also noticed is that first bupc_ticks_to_ns call in a loop is 
    > very slow. It can take even 1,5 milliseconds. And almost all others 
    > are very fast.
    > What am I missing here?
    > P.S. I've performed this test on a virtual machine if it makes any 
    > difference.
    > Regards,
    > Nikita Andreev
    Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    Future Technologies Group                 Tel: +1-510-495-2352
    HPC Research Department                   Fax: +1-510-486-6900
    Lawrence Berkeley National Laboratory     
