Easy Benchmarking with Solar

By pmjones | July 12, 2006

Comparative benchmarking is tedious work. It’s not hard to do; it’s just no fun to set up the same scaffold every time you want to figure out how long it takes different pieces of code to execute.

To do a benchmark, essentially you need only four things: a timing mechanism, a loop to run the first case, a loop to run the second case, and a way to show the time taken by each loop.


$loopTimes = 1000;

$before = microtime(true);
for ($i = 0; $i < $loopTimes; ++$i) {
    // case #1:
    // this code gets run $loopTimes times
}
$after = microtime(true);
$case1 = $after - $before;

$before = microtime(true);
for ($i = 0; $i < $loopTimes; ++$i) {
    // case #2:
    // this code gets run $loopTimes times
}
$after = microtime(true);
$case2 = $after - $before;

echo "Case 1 ran $loopTimes in $case1 seconds";
echo "Case 2 ran $loopTimes in $case2 seconds";

This is boring, and repetitive, and essentially the same every time (except for the code cases being benchmarked). This makes it a classic candidate for refactoring to a PHP class.

Enter Solar_Test_Bench. As with Solar_Test, you extend the base class, add your own code cases as bench*() methods, and a runner will execute each of those methods the same number of times and display how long each one took. (It uses Solar_Debug_Timer for the timing mechanism.)
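
The runner itself is essentially the scaffold from above wrapped around a loop over the class’s bench*() methods. Here is a minimal sketch of the idea; the SimpleBenchRunner class below is purely illustrative and is not how Solar_Test_Bench is actually implemented.

class SimpleBenchRunner {

    // run every bench*() method on $bench the same number of times,
    // and return the elapsed seconds for each one
    public function run($bench, $loopTimes = 1000)
    {
        $report = array();
        foreach (get_class_methods($bench) as $method) {
            if (substr($method, 0, 5) != 'bench') {
                continue;
            }
            $before = microtime(true);
            for ($i = 0; $i < $loopTimes; ++$i) {
                $bench->$method();
            }
            $report[$method] = microtime(true) - $before;
        }
        return $report;
    }
}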

Here’s a real benchmark from Solar, one that tests the speed difference between require_once() and Solar::loadClass().


Solar::loadClass('Solar_Test_Bench');
class Bench_Solar_LoadClass extends Solar_Test_Bench {

    public $file = 'Solar/Sql.php';
    public $class = 'Solar_Sql';

    public function benchLoadClass()
    {
        Solar::loadClass($this->class);
    }

    public function benchRequireOnce()
    {
        require_once $this->file;
    }
}

To run it, we call the benchmark runner to execute each method in the class 1000 times, and get this output:


bash-2.05$ php bench.php Solar_LoadClass 1000
name             : diff     : total
__start          : 0.000000 : 0.000000
benchLoadClass   : 0.005793 : 0.005793
benchRequireOnce : 0.034572 : 0.040365
__stop           : 0.000012 : 0.040377
bash-2.05$

You would want to run it a few times to get a better idea of the real speed, but the above is a representative result from my MacBook 2.0GHz. Looking at the “diff” column we can see that Solar::loadClass() is about 5 times faster than using require_once(). (Thanks to Clay Loveless for thinking up this particular benchmark.)

Here’s another one:

Currently, Solar::fileExists() uses an fopen() call to see if a file exists on the include_path. (The regular file_exists() and is_readable() PHP calls do not check the include_path.) However, Clay Loveless wondered if it would be faster to explode() the include_path and call file_exists() on each of those paths individually.
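
To make the comparison concrete, here is a rough sketch of the two strategies; the function names below are mine for illustration, and the real code lives in Solar::fileExists() and the benchmark class.

// Strategy 1 (current): let fopen() search the include_path for us.
function fileExistsViaFopen($file)
{
    $fp = @fopen($file, 'r', true); // third argument: use the include_path
    if ($fp) {
        fclose($fp);
        return true;
    }
    return false;
}

// Strategy 2 (proposed): explode the include_path and check
// file_exists() against each directory in turn.
function fileExistsViaIncludePath($file)
{
    $paths = explode(PATH_SEPARATOR, ini_get('include_path'));
    foreach ($paths as $path) {
        if (file_exists($path . DIRECTORY_SEPARATOR . $file)) {
            return true;
        }
    }
    return false;
}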

I won’t copy the benchmarking class here; you can see it at Bench_Solar_FileExists. We actually test three different cases: the current fopen() approach, and two new variations that check the include_path directly. The results are:


bash-2.05$ php bench.php Solar_FileExists 1000
name                   : diff     : total
__start                : 0.000000 : 0.000000
benchFopen             : 0.040105 : 0.040105
benchExplodeInclPath   : 0.015582 : 0.055687
benchNoExplodeInclPath : 0.020612 : 0.076299
__stop                 : 0.000004 : 0.076303
bash-2.05$

So the second case, where we explode() the include_path and then check file_exists() on each of the resulting paths, is about 3 times faster than using fopen(). The next release of Solar will use that code instead of the current fopen() code as a result.

(Note: the benchmarking classes are in Subversion only at this point, but will be in the next release of Solar when it comes out.)

6 thoughts on “Easy Benchmarking with Solar”

  1. Pingback: PHPDeveloper.org

  2. jules

    The last items are interesting; well, all things about the include_path are interesting. What’s the difference between explode and noexplode, why isn’t the dot substituted with getcwd() and realpath(), and last but not least, how many paths were examined in the bench above?

  3. Jeff Moore

    I like your benchmarking class.

    You might consider benchmarking an empty method to measure the overhead involved with your loop and method call, then subtracting that from your times. This would give you a more accurate measurement of the relative time between the different techniques.
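
    A sketch of that baseline idea; the benchNull() case and the subtraction below are hypothetical, not part of the actual benchmark class:

    class Bench_Solar_LoadClass_Baseline extends Solar_Test_Bench {

        // empty case: measures only the loop and method-call overhead
        public function benchNull()
        {
        }

        // ... benchLoadClass() and benchRequireOnce() as before ...
    }

    // after a run, subtract the baseline from each case, e.g.:
    // $netLoadClass = $times['benchLoadClass'] - $times['benchNull'];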

    Additionally, one has to be very careful drawing conclusions from micro benchmarks such as this.

    For example, the results of file_exists are cached internally. That means that the first call to the function may have a very different performance profile than the next 999 calls to the function in the benchmark. However, when executing a real program, are repeated calls to Solar::fileExists() on the same file likely? If so, how many are likely?

    For a cached function such as file_exists(), it’s easy to make a "warm up" call outside the loop in order to measure the subsequent portion of the performance curve. However, measuring the time of the first call may be harder to get within microtime() resolution.
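
    Using the plain scaffold from the top of the post, the warm-up idea is just one untimed call before the loop; a rough sketch:

    $file = '/etc/hosts';      // any readable file, purely for illustration
    $loopTimes = 1000;

    file_exists($file);        // warm-up call: primes the stat cache, not timed

    $before = microtime(true);
    for ($i = 0; $i < $loopTimes; ++$i) {
        file_exists($file);    // measures only the cached portion of the curve
    }
    $warm = microtime(true) - $before;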

  4. Jeff Moore

    Oh, I think you could take a real program and profile it with apd or Xdebug to get the actual number of calls to require_once(), and then subtract count(get_included_files()) to determine the number of "subsequent" calls to require_once(). Not sure that would work (I haven’t tried it), but it would be an interesting number to see for some real programs (say, WordPress).

  5. pmjones Post author

    Hi Jeff –

    Good call on the stat cache; I have added clearstatcache() to the top of the second and third cases to make sure file_exists() isn’t using the cache, and the results are very similar:

    
    name                   : diff     : total
    __start                : 0.000000 : 0.000000
    benchFopen             : 0.041378 : 0.041378
    benchExplodeInclPath   : 0.015690 : 0.057068
    benchNoExplodeInclPath : 0.020634 : 0.077702
    __stop                 : 0.000005 : 0.077707
    

    So barring other missed nuances, it appears the stat cache is not much of a factor when comparing fopen() to file_exists(), at least not in these tests.

  6. Lukas

    Interesting. My implementation in LiveUser and MDB2 initially used explode() and is_readable(). However, it caused issues:
    http://marc.theaimsgroup.com/?l=pear-dev&m=114148949106207&w=2

    Now in that thread people (maybe even I myself) claimed that fopen() would be faster. I think the idea was also to work around safe_mode issues. Not sure if using file_exists() instead of is_readable() solves this.

