A Bit About Benchmarks

As the author of a relatively popular benchmarking article, I feel compelled to respond to this bit of misguided analysis from the Symfony camp about benchmarks.

Full disclosure: I am the lead developer on the Solar framework, and was a founding contributor to the Zend framework.

M. Zaninotto sets up a number of straw-man arguments regarding comparative benchmarks in general, although he does not link to any specific research. In doing so, he misses the point of comparative benchmarking almost entirely. Herein I will address some of M. Zaninotto’s arguments individually in reference to my previous benchmarking series.

All of the following commentary regards benchmarking and its usefulness in decision-making, and should not be construed as a general-purpose endorsement or indictment of any particular framework. Some frameworks are slower than others, and some are faster, and I think knowing “how fast is the framework?” is an important consideration when allocating scarce resources like time, money, servers, etc.

And now, on to a point-by-point response!

Symfony is not slow in the meaning of “not optimized”

But it *is* slow in the meaning of “relative to other frameworks.”

Regarding the title of M. Zaninotto’s article, I don’t know of any reputable benchmark projects that conclude Symfony is “too slow for real-world usage” in general. (Perhaps M. Zaninotto would link to such a statement?) Of course, the definition of “real-world” is subjective; the requirements of some applications are not necessarily the same as others.

What is not subjective is the responsiveness of Symfony when compared to other frameworks in a controlled scenario: for a target of 500 req/sec, you are likely to need more servers to balance across with Symfony than with Cake, Solar or Zend. This is implied by my earlier benchmarking article.

If some benchmarks show that symfony is slower, jumping to the conclusion that symfony is not optimized is a big mistake.

I don’t know of any comparative benchmark research that concludes “Symfony is not optimized.” M. Zaninotto is arguing against a point that no benchmark project seems to be making. (Note that the benchmarks I generated explicitly attempt to use each framework in its most-optimized dynamic state, including opcode caching. You can even download the source of the benchmarking code to see what the optimizations are.)

I’d say that people who take this shortcut are either way smarter than us, or they don’t know what profiling is, they didn’t look at the symfony code, and they can’t make the difference between efficient code and a bottle of beer.

Profiling to optimize a subset of code lines is not the same as benchmarking the responsiveness of a dynamic controller/action/view dispatch sequence. (The combined speed of all the code blocks is taken into account by the nature of the benchmark.)

So for instance, you will not find this code in symfony:

for ($i = 0; $i<count($my_array); $i++)

instead, we try to always do:

for ($i = 0, $count = count($my_array); $i<$count; $i++)

This is because we know it makes a difference.

How do we know it? Because we measured. We do use profiling tools a lot, on our own applications as well as on symfony itself. In fact, if you look at the symfony code, you can see that there are numerous optimizations already in place.

I agree that the second code-block is much better than the first speedwise. (N.b.: the first one calls count() on each cycle of the loop, whereas the second one calls count() only once.)
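The difference is easy to demonstrate with a quick micro-benchmark (a sketch of my own, not code from any framework; the array size is arbitrary):

```php
<?php
// Sample data; the size is arbitrary, chosen only for illustration.
$my_array = range(1, 1000);

// Form 1: count() is re-evaluated on every pass through the loop.
$start = microtime(true);
$sum1 = 0;
for ($i = 0; $i < count($my_array); $i++) {
    $sum1 += $my_array[$i];
}
$time1 = microtime(true) - $start;

// Form 2: count() is evaluated once, before the loop begins.
$start = microtime(true);
$sum2 = 0;
for ($i = 0, $count = count($my_array); $i < $count; $i++) {
    $sum2 += $my_array[$i];
}
$time2 = microtime(true) - $start;

// Both forms visit exactly the same elements; only the per-iteration
// overhead differs, which is what the timings reveal.
printf("re-counting: %.6fs  hoisted: %.6fs\n", $time1, $time2);
```

On a single run over 1,000 elements the difference is tiny; it only matters in aggregate, which is exactly the point about whole-system speed that follows.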

But if that faster piece is called only once or twice, and another, much slower piece is called two or three times, the overall effect is still to slow down the system as a whole. Optimizing individual blocks of code does not necessarily result in a fast system overall.

And if you use a profiling tool yourself on a symfony application, you will probably see that there is no way to significantly optimize symfony without cutting through its features.

… at least, not without rewriting the system as a whole using a different and more-responsive architecture.

Of course, there might still be a lot of small optimizations possible here and there.

I think one would need a lot of “small optimizations” to make the 41-percentage-point gain necessary to equal the next-fastest dispatch cycle, that of Cake (per my benchmarking article; your mileage may vary).

Symfony results from a vision of what the perfect tool for developers would be, based on our experience. For instance, we decided that output escaping should be included by default, and that configuration should be written preferably in YAML. This is because output escaping protects applications from cross-site scripting (XSS) attacks, and because YAML files are much easier to read and write than XML. I could name similar arguments about security, validation, multiple environments, and all the other features of symfony. We didn’t add them to symfony because it was fun. We added them because you need them in almost every web application.

I don’t see how this is different from how Cake, Solar, or Zend approached their development process. Each of those frameworks has output escaping, configuration (either by YAML or by much-faster native PHP arrays), security, validation, multiple environment support, etc. (Those frameworks still perform a dynamic controller/action/view dispatch faster than Symfony does.)
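The “much-faster native PHP arrays” configuration style mentioned above can be sketched like this (the file name and contents are my own invented example, not any framework’s actual config layout):

```php
<?php
// Write the configuration once as executable PHP via var_export();
// a plain include then returns the array, and an opcode cache can
// keep the compiled file in memory across requests.
$config = ['db_host' => 'localhost', 'debug' => false];

$file = sys_get_temp_dir() . '/config.cache.php';
file_put_contents($file, "<?php return " . var_export($config, true) . ";");

// Reading the configuration back is a single include.
$loaded = include $file;
var_dump($loaded === $config); // the round trip is lossless
```

No YAML parsing happens at request time; the cost of reading the configuration is the cost of an include, which opcode caching reduces further.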

It is very easy to add a new server to boost the performance of a website; it is very hard to add a new developer to an existing project to make it complete faster.

Benchmarking the speed of a “Hello, world” script makes little sense

M. Zaninotto completely misses the point here.

At least for my own benchmarking series, the purpose is not merely to say “this one is faster!” but to ask: given that you can only get so much responsiveness from any particular framework, how does each compare to the others?

A “hello world” application is the simplest possible thing you can do with a dynamic controller/action/view dispatch, and so it marks the most-responsive point of the framework. Your application code cannot get faster than the framework it’s based on, and the “hello world” app tells you how fast the framework is.

Based on that information, you can get an idea how many servers you will need to handle a particular requests-per-second load. Based on my benchmarking, you are likely to need more servers with a Symfony-based app than with a comparable application in Cake, Solar, or Zend. This is about resource usage prediction, not speed for its own sake.
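The server-count arithmetic behind that prediction is simple; here is a sketch (the req/sec figures are invented for illustration and are not the numbers from my benchmark report):

```php
<?php
// Given a measured "hello world" ceiling for one server running a
// framework, estimate the servers needed to meet a target load.
function estimate_servers(float $target_rps, float $per_server_rps): int
{
    return (int) ceil($target_rps / $per_server_rps);
}

// Invented ceilings for two hypothetical frameworks at a 500 req/sec target:
echo estimate_servers(500, 250), "\n"; // faster framework: 2 servers
echo estimate_servers(500, 80), "\n";  // slower framework: 7 servers
```

Real capacity planning also has to account for headroom, database time, and so on; this only shows why the framework’s responsiveness ceiling feeds directly into hardware counts.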

Using plain PHP will make your application way faster than using a framework. Nevertheless, none of the framework benchmarks actually compare frameworks to a naked language.

Incorrect; my benchmarking series specifically compares all the frameworks to a plain PHP “echo ‘hello world’” so you can see what the responsiveness limits are for PHP itself. I also compare the responsiveness of serving a plain-text ‘hello world’ file without PHP, to see what the limits are for the web server. These numbers become important for caching static and semi-static pages.

… none of the framework benchmarks actually compare frameworks to a naked language. This is because it doesn’t make sense.

Incorrect again. It does make sense to do so, because you can use a framework to cache a static or semi-static page. Caching lets you avoid the dynamic controller/action/view dispatch cycle and improve responsiveness dramatically. However, if your requests-per-second requirements are higher even than that provided by caching, you’ll still need more servers to handle the load. Again, this is about resource usage, not speed per se.
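A full-page cache of the kind described here can be sketched in a few lines (the cache path, lifetime, and the echo standing in for the dispatch cycle are all my own placeholders, not any framework’s API):

```php
<?php
// Serve a cached copy of the page if one is fresh; otherwise run the
// full dispatch, capture its output, and store it for the next request.
$cacheFile = sys_get_temp_dir() . '/page.hello.cache';
$lifetime  = 60; // seconds; an arbitrary choice for this sketch

if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $lifetime) {
    // Cache hit: the controller/action/view cycle is skipped entirely.
    $page = file_get_contents($cacheFile);
} else {
    // Cache miss: buffer the dynamically generated page ...
    ob_start();
    echo "hello world"; // stand-in for the real dispatch cycle
    $page = ob_get_clean();

    // ... and store it so the next request can skip the dispatch.
    file_put_contents($cacheFile, $page);
}

echo $page;
```

Either branch produces the same page; what changes is how much of the framework runs per request, which is exactly the responsiveness gain that caching buys.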

If frameworks exist, it is not for the purpose of speed, it is for ease of programming, and to decrease the cost of a line of code. This cost not only consists of the time to write it, but also the time to test it, to refactor it, to deploy it, to host it, and to maintain it over several years.

Ease of programming is a valid concern … and so is resource usage. If you can get comparable ease-of-use in a different framework, and it’s also more responsive, it would seem to make sense to use the less resource-intensive one. (Of course, measuring ease-of-use and programmer productivity is much harder than measuring responsiveness – the plural of “anecdote” is not “data”.) 😉

It doesn’t make much more sense to compare frameworks that don’t propose the same number of features. The next time you see symfony compared with another framework on a “hello, world”, try to check if the other framework has i18n, output escaping, Ajax helpers, validation, and ORM turned on. It will probably not be the case, so it’s like comparing pears and apples.

I completely agree: one must compare like with like. And my benchmarking series attempts exactly that: all features that can be turned off are turned off: no ORM, no helpers, no validation, etc. Only the speed of the controller/action/view dispatch cycle is benchmarked, and Symfony still came out as the least-responsive with all those features turned off.

Also, how often do you see pages displaying “hello, world” in real life web applications? I never saw one. Web applications nowadays rely on very dynamic pages, with a large amount of code dedicated to the hip features dealing with communities, mashups, Ajax and rich UI. Surprisingly, such pages are never tested in framework benchmarks. And besides, even if you had a rough idea of the difference in performance between two frameworks on a complex page, you would have to balance this result with the time necessary to develop this very page with each framework.

M. Zaninotto is again missing the point; the idea is not to generate “hello world” but to see what the fastest response time for the framework is. You can’t do much less than “hello world”, so generating that kind of page measures the responsiveness of the framework itself, not the application built on top of the framework.

In a way, the above is M. Zaninotto’s strongest point. Any Ajax, rich UI, and other features you add after “hello world” will only reduce responsiveness, but it is difficult to measure how much they reduce responsiveness in a controlled manner (especially when comparing frameworks). It may be that some frameworks will degrade at a faster rate than others as these features are added. Having said that, Symfony starts at a much lower point on the responsiveness scale than other frameworks, so it doesn’t have as much leeway as other frameworks do.

The speed of a framework is not the most important argument

While not the most important argument, it is *an* important argument. And it is one we can reliably measure if we are careful – at least in comparison to other frameworks. Ignoring it is to ignore one of many important considerations.

And between two framework alternatives with comparable speed, a company will look at other factors to make a good decision.

Agreed – when the speeds are comparable, other factors will have stronger weight. This was the point of benchmarking a “hello world” implementation: to compare speed/responsiveness in a controlled fashion.

And if you need a second opinion, because you can’t believe what the creator of a framework says about his own framework, perhaps you could listen to other companies who did choose symfony. Yahoo! picked symfony for a 20-million-user application, and I bet they measured the speed of a “hello, world” and other factors before making that decision. Many other large companies picked the symfony framework for applications that are crucial to their business, and they continue to trust us.

M. Zaninotto “bets” they measured it, but does not say “they did” measure it. I would be interested to hear what Yahoo themselves have to say about that experience. All public references to this seem to be from Symfony developers and user sites, not the Yahoo project team. (Yahoo folks, please feel free to contact me directly, pmjones88 -at- gmail -dot- com, or to leave a comment on this page.)

This page from the Symfony developers says that documentation, not speed, was Yahoo’s “first reason” to choose Symfony. It also says that Yahoo “extended and modified symfony to fit their needs,” which is plenty possible with Cake, Solar, and Zend.

Perhaps this is an example of a developer at Yahoo who used Symfony not because he compared it to other frameworks, but because he was already familiar with it or liked the way it looked. That would be perfectly fair, I think; we all pick what we like and then try to popularize it. But did Yahoo actually do a cost-benefit study (or even a simple “hello world” implementation comparison)?

While we’re at it, how much hardware does it take for Yahoo to serve up the bookmarks application? Yahoo can afford to throw more servers at an application than most of us can; for the rest of us, a framework with better responsiveness (and thus needing fewer servers to balance across) is sure to be an important factor.

Are you stuck with a legacy PHP application? You should buy my book because it gives you a step-by-step guide to improving your codebase, all while keeping it running the whole time.

17 thoughts on “A Bit About Benchmarks”

  1. Well Paul, maybe you apply his blog post a bit too strongly against your post. It doesn’t seem that he links to any particular benchmark he talks about. Then again, he also uses a lot of language like “none”, “any”, etc. when talking about what benchmarks do and don’t do.

    At any rate, yes most frameworks will come with out of the box tools to generate static pages, partially cache things etc. They will also come with all sorts of goodies to make development easier. More importantly they will make different assumptions about what people will need as the lowest common denominator and as such they will build their request stack differently to tailor to these assumptions. And some will be “more” right about these assumptions for some kind of projects and developers.

    In the end, the problem with all of these benchmarks is that none of them are able to compare the full development and production metrics of a real-world application, because it’s simply impossible. As such, a “hello world” benchmark may be used as an indicator, but I fear it’s quite useless for 99% of the people out there. I am sure I could make the “hello world” example perform significantly faster on any of the tested frameworks if you give me a day or two on the source. This is a bold statement which I am not backing up and currently do not have plans to back up with facts.

    I remember the post Terry linked to, from one of the Twitter developers, which said that in order to get RoR to scale they had to stop using most of what attracts people to RoR. Likewise, I have stopped using nice features in frameworks when I began noticing performance issues. Sometimes it was painful, sometimes it was easy to do. This is the reality of what people do when they need to get high performance from an off-the-shelf framework, and this is what we can all do if we use an open source framework. We can rip out part of the guts and replace them with code that follows our assumptions, rather than those of the original lead developers. How painful or easy this is, is again something a benchmark will not tell you.

    So what is the 1% of usefulness I see in Paul’s benchmark? It’s a starting point for doing your own benchmarks. Actually, I guess I am cutting it short with 1%, for the sole reason that I think it was a bad idea to present it as something other than a starting point for doing your own benchmarks. I am sure Paul can show me various disclaimers on his post where he said exactly this, but the fact is that the community at large saw in this the answer to the question “which framework is faster?”

  2. I’m not sure if a hello world test provides a useful comparison. Let’s say foo is 100 and bar is 90. If you draw a graph showing foo and bar, and set the baseline to 80, it appears that foo is twice as fast as bar but if you set the baseline to zero it’s easier to see that the difference is a much less dramatic 10%. A “normal” page with logic and database queries would show much smaller differences between the frameworks and if none of the frameworks consume a significant portion of total resources these differences won’t matter.

  3. I still prefer benchmarking my coding time over a “Hello World!” app’s performance, but it’s just a question of point of view.

  4. Yes.. it is optimized, but not optimized enough.
    For example, this code

    for ($i = 0, $count = count($my_array); $i < $count; $i++)

    can be

    $count = count($my_array);
    while ($count--)

    most people don't like while loops through non-assoc arrays... so use for if you want..

    an even faster one would be to loop the array the other way around, since it eliminates a comparison:

    for ($i = count($my_array); $i--; )

  5. While we are on the topic of benchmarking, I noticed your comment about PHP arrays being even faster. I recently did microbenchmarks and I found some very surprising results:

    They are so surprising, that I am hoping that some other people would run them and compare them against my numbers and also just look over the code to ensure I am not drawing incorrect conclusions.

  6. Hi Lukas — in your microbenches, I don’t see the equivalent of using “$config = include ‘config.php’;” where config.php is “return array(‘foo’ => ‘bar’);”. No serializing or var_export(), just a plain PHP array being returned by an include. Would that be appropriate to have in the list there?

  7. Well I measure the writing and reading separately. So the reading case for var_export() is exactly the classic “PHP arrays in a file that are read via include” case.

    Here is the code, as you can see I repeat the reads multiple times to illustrate the fact that you will probably do way more reads than writes (in my benchmark run it was 1:1000):

    file_put_contents('tmp3.cache', "<?php return " . var_export($data, true) . ";");

  8. Hi all..

    I pretty much have to agree with Lukas.
    I don’t really understand the intention behind a “hello world” test with fully-featured application frameworks.

    Frameworks are only useful for applications which do a little more than printing out static pages. And this was pretty much M. Zaninotto’s point – that no one should judge a framework based on a “hello world” benchmark – and there I absolutely agree.

  9. If speed were the key to adoption and success, Struts would be a UFO – as would all of the most-used frameworks in the IT world.

  10. A “hello world” benchmark could show us the overhead of a framework.

    That overhead is not worth it for every project.

  11. Even though Lukas may be right that the framework communities took this as a statement about which framework is faster in general, that is still no excuse for the fact that they have missed the point of this benchmark.

    Paul didn’t create a starting point for benchmarks (even though that may be a side effect). He benchmarked the current state of framework request dispatch responsiveness at that time. That’s all there is to it.

    There’s no need to understand the “point” of benchmarking a hello world app — because he didn’t benchmark a hello world app!

  12. @Andreas..

    Question: How did Paul benchmark the “current state of framework request dispatch responsiveness”?
    Answer: With a “hello world”-app.

    It is still a test of the response time of the framework serving a single static page. Yes, it shows that one framework serves a static page faster than another one. But my question still is: so what? No one uses a framework to serve a single static page. If you want to test frameworks, use a “real app” with database connections and.. and.. and..

    The reason I think such benchmarks are a little “dangerous” is this: many less-experienced developers read such benchmarks and think… “hey, this one is the fastest – cool, it’s the best”. (Just read the replies to the numerous benchmarks on the web.)
    I don’t want to prevent anyone from making decisions based on such benchmarks – but I want to tell them that this would be a mistake.

  13. @samoht:

    Your question: so what?
    The answer: Solar is more responsive than the other frameworks as measured in this benchmark.

    Measuring a complete application (which would be different from measuring a framework) would be impossible and would yield incomparable results, especially since Solar does not endorse/force any specific scheme for database access yet (which for most frameworks is the bottleneck in an application).

    One approach to measuring theoretical “real world” performance would be to split out the ORM systems and compare those, split out the template/view systems and compare those, and (guess what’s next) benchmark the bare-bones frameworks (request handling and dispatching) themselves, like Paul did. The sum of these could possibly indicate which framework would be faster. And that’s not even considering scaling issues!

    If someone demands a responsive framework, the best choice based on Paul’s benchmark would be Solar. I don’t see anything wrong with that. However, basing the choice of framework only on a single narrow benchmark would be foolish, and the responsibility for that couldn’t possibly be attributed to Paul for publishing his benchmark results.

    Dangerous benchmarks? Nah, only dangerous decision makers.
