Solving the N+1 Problem; or, “A Stitch In Time Saves Nine”

My article for this year’s PHP Advent is up. In it, I show one way to deal with the N+1 problem in your PHP code, and present a reusable package to help with it. Since PHP Advent doesn’t support comments, please leave your comments about the article below.

UPDATE (08 Dec 2011): Per request, here’s a link to the Aura.Marshal package.

Are you stuck with a legacy PHP application? Subscribe to "Modernizing Legacy Applications in PHP" for tips, tools, and techniques that can help you improve your codebase and your work life!

9 thoughts on “Solving the N+1 Problem; or, “A Stitch In Time Saves Nine”

    • Hi Ondrej — The only “problem” with NotORM, in relation to the article’s conclusion, is that it does queries for you (as with most other ORMs). The point of Aura.Marshal is that it *doesn’t* do those for you; it leaves you as the developer, for good or bad, in complete control of the queries that fill the types.

  1. Hi Paul,

    I enjoyed your article, and it makes sense. However there’s a possible issue that I’ve run into recently when developing with PHP & MySQL. In my app it seems that MySQL doesn’t optimize IN() queries very well – I ended up in a full-table scan when using one recently, (I’m not sure about other DBs), so it might actually be faster to run several queries, especially if the latency between the app server and the DB server is low, or if we can find another way to select the second set of data, (which is what I ended up doing in my situation).

  2. Hi John B:

    First off, I am not a DBA, so you should consult one regarding optimization. Having said that, two things to check:

    1. Since you used the phrase “table scan” my guess is that you have already done this, but it bears mentioning: make sure you have an index on the field you’re querying on.

    2. The table size matters to the optimizer. For tables with a certain number of rows or fewer, the optimizer figures that the time to look through an index and then correlate back to the noted rows is more expensive than simply scanning the table. It is possible this is the case in your situation.

    Regarding multiple queries vs. a table scan: I find it difficult to believe that issuing (in terms of the article) 20 or 30 queries would be more performant than issuing 1 query that uses a table scan. But, “what I believe” and “what is actually true” should be subject to a test of some sort. That’s the sort of thing you would want to benchmark. (When benchmarking, be sure to reduce variability, and to bench against a system that closely resembles production, including the volume of data in production.)

  3. In the system I was testing on I was using the IN() operator with a list of primary key values, (the PK is just an autoincrement column), so there was an index. The table was ~160k rows, and there was a join of some sort involved, (although the where condition didn’t use anything from the joins, it only had the IN(pk list)).

    I had an unused column available so I just set it to greater than 0 on the rows I wanted and changed my WHERE to WHERE unusedcol > 0. That got me more speed, strangely enough.

    As you say, benchmarking, (or at least some real-world testing), is the way to go. You never know when something weird’s going to happen in the database engine, and every production environment is different. I just thought I would share my experience, since it’s not what would be expected, (if it was expected, I would never have written the query with an IN() in the first place!).

  4. @John B – what version of MySQL are you using? You might want to try the latest Percona Server, a performance-patched MySQL 5.5 maintained by Percona (http://percona.com). Some of the older MySQL versions have some real quirks regarding query optimizations.

  5. @Paul – great article! One suggestion, because I’m lazy, would be to link to your marshalling class from this page also so I don’t have to go back to the PHP Advent article and find the inline link…

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>