9 thoughts on “Solving the N+1 Problem; or, "A Stitch In Time Saves Nine"

    1. pmjones Post author

      Hi Ondrej — The only “problem” with NotORM, in relation to the article’s conclusion, is that it does queries for you (as with most other ORMs). The point of Aura.Marshal is that it *doesn’t* do those for you; it leaves you as the developer, for good or bad, in complete control of the queries that fill the types.

      Reply
  1. John B

    Hi Paul,

    I enjoyed your article, and it makes sense. However there’s a possible issue that I’ve run into recently when developing with PHP & MySQL. In my app it seems that MySQL doesn’t optimize IN() queries very well – I ended up in a full-table scan when using one recently, (I’m not sure about other DBs), so it might actually be faster to run several queries, especially if the latency between the app server and the DB server is low, or if we can find another way to select the second set of data, (which is what I ended up doing in my situation).

    Reply
  2. pmjones Post author

    Hi John B:

    First off, I am not a DBA, so you should consult one regarding optimization. Having said that, two things to check:

    1. Since you used the phrase “table scan” my guess is that you have already done this, but it bears mentioning: make sure you have an index on the field you’re querying on.

    2. The table size matters to the optimizer. For tables with a certain number of rows or fewer, the optimizer figures that the time to look through an index and then correlate back to the noted rows is more expensive than simply scanning the table. It is possible this is the case in your situation.

    Regarding multiple queries vs. a table scan: I find it difficult to believe that issuing (in terms of the article) 20 or 30 queries would be more performant than issuing 1 query that uses a table scan. But, “what I believe” and “what is actually true” should be subject to a test of some sort. That’s the sort of thing you would want to benchmark. (When benchmarking, be sure to reduce variability, and to bench against a system that closely resembles production, including the volume of data in production.)

    Reply
  3. John B

    In the system I was testing on I was using the IN() operator with a list of primary key values, (the PK is just an autoincrement column), so there was an index. The table was ~160k rows, and there was a join of some sort involved, (although the where condition didn’t use anything from the joins, it only had the IN(pk list)).

    I had an unused column available so I just set it to greater than 0 on the rows I wanted and changed my WHERE to WHERE unusedcol > 0. That got me more speed, strangely enough.

    As you say, benchmarking, (or at least some real-world testing), is the way to go. You never know when something weird’s going to happen in the database engine, and every production environment is different. I just thought I would share my experience, since it’s not what would be expected, (if it was expected, I would never have written the query with an IN() in the first place!).

    Reply
  4. pmjones Post author

    “You never know when something weird’s going to happen in the database engine” — absolutely true.

    Reply
  5. Jonathon Hill

    @John B – what version of MySQL are you using? You might want to try the latest Percona Server, a performance-patched MySQL 5.5 maintained by Percona (http://percona.com). Some of the older MySQL versions have some real quirks regarding query optimizations.

    Reply
  6. Jonathon Hill

    @Paul – great article! One suggestion, because I’m lazy, would be to link to your marshalling class from this page also so I don’t have to go back to the PHP Advent article and find the inline link…

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *