Efficient use of mysqli_result::$num_rows

I frequently see this pattern in legacy applications using mysqli:

$result = $mysqli->query("
    SELECT *
    FROM table_name
    WHERE conditions = 'whatever'
");
if ($result && $result->num_rows > 0) {
    return;
}

The developer’s intent here is to see if there are any rows at all in the database that match a certain condition. He does so by issuing a query, then asking the result object how many rows it has. The developer doesn’t actually want any data from the result, and doesn’t care about the actual row-count itself; this is just a check to see if at least one row exists in the database.

This is a poor conservation of resources. The database does the work needed to select all the columns for all the rows matching the conditions, allocates memory for them, and returns them. But the developer discards all that immediately.

To accomplish the same end, it is less resource-intensive and just as effective to query for a single column and limit the results to a single row:

$result = $mysqli->query("
    SELECT col_name
    FROM table_name
    WHERE conditions = 'whatever'
    LIMIT 1
");
if ($result && $result->num_rows > 0) {
    return;
}

Now the database only does the work needed for a single column and a single row.

(As a side note, I find it interesting that I have not seen this pattern at all in projects using PDO. I’m not sure why this would be. Perhaps there is some originating example code for mysqli somewhere that has gained a life of its own through copying and reuse.)

UPDATE: Perhaps a better way to conserve resources, courtesy of Reddit user marcjschmidt, is to use a COUNT() in the query, then fetch the count of rows, something more like this …

$result = $mysqli->query("
    SELECT COUNT(*)
    FROM table_name
    WHERE conditions = 'whatever'
");
if ($result && $result->fetch_array()[0] > 0) {
    return;
}

… thereby avoiding the use of mysqli_result::$num_rows completely.

UPDATE 2: Lots of commentary in the above Reddit thread. To summarize this blog post: selecting all columns of all rows, then examining $num_rows, and then discarding the result set, is a terrible way of determining if there are any matching rows at all in the database. It is trivially easy to something much better, whether by using a LIMIT 1 and $num_rows, or some form of COUNT(), or perhaps some other approach.

Exporting Globals in PHP

I am currently modernizing a legacy PHP application for a client. (The codebase was written earlier this year, in fact; new code can be “legacy” from the outset.) The original developer pulled a dirty trick with global that I had not seen before, and I thought I had seen everything.

Legacy codebases often use global to import a variable into the local scope, usually a global function. For example, they might drag in a database connection:

<?php
// define $db in a config file somewhere
$db = new DatabaseConnection(...);

// this function uses the $db connection via global
function fetch_user_by_id($id)
{
    global $db;
    return $db->fetchAssoc("SELECT * FROM users WHERE id = ?", $id);
}
?>

I see that kind of thing all the time in legacy PHP. However, what I have not seen before is a function exporting a global.

Take a look at the following code. If the $bar variable is not already defined in the global scope, PHP will define it in the global scope for you automatically when you call foo().

<?php
error_reporting(E_ALL);

function foo()
{
    global $bar;
    $bar ++;
}

// $bar is not defined yet, so PHP will show an
// "undefined variable" notice
echo $bar. PHP_EOL;

// calling foo() defines $bar in the global scope,
// and increments it
foo();

// $bar is now available in the global scope, having
// been exported from function foo()
echo $bar. PHP_EOL;

The legacy developer did that because he wanted to keep the variable initialization outside of the global scope for some reason, even though he used the variable in the global scope elsewhere.

It is exceptionally difficult to track down where an exported global is coming from when refactoring a legacy application. If you must write legacy code using globals, initialize them in the global scope. Better yet, don’t use globals at all: pass values as function arguments, or use dependency injection techniques.

Why Do PHP Developers Think MVC Is An Application Architecture?

I’ve pointed out before that Model-View-Controller is a user interface pattern, not an application architecture. But why would PHP developers get the idea that MVC is an application architecture in the first place? (This may apply to all server-side developers, not just PHP folks.)

I used to think that MVC was an application architecture. Even after reading Fowler’s POEAA and seeing that MVC was for the user interface, I figured that meant I was doing “user interface applications.” But that was not quite right; it would have been more accurate to say that I had been mixing the concerns of user interface with the underlying core application.

Based on that experience, I think the reason MVC is mistaken for an application architecture is because PHP developers generally start with page scripts. In our first page scripts, we combine all the concerns in a single ball of mud: SQL queries are intermingled with HTML, and the business logic is scattered throughout. As far as we are concerned, the page script itself is “the application.” Later, as more functionality is required, we add another page script, and another, and another. We continue to see that collection of page scripts as “the application.”

Then, as we progress in our profession, and begin to learn how to better organize our work, we start separating the different concerns of “the application.” When we begin separating concerns, we separate them out of the page script, which we see as “the application.” Extracting the view concerns from the page script means extracting them from “the application.” Separating the model from the page script into its own layer means separating it from “the application.” Pulling the controller logic out of the page script means pulling it out of “the application.” So of course Model-View-Controller is seen as an application architecture – we separated the concerns of our application according to that pattern, right?

Except, in retrospect, it’s not. One of the big leaps we have to make is to realize that MVC is for the user interface portion of our systems, just like Fowler notes. We on the server side think the user interface is HTML, CSS, and Javascript, but it’s not. Instead, the user interface is the HTTP Request and Response. In other words, the template is not the View.

Once we make that conceptual leap, we begin to realize that the Model layer is the entry point to “the application”. That is where “the application” should live, or at least where it should look like it lives. The Controller should have little to do but take input from the Request and pass it to the Model; likewise, the View should do nothing but put together the Response using the output from the Model.

With that idea of server-side MVC, we then begin to see that a lot of what’s in server-side MVC frameworks is “too much.” Framework functionality that is not related to merely taking input from a Request and presenting output through a Response becomes entirely secondary to, perhaps even actively harmful to, building well-structured applications – applications that are independent of any particular user interface.

Afterword

The server-side MVC pattern, as descended through Sun’s Model 2 architecture, is so distinctive from the client-side MVC pattern as to be a completely different thing. I realized this as I was writing my book on modernizing legacy applications, and further research led me to write up Action-Domain-Responder as a web-specific alternative to MVC. Action-Domain-Responder places a much stronger emphasis on the HTTP Request and Response as the user interface elements. If you are interested in building better applications, you may wish to read the ADR essay, and try it out in your next project.

Configuration Values Are Dependencies, Too

As part of my consulting work, I get the opportunity to review lots of different codebases of varying modernity. One thing I’ve noticed with some otherwise-modern codebases is that they often “reach out” from inside a class to retrieve configuration values, instead of injecting those values into the class from the outside. That is, they use an equivalent of globals or service-location to read configuration, instead of using dependency injection.

Here is one generic example:

<?php
class Db
{
    // backend type, hostname, username, password, and database name
    protected $type, $host, $user, $pass, $name;

    public function __construct()
    {
        $this->type = getenv('DB_TYPE');
        $this->host = getenv('DB_HOST');
        $this->user = getenv('DB_USER');
        $this->pass = getenv('DB_PASS');
        $this->name = getenv('DB_NAME');
    }

    public function newConnection()
    {
        return new PDO(
            "{$this->type}:host={$this->host};dbname={$this->name}",
            $this->user,
            $this->pass
        );
    }
}
?>

Granted, the example follows the modern practice of keeping sensitive information as environment variables. Similar examples use $_ENV or $_SERVER keys instead of getenv(). The effect, though, is global-ish or service-locator-ish in nature: the class is reaching outside its own scope to retrieve values it needs for its own operation. Likewise, one cannot tell from the outside the class what configuration values it depends on.

Is the following any better?

<?php
class Db
{
    public function __construct()
    {
        $this->type = Config::get('db.type');
        $this->host = Config::get('db.host');
        $this->user = Config::get('db.user');
        $this->pass = Config::get('db.pass');
        $this->name = Config::get('db.name');
    }
}
?>

As far as I can tell, that’s a variation on the same theme. The generic Config object acts as a global singleton to carry configuration for every possible need; it is acting as a static service locator. While service location is inversion-of-control, it is in many ways inferior to dependency injection. As before, the class is reaching outside its own scope to retrieve values it depends on.

What if we inject the generic Config object like this?

<?php
class Db
{
    public function __construct(Config $config)
    {
        $this->type = $config->get('db.type');
        $this->host = $config->get('db.host');
        $this->user = $config->get('db.user');
        $this->pass = $config->get('db.pass');
        $this->name = $config->get('db.name');
    }
}
?>

This is a little better; at least now we can tell that the Db class needs configuration of some sort, though we still cannot tell exactly which values it needs. This is the same as injecting a service locator.

Having seen all these examples, and other similar ones, in real codebases, I conclude that configuration values should be treated as any other dependency, and injected via the constructor. I suggest this approach:

<?php
class Db
{
    public function __construct($type, $host, $user, $pass, $name)
    {
        $this->type = $type;
        $this->host = $host;
        $this->user = $user;
        $this->pass = $pass;
        $this->name = $name;
    }
}
?>

Simple, clear, obvious, and easy to test. If you use a dependency injection container of some sort, it should be trivial to have it read environment variables and pass them to the Db class at construction time. (If your DI container does not support that kind of thing, you may wish to consider using a more powerful container system.)

Alternatively, I think the following may be reasonable in some cases:

<?php
class DbConfig
{
    // backend type, hostname, username, password, and database name
    protected $type, $host, $user, $pass, $name;

    public function __construct($type, $host, $user, $pass, $name)
    {
        $this->type = $type;
        $this->host = $host;
        $this->user = $user;
        $this->pass = $pass;
        $this->name = $name;
    }

    public function getDsn()
    {
        return "{$this->type}:host={$this->host};dbname={$this->name}";
    }

    public function getUser()
    {
        return $this->user;
    }

    public function getPass()
    {
        return $this->pass;
    }
}

class Db
{
    protected $dbConfig;

    public function __construct(DbConfig $dbConfig)
    {
        $this->dbConfig = $dbConfig;
    }

    public function newConnection()
    {
        return new PDO(
            $this->dbConfig->getDsn(),
            $this->dbConfig->getUser(),
            $this->dbConfig->getPass()
        );
    }
}
?>

In that example, the DbConfig manages a set of injected configuration values so that the Db object treats its own configuration as a separate concern. However, that approach is just a little too indirect and open-to-abuse for my taste most of the time. The temptation is to start putting more and more inside the DbConfig object, and you end up with a mini-service-locator.

To sum up: Configuration values are dependencies; therefore, inject configuration values the way you would any other dependency.

UPDATE: Stephan Hochdörfer notes on Twitter: “I would probably re-phrase a bit: Configuration values should be treated like deps. Not sure if u can say that they are deps ;).” The point is well-taken, though it may be a distinction without a difference. If the class cannot operate properly without a particular value, whether that value is a scalar or an object, I think it’s fair to say the class is dependent on that value.

Modernizing Serialized PHP Objects with class_alias()

Several weeks ago, a correspondent presented a legacy situation that I’ve never had to deal with. He was working his way through Modernizing Legacy Applications in PHP, and realized the codebase was storing serialized PHP objects in a database. He couldn’t refactor the class names without seriously breaking the application.

I was carefully moving my classes to a PSR-0/4 structure when I found this. The application saves the output of serialize($shoppingCart) in a BLOB column, and unserializes it later to recreate the ShoppingCart object. The serialized object string looks like this:

O:12:"ShoppingCart":17:{s:13:"\00Basket\00items";a:25:{...}; ...}

See how the class name is embedded in the serialized string? If I rename the ShopppingCart class to PSR-0/4 namespace, then the old class won’t be found when the application tries to unserialize() the serialized representation. How can I begin refactoring this without breaking the whole application?

Before I was able to reply, my correspondent ended up changing the serialization strategy to use JSON, which was a rather large change. It ended up well, but it turns out there is a less intrusive solution: class_alias().

If you’re in a serialized situation, and you need to change a class name, you can use a class_alias() to point the old class name to the new one. (Call class_alias() early in execution, probably before you register your autoloader.) You can then rename and move the class according to PSR-0/4 rules, and PHP will handle the rest for you.

For example, if you renamed ShoppingCart to Domain\Orders\Cart, you might do this:

class_alias('ShoppingCart', 'Domain\Orders\Cart');

Now when you call unserialize($shoppingCart) to create an object, PHP will create it as an instance of Domain\Orders\Cart instead of ShoppingCart. Re-serialized representations of the recreated object will retain the new class name, not the old one: O:18:"Domain\Orders\Cart":....

As soon as there are no more serialized representations using the old class name, you can remove the class_alias() call entirely.

MLAPHP Boot Camp!

You may have seen my introductory talk on steps toward modernizing a legacy application at Nashville PHP or at php[world]. That was the presentation that formed the first few chapters of Modernizing Legacy Applications in PHP.

The MLAPHP book has been a great help to a lot of people. But there are a lot of other developers who learn best by watching-and-listening first, instead of reading. They know it can be a great help to have someone talk them through the whole moderizing process, and see the steps being performed.

Usually you have to go to a conference for that sort of thing. You have to pay for the ticket, and for the hotel, and for travel, and meals, and everything else. Even if your employer helps with the money, there’s still the time involved: you have to be away for several days at a time.

Well, now you can have a conference-level training experience in the comfort of your own home, without the travel hassle and hotel expense. I have put together an online weekend “boot camp” where we go through the entire modernization process, from basic prerequisities at the beginning to setting up a DI container at the end. After that end-to-end study of the steps involved, you should be able to jump-start your own modernizing project much more easily.

The 8-hour camp will combine lecture and “live” coding with examples. In particular, there will be plenty of opportunity for you to get answers specific to your own modernizing situation, and to chat with other attendees.

The two-day camp will be on Sat 11 July and Sun 12 July. Each day will run from 12 noon to 4pm (US Central Daylight), with time for breaks built in. That makes it 10am-2pm for Pacific time zone attendees, and 1-5pm for Eastern, which should be convenient for almost everyone.

With your ticket to attend the boot camp, you get:

  • Two days of online boot camp, four hours each
  • A review of the entire modernization process from beginning to end
  • A bonus recording of the camp
  • For $399

This might be the only time I lead this kind of boot camp, so get your ticket while there are still seats available!

Modernizing Legacy PHP: From Service Locator To Dependency Injection

In an earlier article I described how to start moving away from singletons in favor of dependency injection. It occurs to me that the process for moving away from Service Locator is almost exactly the same, except that we use the container outside the class instead of inside it.

Let’s say we have a class that uses a Service Locator. First we examine the class for all uses of the locator. Then, we create constructor parameters for the dependencies it extracts from the locator, and add setter code for those dependencies in the constructor body. For example, we can convert the above Service Locator example classes to these dependency-injected variations:

<?php
class FooClass
{
    protected $db;
    public function __construct(Database $db)
    {
        $this->db = $db;
    }
}

class BarClass
{
    protected $db;
    public function __construct(Database $db)
    {
        $this->db = $db;
    }
}
?>

Finally, any time we instantiate one of these dependency-injected classes, we use the locator outside the class to retrieve the dependencies. We then pass them to the new call for the class. For example:

<?php
// for FooClass
$db = $container->get('db');
$foo = new FooClass($db);

// for BarClass
$db = StaticContainer::get('db');
$bar = new BarClass($db);
?>

Now the class dependencies are explicit and predictable, instead of implicit and unpredictable (i.e., the class might depend on any combination of dependencies hidden inside the container). It is also somewhat easier to build a test, since we only have to build the dependencies themselves, not the container that holds the dependencies.

Afterword

Are you overwhelmed by a legacy PHP application? Have you inherited a spaghetti mess of code? Does it use globals everywhere, so that a fix in one place causes a bug somewhere else? Does every feature addition feel like slogging through a swamp of includes?

It doesn’t have to be that way. “Modernizing Legacy Applications in PHP” gives you step-by-step instructions on how to get your legacy code under control by eliminating globals and separating concerns. Each chapter shows you exactly one task and how to accomplish it, along with common questions related to that task.

When you are done, you will come and go through your code like the wind. Your application will have become autoloaded, dependency injected, unit tested, layer separated, and front controlled. And you will have kept it running the whole time.

Buy the book today!

Legacy Refactor Question: What About Singletons?

In my talk and my book on refactoring legacy PHP code, one of the early steps is to start removing globals and replace them with dependency injection (not a container, just injecting dependencies by hand.) I addressed the use of the global keyword and the $_GET (et al) superglobals, but I did not address singletons directly. I had a Reddit correspondent ask about singletons recently, and how to refactor away from them; I answered on Reddit, where you can read the full conversation.

Quick Hits: ADR Pattern Progress, MLAPHP Softcover, Aura Notices

Just some short updates today, since last week was so very busy:

1. The “Action-Domain-Responder” (née “Action-Domain-Response”) refinement of the MVC pattern has received some positive criticism and attention. In addition to example code updates, I have added responses (heh) to critiques regarding the Resource-Method-Representation pattern, and will soon be adding a response regarding “Entity-Interactor-Boundary” (aka “Entity-Contol-Boundary”). Many thanks to the commenters who recognized that they were already doing something along the lines of ADR; this helps to validate the pattern as something that already occurs “in the wild.” If you like the pattern offering, please star it at Github.

2. Modernizing Legacy Applications in PHP is now available for purchase in softcover via Lulu.com.

3. It looks like the next two Aura v2 releases will be Aura.View (the view system, specifically the “reduced” branch) and Aura.Html (the HTML helper collection that can be used by any view system). I’ve done a lot of work on them in the past few days and they’re beginning to feel like they’re ready.

Cheers all!

How To Modernize Your Legacy PHP Application

It is accomplished: “Modernizing Legacy Applications in PHP”, the book that will help you modernize your legacy PHP codebase, is complete. You can get it now it at https://leanpub.com/mlaphp.


Is your legacy PHP application composed of page scripts placed directly in the document root of the web server? Do your page scripts, along with any other classes and functions, combine the concerns of model, view, and controller into the same scope? Is the majority of the logical flow incorporated as include files and global functions rather than class methods?

If so, you already know that the wide use of global variables means that making a change in one place leads to unexpected consequences somewhere else. These and other factors make it overly difficult and expensive for you to add features and fix bugs. Working with your legacy application feels like dragging your feet through mud.

But it doesn’t have to be that way! Modernizing Legacy Applications in PHP will show you how to modernize your application by extracting and replacing its legacy artifacts. We will use a step-by-step approach, moving slowly and methodically, to improve your application from the ground up.

Moreover, we will keep your application running the whole time. Each completed step in the process will keep your codebase fully operational with higher quality. When we are done, you will be able to breeze through your code like the wind. Your code will be autoloaded, dependency-injected, unit-tested, layer-separated, and front-controlled.


From the Foreword by Adam Culp:

Developing with PHP has really matured in recent years, but it’s no secret that PHP’s low level of entry for beginners helped create some nasty codebases. Companies who built applications in the dark times simply can’t afford to put things on hold and rebuild a legacy application, especially with today’s fast paced economy and higher developer salaries. To stay competitive, companies must continually push developers for new features and to increase application stability. This creates a hostile environment for developers working with a poorly written legacy application. Modernizing a legacy application is a necessity, and must happen. Yet knowing how to create clean code and comprehending how to modernize a legacy application are two entirely different things.

But understanding how to use these refactoring processes on a legacy codebase is not straight forward, and sometimes impossible. The book you’re about to read bridges the gap, allowing developers to modernize a codebase so refactoring can be applied for continued enhancement.


Early reviews and testimonials from the feedback page:

  • “This is one of those books that PHP developers from all skill levels will be able to glean value from, and I know after just a single read-through that it will be an oft-referenced resource when I need to convert my old legacy-based procedural code into something cleaner, object-oriented, and testable.” (J. Michael Ward)

  • “As I followed the exercises in the book, my questions almost seemed to be anticipated and answered before the chapter was over. Structurally the book is very well paced. Chapters that I breezed over tended to be more useful than I thought they would be. I’ve been testing for years and I still picked up some useful tidbits about structuring my tests. Just having the step-by-step advice of an expert really made a difference.” (James Fuller)

  • “Reading through the book, it feels like you’re pair programming with the author. I’m at the keyboard, driving, and the author is navigating, telling me where to go and what to do next. Each step is practical, self-contained and moves you closer to the end goal you seek: maintainable code.” (Joel Clermont)

  • “This book helped me slay a 300k line of code giant and has allowed me to break out my shell. The refactored code has 15% code coverage for unit testing which grows every day.” (Chris Smith)


If you feel overwhelmed by a legacy application, “Modernizing Legacy Applications in PHP” is the book for you. Purchase it today and get started making your own life easier!