Atlas: a persistence-model data mapper

Atlas is a data mapper implementation for your persistence model (not your domain model).

As such, Atlas uses the term “record” to indicate that its objects are not domain entities. Note that an Atlas record is a passive record, not an active record; it is disconnected from the database. Use Atlas records indirectly to populate your domain entities, or directly for simple data source interactions.

No migrations. No annotations. No lazy loading. No domain models. No data-type abstractions. No behaviors. No opinions. Just data source mapping.

Read more at the Github repository.

How Do You See The PHP-FIG?

Fram a Reddit discussion; please reply over there.

There are some ongoing discussions on the PHP-FIG mailing list about, among other things, how the FIG is seen by the wider PHP community.

Since an earlier discussion pointed out that perhaps the FIG, while well-known, don’t do enough “active outreach”, consider this an attempt to “reach out.”

Do you think:

  1. The FIG is a bunch of self-aggrandizing elitist jerks who couldn’t write a competent or useful “proposed standards recommendation” if their lives depended on it, and should disband entirely.

  2. The FIG, while aware that the wider PHP community is watching, writes PSRs primarily for itself, and others can adopt or ignore as they wish;

  3. The FIG has become the closest thing to a userland standards group that the PHP community has, and should accept that role;

  4. Some other opinion?

Thanks in advance for your thoughtful, considered opinion, whether positive or negative.

Again, please comment at Reddit.

Configuration Values Are Dependencies, Too

As part of my consulting work, I get the opportunity to review lots of different codebases of varying modernity. One thing I’ve noticed with some otherwise-modern codebases is that they often “reach out” from inside a class to retrieve configuration values, instead of injecting those values into the class from the outside. That is, they use an equivalent of globals or service-location to read configuration, instead of using dependency injection.

Here is one generic example:

<?php
class Db
{
    // backend type, hostname, username, password, and database name
    protected $type, $host, $user, $pass, $name;

    public function __construct()
    {
        $this->type = getenv('DB_TYPE');
        $this->host = getenv('DB_HOST');
        $this->user = getenv('DB_USER');
        $this->pass = getenv('DB_PASS');
        $this->name = getenv('DB_NAME');
    }

    public function newConnection()
    {
        return new PDO(
            "{$this->type}:host={$this->host};dbname={$this->name}",
            $this->user,
            $this->pass
        );
    }
}
?>

Granted, the example follows the modern practice of keeping sensitive information as environment variables. Similar examples use $_ENV or $_SERVER keys instead of getenv(). The effect, though, is global-ish or service-locator-ish in nature: the class is reaching outside its own scope to retrieve values it needs for its own operation. Likewise, one cannot tell from the outside the class what configuration values it depends on.

Is the following any better?

<?php
class Db
{
    public function __construct()
    {
        $this->type = Config::get('db.type');
        $this->host = Config::get('db.host');
        $this->user = Config::get('db.user');
        $this->pass = Config::get('db.pass');
        $this->name = Config::get('db.name');
    }
}
?>

As far as I can tell, that’s a variation on the same theme. The generic Config object acts as a global singleton to carry configuration for every possible need; it is acting as a static service locator. While service location is inversion-of-control, it is in many ways inferior to dependency injection. As before, the class is reaching outside its own scope to retrieve values it depends on.

What if we inject the generic Config object like this?

<?php
class Db
{
    public function __construct(Config $config)
    {
        $this->type = $config->get('db.type');
        $this->host = $config->get('db.host');
        $this->user = $config->get('db.user');
        $this->pass = $config->get('db.pass');
        $this->name = $config->get('db.name');
    }
}
?>

This is a little better; at least now we can tell that the Db class needs configuration of some sort, though we still cannot tell exactly which values it needs. This is the same as injecting a service locator.

Having seen all these examples, and other similar ones, in real codebases, I conclude that configuration values should be treated as any other dependency, and injected via the constructor. I suggest this approach:

<?php
class Db
{
    public function __construct($type, $host, $user, $pass, $name)
    {
        $this->type = $type;
        $this->host = $host;
        $this->user = $user;
        $this->pass = $pass;
        $this->name = $name;
    }
}
?>

Simple, clear, obvious, and easy to test. If you use a dependency injection container of some sort, it should be trivial to have it read environment variables and pass them to the Db class at construction time. (If your DI container does not support that kind of thing, you may wish to consider using a more powerful container system.)

Alternatively, I think the following may be reasonable in some cases:

<?php
class DbConfig
{
    // backend type, hostname, username, password, and database name
    protected $type, $host, $user, $pass, $name;

    public function __construct($type, $host, $user, $pass, $name)
    {
        $this->type = $type;
        $this->host = $host;
        $this->user = $user;
        $this->pass = $pass;
        $this->name = $name;
    }

    public function getDsn()
    {
        return "{$this->type}:host={$this->host};dbname={$this->name}";
    }

    public function getUser()
    {
        return $this->user;
    }

    public function getPass()
    {
        return $this->pass;
    }
}

class Db
{
    protected $dbConfig;

    public function __construct(DbConfig $dbConfig)
    {
        $this->dbConfig = $dbConfig;
    }

    public function newConnection()
    {
        return new PDO(
            $this->dbConfig->getDsn(),
            $this->dbConfig->getUser(),
            $this->dbConfig->getPass()
        );
    }
}
?>

In that example, the DbConfig manages a set of injected configuration values so that the Db object treats its own configuration as a separate concern. However, that approach is just a little too indirect and open-to-abuse for my taste most of the time. The temptation is to start putting more and more inside the DbConfig object, and you end up with a mini-service-locator.

To sum up: Configuration values are dependencies; therefore, inject configuration values the way you would any other dependency.

UPDATE: Stephan Hochdörfer notes on Twitter: “I would probably re-phrase a bit: Configuration values should be treated like deps. Not sure if u can say that they are deps ;).” The point is well-taken, though it may be a distinction without a difference. If the class cannot operate properly without a particular value, whether that value is a scalar or an object, I think it’s fair to say the class is dependent on that value.

First Stable Aura 3.x Releases

Today we released the first round of stable Aura 3.x packages:

Since the announcement of the plans for Aura 3.x, we have made one small concession: the minimum PHP version is 5.5, instead of 5.6 as originally announced. Even so, all the 3.x packages are tested and operational on PHP 5.6, PHP 7, and HHVM.

Via the Aura blog at http://auraphp.com/blog/2015/12/01/aura-3-stable-releases/.

SQL Schema Naming Conventions

Several weeks ago I asked on Twitter for SQL schema naming conventions from DBA professionals. (I’m always interested in the generally-accepted practices of related professions; when I can, I try to make my work as compatible with theirs as possible.)

I got back only a handful of responses, representing MySQL, PostgreSQL, and DB2 administrators, really not enough for a statistically useful sample. Even so, I’m going to present their anonymized responses here, because they led me to work I had not previously considered at length.

My questions were:

  1. For table names, do you prefer plural (posts), singular (post), or something else?

  2. For primary key column names, do you prefer plural (posts_id), singular (post_id), just plain id, or something else?

  3. How do you name many-to-many association tables? For example, if many posts relate to many tags, do you prefer combining the table names in plural or singular? If so, do you separate them with an underscore? The examples would be posts_tags for plural, and post_tag for singular. Or do you prefer another approach?

The answers follow.

Table Names

  • “Table and columns are singular, so create table item, account and not items, accounts.”

  • “Keep names singular. The reason behind that is that it was easy to reference column name with table name. Example: "user".first_name. The biggest challenge going with singular name is that most of the popular table names are considered keywords for the databases. Some of the examples: user, order, name, type etc.”

  • “Table names should be plural. That’s how I learned it, and it seems to make sense that a name for a collection of rows should be plural.”

  • “I prefer plural table names.”

  • “Plural – because it is a set of things.”

Primary Key Names

  • “Every table must have an id primary key (surrogate) using a sequence (identity is ok sometimes).”

  • “I prefer singular names for column without any prefix or suffix.”

  • “I would have said post_id but for the past several years I’ve switched to just id.”

  • “I prefer primary key always id.”

  • “Singular. For example, UserID.”

Association Table Names

  • “If I follow singular table names, I use post_tag_mapping. I like to use _mapping suffix to explicitly identify such tables.”

  • “We use plural_plural.”

  • “I prefer mapping tables singular.”

  • “I combine them as SingularPlural and generally have the dominant entity first as it owns things in the second entity. Ex: PostTags or UserRoles or StudentTests.”

What Does This Tell Us?

Not a whole lot, it seems. We might say “there’s no generally accepted practice” but with only 5 respondents that’s not a reliable conclusion.

Havig said that, one respondent summed up what seemed to be a common sentiment this way: “Most people will probably agree it’s about agreeing on a standard, and then being consistent with it.” I think that’s often the case with standards.

Another respondent noted, “Once upon a time you had production DBAs, and development ones that could do data modelling. These days it’s just production DBAs, and we always inherit designs as we come in later.” That certainly squares with my own experience. DBA professionals are generally hired much later as the business matures, and they’re stuck with whatever non-DBA-professional decisions were made before their arrived. The pre-existing schemas bind their hands.

What Would Joe Celko Do (WWJCD) ?

However, more than one respondent referred to Joe Celko‘s SQL Programming Style, which I immediately ordered and read through.

I thought Celko’s recommendations made a lot of sense. At first I thought I would have to copy the relevant sections here, but it turns out that Simon Holywell has already done so at his SQL Style Guide.

Celko’s answers to the above questions appear to be:

  1. For tables: “Use a collective name or, less ideally, a plural form. For example (in order of preference) staff and employees.” This one was especially interesting to me. The idea of using a collective name, not merely a plural name, makes a lot of sense to me, though it does not lend itself to automation.

  2. For primary key names: “Where possible avoid simply using id as the primary identifier for the table.” I gather from other reading that the recommendation is to use a natural identifier as a prefix; in the case of a posts table, that would be post_id.

  3. For association tables: “Avoid, where possible, concatenating two table names together to create the name of a relationship table. Rather than cars_mechanics prefer services.” On seeing it this way, it also makes sense to me, and I do not recall seeing it stated that way before.

Further, Celko lays out a series of uniform suffixes for column names. That by itself is pretty interesting.

Conclusion

If you’re starting a project from scratch, and are interested in following the advice of at least one SQL and DBA professional giant, you may wish to review the recommendations at http://www.sqlstyle.guide and try them out. Even better, buy Celko’s book. At the very least, by reading those recommendations, you’ll have gained a greater range of options to choose from.

UPDATE: If it was not clear from the introduction, this exercise was about discovering generally-accepted practices of DBA/SQL professionals (i.e., people whose primary job is to administer a database and write SQL schemas), not the preferences of application developers who happen to use SQL databases.

Frameworks Are Not Tools

A friend of mine, a long time ago, asked me why it is that “gun guys” are so interested in firearms in the first place. “They’re just tools,” he said. “I have a detached interest in my tools, like the knives I use for cooking, or the tools I use in my garden, but they’re not objects of endless comparison and discussion for me.”

My response was that a pistol, or a rifle, or any other modern firearm, is not merely a tool. It is a machine. It has interrelated interdependent moving parts that all work in concert. (Additionally, it is powered by explosives, which makes it even more interesting.)

My friend got the point after that. Tools are not especially interesting because they are not especially complex. But machines are fascinating because they are complex.

With that in mind, we have to realize that frameworks are not tools. Frameworks are machines. Each one has the code equivalent of interrelated interdependent moving parts. Frameworks are fascinating for the same reasons that machines are fascinating.

So the next time someone says “use the right tool for the job” and then mentions a framework, consider that the person making the framework suggestion might not be thinking about frameworks in the right way. Indeed, they might be thinking about something else entirely, and using “the framework” as a shorthand for whatever concept they really have in mind.

Service Classes, Payloads, and Responders

Revath Kumar has a good blog post up about extracting domain logic from controllers and putting that logic in a service class. After reading it, I commented that with a little extra work, it would be easy to modify the example to something closer to the Action-Domain-Responder pattern. In doing so, we would get a better separation of concerns (especially in presentation).

Using the code that Revath gives in his blog post as a basis, we can do the following:

  1. In the service class, instead of sometimes throwing exceptions and sometimes returning arrays, we always return Payload instances. These explicitly state the result of the domain activity (“input not valid”, “order created”, “order not created”, “error”). Making the status information explicit in the Payload means that we don’t need to catch exceptions in the controller action, and that we don’t need to examine the domain objects themselves to interpret what occurred in the domain. The service class can now say explicitly what occurred in a standardized way.

  2. In the controller action, now that we don’t need to catch exceptions, we can concentrate on a much smaller set of logic: get the user input, pass it to the domain, get back the domain payload, and pass the payload to a responder.

  3. Finally, we introduce a Responder, whose job is to build (and in this case send) the response. The responder logic ends up being simplified as well.

The modified code looks like this:

use Aura\Payload\Payload;

class OrdersController extends Controller {
  public function actionCreate() {
    $orderData = Yii::app()->request->getParam('order');
    $order = new OrdersService();
    $payload = $order->create($orderData);
    $responder = new OrdersResponder();
    $responder->sendCreateResponse($payload);
  }
}

class OrdersResponder extends Responder {
  public function sendCreateResponse($payload) {
    $result = array('messages' => $payload->getMessages());
    if ($payload->getStatus() === Payload::CREATED) {
      $this->_sendResponse(200, $result);
    } else {
      $result['status'] = 'error';
      $this->_sendResponse(403, $result);
    }
  }
}

class OrdersService {

  protected function newPayload($status) {
    return new Payload($status);
  }

  public function create($orderData) {
    if(empty($orderData['items'])) {
      return $this->newPayload(Payload::NOT_VALID)
        ->setMessages([
          "Order items can't be empty."
        ]);
    }
    $items = $orderData['items'];
    unset($orderData['items']);
    try {
      $order = new Orders;
      $orderTransaction = $order->dbConnection->beginTransaction();

      $address = Addresses::createIfDidntExist($orderData);
      unset($orderData['address']);
      $orderData['address_id'] = $address->id;
      $amount = 0;
      foreach ($items as $key => $item) {
        $amount += $item['total'];
      }
      $amount += $orderData['extra_charge'];
      $orderData['amount'] = $amount;
      $order->attributes = $orderData;
      if($order->save()) {
        if(OrderItems::batchSave($items, $order->id)) {
          $orderTransaction->commit();
          $this->sendMail($order->id);
          return $this->newPayload(Payload::CREATED)
            ->setMessages([
              "Order placed successfully."
            ]);
        }
        $orderTransaction->rollback();
        return $this->newPayload(Payload::NOT_CREATED)
          ->setMessages([
            "Failed to save the items."
          ]);
      }
      else {
        // handle validation errors
        $orderTransaction->rollback();
        return $this->newPayload(Payload::ERROR)
          ->setMessages($order->getErrors());
      }
    }
    catch(Exception $e) {
      $orderTransaction->rollback();
      return $this->newPayload(Payload::ERROR)
        ->setMessages([
          "Something wrong happened"
        ]);
    }
  }
}

Now, there are still obvious candidates for improvement here. For example, we could begin separating the controller action methods into their own individual action classes. But baby steps are the right way to go when refactoring.

This small set of changes gives us a better separation of concerns, especially in terms of presentation. Remember, the “presentation” in a request/response environment is the entire HTTP response, not just the response body. The above changes make it so that HTTP headers and status code presentation work are no longer mixed in with the controller; they are now handled by a separate Responder object.

A Factory Should Create, Not Retain

In a recent Reddit conversation, some of us went off on a tangent about factories. I maintained then, and do now, that a “factory” always-and-only returns a new instance. If you have a “factory” that returns anything other than a new instance, it’s not a factory alone. In the case of factory methods, it is a factory + accessor; in the case of factory objects, it is a factory + registry.

A “factory” (whether a factory method or factory object) is one way to separate object creation from object use.

Let’s say we need to create an object to do some work in a class method:

class Example
{
    public function __construct($db)
    {
        $this->db = $db;
    }

    public function doSomething($itemId)
    {
        $data = $this->db->fetchOne(
            "SELECT * FROM items WHERE id = :id",
            ['id' => $itemId]
        );
        $item = new Item($data);

        // do some other work with the item,
        // the return the results of that work.
        return $results;
    }
}

The item creation is mixed in with the use of the item. What we want to do is separate the creation from the usage. One way to do that is to use a factory method, or a factory object, to handle object creation for us. Here’s an example of a factory method:

class Example
{
    public function __construct($db)
    {
        $this->db = $db;
    }

    public function doSomething($itemId)
    {
        $data = $this->db->fetchOne(
            "SELECT * FROM items WHERE id = ?",
            ['id' => $id]
        );
        $item = $this->item($itemId);

        // do some other work with the item,
        // the return the results of that work.
        return $results;
    }

    public function item($data) // factory method
    {
        return new Item($data);
    }
}

Here’s an example of an injected factory object:

class Example
{
    public function __construct($db, $factory)
    {
        $this->db = $db;
        $this->factory = $factory;
    }

    public function doSomethingWithItem($id)
    {
        $data = $this->db->fetchOne(
            "SELECT * FROM items WHERE id = ?",
            ['id' => $id]
        );
        $item = $this->factory->item($data);

        // do some other work with the item,
        // the return the results of that work.
        return $results;
    }
}

class Factory // factory object
{
    public function item($data) // factory method
    {
        return new Item($data);
    }
}

There were some folks in the Reddit thread that believe a factory may additionally retain the created instance for reuse. For example, let’s say we create and the reuse a collaborator object of some sort:

class Example
{
    public function doSomething()
    {
        $collab = $this->collab();
        // do stuff, then:
        return $result;
    }

    protected function collab() // factory method?
    {
        if (! $this->collab) {
            $this->collab = new Collab();
        }
        return $this->collab;
    }
}

As far as I can tell, that’s not just a factory. There are three things going on there: initializing a property, creating an object, and accessing a property. To split the concerns, one would need two methods:

class Example
{
    public function doSomething()
    {
        $collab = $this->getCollab();
        // do stuff, then:
        return $result;
    }

    protected function getCollab() // initialize + access
    {
        if (! $this->collab) {
            $this->collab = $this->newCollab();
        }
        return $this->collab;
    }

    protected function newCollab() // factory
    {
        return new Collab();
    }
}

Now the concern of creating the object is represented by newCollab() method, and the concerns of initializing and accessing the property are represented by the getCollab() method. Indeed, splitting out a factory method in this kind of situation is the exact recommendation in the GOF Design Patterns book on page 113; this is the book that defined the term “factory method” in the first place, so it seems authoritative on this point.

What happens when you add retention to what would otherwise be a factory object? For example, is the following only a factory object, or does it do more than just create and return new instances?

class WhatIsThatThing
{
    public function getService()
    {
        if (! $this->service) {
            $this->service = $this->newService();
        }
        return $this->service;
    }

    public function newService()
    {
        return new Service();
    }
}

It both creates objects, and retains the created objects. That makes it more than just a factory; it looks more like a container at this point.

Finally, regarding method names, it seems like it’s best to indicate what to expect as a return result. I default to newInstance() in factory objects that deal with only a single class, and new<Type>() when the factory object creates many different classes. Using get<Type>() indicates that a pre-existing instance (probably memoized in a property) is being returned.