Paul M. Jones

Don't listen to the crowd, they say "jump."

How Do You See The PHP-FIG?

From a Reddit discussion; please reply over there.

There are some ongoing discussions on the PHP-FIG mailing list about, among other things, how the FIG is seen by the wider PHP community.

Since an earlier discussion pointed out that perhaps the FIG, while well-known, doesn’t do enough “active outreach”, consider this an attempt to “reach out.”

Do you think:

  1. The FIG is a bunch of self-aggrandizing elitist jerks who couldn’t write a competent or useful “proposed standards recommendation” if their lives depended on it, and should disband entirely;

  2. The FIG, while aware that the wider PHP community is watching, writes PSRs primarily for itself, and others can adopt or ignore as they wish;

  3. The FIG has become the closest thing to a userland standards group that the PHP community has, and should accept that role;

  4. Some other opinion?

Thanks in advance for your thoughtful, considered opinion, whether positive or negative.

Again, please comment at Reddit.


How To Think Real Good

Via http://meaningness.com/metablog/how-to-think. All of the following are quotes from the article that, in general, appeal to my priors. All emphasis is in the original, which you should read in its entirety.


The implicit assumption is that the problem Bayesianism solves is most of rationality, and if I’m unimpressed with Bayesianism, I must advocate some other solution to that problem. I do have technical doubts about Bayesianism, but that’s not my point. Rather, I think that the problem Bayesianism addresses is a small and easy one.

- Bayesianism is a theory of probability.

- Probability is only a small part of epistemology.

- Probability is only a small part of rationality.

- Probability is a solved problem. It’s easy. The remaining controversies in the field are arcane and rarely have any practical consequence.

My answer to “If not Bayesianism, then what?” is: all of human intellectual effort.

* * *

Understanding informal reasoning is probably more important than understanding technical methods.

* * *

Many of the heuristics I collected for “How to think real good” were about how to take an unstructured, vague problem domain and get it to the point where formal methods become applicable. ... Finding a good formulation for a problem is often most of the work of solving it.

* * *

Suppose you want to understand the cause of manic depression. For every grain of sand in the universe, there is the hypothesis that this particular grain of sand is the sole cause of manic depression. Finding evidence to rule out each one individually is impractical. ... [T]here is an infinite list of logically possible causes. ... We can’t even imagine them all, much less evaluate the evidence for them. So:

Before applying any technical method, you have to already have a pretty good idea of what the form of the answer will be.

* * *

Choosing a good vocabulary, at the right level of description, is usually key to understanding.

* * *

1. A successful problem formulation has to make the distinctions that are used in the problem solution.

...

2. A successful problem formulation has to make the problem small enough that it’s easy to solve.

* * *

It’s important to understand that problem formulations are never right or wrong.

Truth does not apply to problem formulations; what matters is usefulness.

In fact,

All problem formulations are “false,” because they abstract away details of reality.

* * *

[I]f you don’t know the solution to a problem, how do you know whether your vocabulary makes the distinctions it needs? The answer is: you can’t be sure; but there are many heuristics that make finding a good formulation more likely. Here are two very general ones:

Work through several specific examples before trying to solve the general case. Looking at specific real-world details often gives an intuitive sense for what the relevant distinctions are.

Problem formulation and problem solution are mutually-recursive processes.

You need to go back and forth between trying to formulate the problem and trying to solve it.

* * *

If a problem seems too hard, the formulation is probably wrong. Drop your formal problem statement, go back to reality, and observe what is going on.

* * *

Learn from fields very different from your own. They each have ways of thinking that can be useful at surprising times. Just learning to think like an anthropologist, a psychologist, and a philosopher will beneficially stretch your mind.

...

If you only know one formal method of reasoning, you’ll try to apply it in places it doesn’t work.

* * *

- Figuring stuff out is way hard.

- There is no general method.

- Selecting and formulating problems is as important as solving them; these each require different cognitive skills.

- Problem formulation (vocabulary selection) requires careful, non-formal observation of the real world.

- A good problem formulation includes the relevant distinctions, and abstracts away irrelevant ones. This makes problem solution easy.

- Little formal tricks (like Bayesian statistics) may be useful, but any one of them is only a tiny part of what you need.

- Progress usually requires applying several methods. Learn as many different ones as possible.

- Meta-level knowledge of how a field works--which methods to apply to which sorts of problems, and how and why--is critical (and harder to get).



Configuration Values Are Dependencies, Too

As part of my consulting work, I get the opportunity to review lots of different codebases of varying modernity. One thing I’ve noticed with some otherwise-modern codebases is that they often “reach out” from inside a class to retrieve configuration values, instead of injecting those values into the class from the outside. That is, they use an equivalent of globals or service-location to read configuration, instead of using dependency injection.

Here is one generic example:

<?php
class Db
{
    // backend type, hostname, username, password, and database name
    protected $type, $host, $user, $pass, $name;

    public function __construct()
    {
        $this->type = getenv('DB_TYPE');
        $this->host = getenv('DB_HOST');
        $this->user = getenv('DB_USER');
        $this->pass = getenv('DB_PASS');
        $this->name = getenv('DB_NAME');
    }

    public function newConnection()
    {
        return new PDO(
            "{$this->type}:host={$this->host};dbname={$this->name}",
            $this->user,
            $this->pass
        );
    }
}
?>

Granted, the example follows the modern practice of keeping sensitive information in environment variables. Similar examples use $_ENV or $_SERVER keys instead of getenv(). The effect, though, is global-ish or service-locator-ish in nature: the class is reaching outside its own scope to retrieve values it needs for its own operation. Likewise, one cannot tell from outside the class what configuration values it depends on.

Is the following any better?

<?php
class Db
{
    protected $type, $host, $user, $pass, $name;

    public function __construct()
    {
        $this->type = Config::get('db.type');
        $this->host = Config::get('db.host');
        $this->user = Config::get('db.user');
        $this->pass = Config::get('db.pass');
        $this->name = Config::get('db.name');
    }
}
?>

As far as I can tell, that’s a variation on the same theme. The generic Config object acts as a global singleton to carry configuration for every possible need; it is acting as a static service locator. While service location is inversion-of-control, it is in many ways inferior to dependency injection. As before, the class is reaching outside its own scope to retrieve values it depends on.

What if we inject the generic Config object like this?

<?php
class Db
{
    protected $type, $host, $user, $pass, $name;

    public function __construct(Config $config)
    {
        $this->type = $config->get('db.type');
        $this->host = $config->get('db.host');
        $this->user = $config->get('db.user');
        $this->pass = $config->get('db.pass');
        $this->name = $config->get('db.name');
    }
}
?>

This is a little better; at least now we can tell that the Db class needs configuration of some sort, though we still cannot tell exactly which values it needs. This is the same as injecting a service locator.

Having seen all these examples, and other similar ones, in real codebases, I conclude that configuration values should be treated as any other dependency, and injected via the constructor. I suggest this approach:

<?php
class Db
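{
    // backend type, hostname, username, password, and database name
    protected $type, $host, $user, $pass, $name;

    public function __construct($type, $host, $user, $pass, $name)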
    {
        $this->type = $type;
        $this->host = $host;
        $this->user = $user;
        $this->pass = $pass;
        $this->name = $name;
    }
}
?>

Simple, clear, obvious, and easy to test. If you use a dependency injection container of some sort, it should be trivial to have it read environment variables and pass them to the Db class at construction time. (If your DI container does not support that kind of thing, you may wish to consider using a more powerful container system.)
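
As a minimal sketch of that wiring, assuming no particular container: the environment is read once, at bootstrap, and the values are passed into the constructor. The newDbFromEnv() factory, the getDsn() accessor, and the literal test values are hypothetical names added for illustration; a DI container definition would play the same role as the factory.

```php
<?php
// Sketch only: the Db class from above, plus a hypothetical getDsn()
// accessor so the injected values can be inspected.
class Db
{
    // backend type, hostname, username, password, and database name
    protected $type, $host, $user, $pass, $name;

    public function __construct($type, $host, $user, $pass, $name)
    {
        $this->type = $type;
        $this->host = $host;
        $this->user = $user;
        $this->pass = $pass;
        $this->name = $name;
    }

    public function getDsn()
    {
        return "{$this->type}:host={$this->host};dbname={$this->name}";
    }
}

// Hypothetical bootstrap-level factory: the only place that touches the
// environment. The Db class itself never calls getenv().
function newDbFromEnv()
{
    return new Db(
        getenv('DB_TYPE'),
        getenv('DB_HOST'),
        getenv('DB_USER'),
        getenv('DB_PASS'),
        getenv('DB_NAME')
    );
}

// In a test, no environment setup is needed at all; pass literal values.
$testDb = new Db('mysql', 'localhost', 'test_user', 'test_pass', 'test_db');
echo $testDb->getDsn(), PHP_EOL;
```

The constructor signature now documents exactly which values the class depends on, and the test never has to touch the environment.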

Alternatively, I think the following may be reasonable in some cases:

<?php
class DbConfig
{
    // backend type, hostname, username, password, and database name
    protected $type, $host, $user, $pass, $name;

    public function __construct($type, $host, $user, $pass, $name)
    {
        $this->type = $type;
        $this->host = $host;
        $this->user = $user;
        $this->pass = $pass;
        $this->name = $name;
    }

    public function getDsn()
    {
        return "{$this->type}:host={$this->host};dbname={$this->name}";
    }

    public function getUser()
    {
        return $this->user;
    }

    public function getPass()
    {
        return $this->pass;
    }
}

class Db
{
    protected $dbConfig;

    public function __construct(DbConfig $dbConfig)
    {
        $this->dbConfig = $dbConfig;
    }

    public function newConnection()
    {
        return new PDO(
            $this->dbConfig->getDsn(),
            $this->dbConfig->getUser(),
            $this->dbConfig->getPass()
        );
    }
}
?>

In that example, the DbConfig manages a set of injected configuration values so that the Db object treats its own configuration as a separate concern. However, that approach is just a little too indirect and open-to-abuse for my taste most of the time. The temptation is to start putting more and more inside the DbConfig object, and you end up with a mini-service-locator.

To sum up: Configuration values are dependencies; therefore, inject configuration values the way you would any other dependency.

UPDATE: Stephan Hochdörfer notes on Twitter: "I would probably re-phrase a bit: Configuration values should be treated like deps. Not sure if u can say that they are deps ;)." The point is well-taken, though it may be a distinction without a difference. If the class cannot operate properly without a particular value, whether that value is a scalar or an object, I think it's fair to say the class is dependent on that value.



Stop Fighting ISIS, Start Fighting Saudi Arabia

But ISIS is only a symptom of the larger disease, which is the spread of fundamentalist Wahhabist Islam from Saudi Arabia all over the world. This has become such a problem that even Germany -- which has precipitated the current "migrant" crisis in central and western Europe -- has publicly warned the Saudis against their fifth-column work. ...

Until Saudi Arabia is forcefully and directly confronted over its international financing of extremism, events like Paris and San Bernardino will continue and multiply.

Also, "The United States is not a nation-state in the sense the European countries are; it is not a country of blood relations, but of fealty to a document of western, Enlightenment principles regarding the relationship of citizen and state." Source: End the War on ISIS Now.


First Stable Aura 3.x Releases

Today we released the first round of stable Aura 3.x packages:

Since the announcement of the plans for Aura 3.x, we have made one small concession: the minimum PHP version is 5.5, instead of 5.6 as originally announced. Even so, all the 3.x packages are tested and operational on PHP 5.6, PHP 7, and HHVM.

Via the Aura blog at http://auraphp.com/blog/2015/12/01/aura-3-stable-releases/.



SQL Schema Naming Conventions

Several weeks ago I asked on Twitter for SQL schema naming conventions from DBA professionals. (I'm always interested in the generally-accepted practices of related professions; when I can, I try to make my work as compatible with theirs as possible.)

I got back only a handful of responses, representing MySQL, PostgreSQL, and DB2 administrators; that is really not enough for a statistically useful sample. Even so, I'm going to present their anonymized responses here, because they led me to work I had not previously considered at length.

My questions were:

  1. For table names, do you prefer plural (posts), singular (post), or something else?

  2. For primary key column names, do you prefer plural (posts_id), singular (post_id), just plain id, or something else?

  3. How do you name many-to-many association tables? For example, if many posts relate to many tags, do you prefer combining the table names in plural or singular? If so, do you separate them with an underscore? The examples would be posts_tags for plural, and post_tag for singular. Or do you prefer another approach?

The answers follow.

Table Names

  • "Table and columns are singular, so create table item, account and not items, accounts."

  • "Keep names singular. The reason behind that is that it was easy to reference column name with table name. Example: "user".first_name. The biggest challenge going with singular name is that most of the popular table names are considered keywords for the databases. Some of the examples: user, order, name, type etc."

  • "Table names should be plural. That's how I learned it, and it seems to make sense that a name for a collection of rows should be plural."

  • "I prefer plural table names."

  • "Plural - because it is a set of things."

Primary Key Names

  • "Every table must have an id primary key (surrogate) using a sequence (identity is ok sometimes)."

  • "I prefer singular names for column without any prefix or suffix."

  • "I would have said post_id but for the past several years I've switched to just id."

  • "I prefer primary key always id."

  • "Singular. For example, UserID."

Association Table Names

  • "If I follow singular table names, I use post_tag_mapping. I like to use _mapping suffix to explicitly identify such tables."

  • "We use plural_plural."

  • "I prefer mapping tables singular."

  • "I combine them as SingularPlural and generally have the dominant entity first as it owns things in the second entity. Ex: PostTags or UserRoles or StudentTests."

What Does This Tell Us?

Not a whole lot, it seems. We might say "there's no generally accepted practice" but with only 5 respondents that's not a reliable conclusion.

Having said that, one respondent summed up what seemed to be a common sentiment this way: "Most people will probably agree it's about agreeing on a standard, and then being consistent with it." I think that's often the case with standards.

Another respondent noted, "Once upon a time you had production DBAs, and development ones that could do data modelling. These days it's just production DBAs, and we always inherit designs as we come in later." That certainly squares with my own experience. DBA professionals are generally hired much later as the business matures, and they're stuck with whatever non-DBA-professional decisions were made before they arrived. The pre-existing schemas bind their hands.

What Would Joe Celko Do (WWJCD)?

However, more than one respondent referred to Joe Celko's SQL Programming Style, which I immediately ordered and read through.

I thought Celko's recommendations made a lot of sense. At first I thought I would have to copy the relevant sections here, but it turns out that Simon Holywell has already done so at his SQL Style Guide.

Celko's answers to the above questions appear to be:

  1. For tables: "Use a collective name or, less ideally, a plural form. For example (in order of preference) staff and employees." This one was especially interesting to me. The idea of using a collective name, not merely a plural name, makes a lot of sense to me, though it does not lend itself to automation.

  2. For primary key names: "Where possible avoid simply using id as the primary identifier for the table." I gather from other reading that the recommendation is to use a natural identifier as a prefix; in the case of a posts table, that would be post_id.

  3. For association tables: "Avoid, where possible, concatenating two table names together to create the name of a relationship table. Rather than cars_mechanics prefer services." On seeing it this way, it also makes sense to me, and I do not recall seeing it stated that way before.

Further, Celko lays out a series of uniform suffixes for column names. That by itself is pretty interesting.

Conclusion

If you're starting a project from scratch, and are interested in following the advice of at least one SQL and DBA professional giant, you may wish to review the recommendations at http://www.sqlstyle.guide and try them out. Even better, buy Celko's book. At the very least, by reading those recommendations, you'll have gained a greater range of options to choose from.

UPDATE: If it was not clear from the introduction, this exercise was about discovering generally-accepted practices of DBA/SQL professionals (i.e., people whose primary job is to administer a database and write SQL schemas), not the preferences of application developers who happen to use SQL databases.



How To Think About HTTP Middleware

HTTP middleware is a user interface decoration system, where the user interface is the HTTP request (input) and HTTP response (output).

HTTP middleware is not for your Domain work. The middleware is a path in to, and out of, the core Domain.
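
One way to picture this, as a rough sketch rather than any particular framework's API: the middleware below touches only the HTTP-shaped input and output, and the Domain function receives and returns plain values. Requests and responses are modeled as plain arrays for brevity, and every name here (greet(), $handler, $jsonMiddleware) is hypothetical.

```php
<?php
// Domain work: knows nothing about HTTP. (Hypothetical example function.)
function greet($name)
{
    return "Hello, {$name}!";
}

// Innermost handler: the path in to, and out of, the Domain. It turns HTTP
// input into a Domain call, and the Domain result into an HTTP response.
$handler = function (array $request) {
    $name = isset($request['query']['name'])
        ? $request['query']['name']
        : 'world';
    return ['status' => 200, 'headers' => [], 'body' => greet($name)];
};

// Middleware: decorates the user interface (request in, response out)
// without doing any Domain work itself.
$jsonMiddleware = function (array $request, callable $next) {
    $response = $next($request);
    $response['headers']['Content-Type'] = 'application/json';
    $response['body'] = json_encode(['message' => $response['body']]);
    return $response;
};

$request = ['method' => 'GET', 'path' => '/greet', 'query' => ['name' => 'PHP']];
$response = $jsonMiddleware($request, $handler);
echo $response['body'], PHP_EOL;
```

Removing $jsonMiddleware changes how the response is presented, but greet() behaves the same either way; that separation is the point.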



Why They Sent Ahmed To Juvie

Multiculturalism eliminates any shared sense of rules beyond an ever increasing tangle of bureaucratic doctrines. The administrators who sent him to a detention center were almost certainly following strict rules about how to respond to students bringing unidentifiable electronic devices into school -- those rules having been created by hysterical liberals terrified by the acts of terror committed by youths addled by prescription drugs and seeking a glorious death with huge media attention.

In order to make room for Ahmed, Jamal, J’miriquoi, Running Bear, Jorge, and Moonbeam, we subject all of them -- including lil’ Johnny the racist cracker -- to the same set of regulations, because we see all of them as potential malefactors to be treated uniformly by a blind system.

Source: Why They Sent Ahmed To Juvie - Henry Dampier