PDS “skeleton” Standard Now In Beta

I am proud to announce the immediate availability of pds/skeleton 1.0.0-beta1 for inclusion in your PHP package.

The pds/skeleton standard defines specific names for specific kinds of root-level directories in PHP packages. If you use all the directory and file names in the standard, your package will look something like this:

bin/            # executable files
config/         # configuration files
docs/           # documentation
public/         # web server files
resources/      # other resource files
src/            # PHP source code
tests/          # tests
CHANGELOG       # change notes
CONTRIBUTING    # contribution guidelines
LICENSE         # licensing and copyright
README          # read-me-first file

Believe it or not, roughly 78 thousand packages on Packagist, from tens of thousands of vendors, already appear compliant with pds/skeleton.

To formally indicate that your package attempts to comply to pds/skeleton, add it to your package manifest as a development requirement. For example, if you use Composer, you can do something like the following:

"require-dev": {
    "pds/skeleton": "~1.0"
}

(Alternatively, mention in your README that the package attempts to comply with pds/skeleton.)

This standard will remain in beta for at least two weeks, to give ample time for additional public review. If you find errors or omissions, please be sure to open an issue on it.


As a side note, Producer will end up supporting pds/skeleton and validating against it by default. This will mean a BC break for Producer, so look for a 3.x version soon!

ADR with HTTP Caching

Great conversation with “MatTheCat” on Github today, regarding HTTP caching headers and the ADR pattern.

A quick summary:

  • You want to return a 304 “Not Modified” response if the proper preconditions are met.
  • You can only tell if the resource is unmodified if you touch some form of storage.
  • In ADR, one rule of thumb is “if it touches storage, it probably goes in the Domain.”
  • So you need to do all the precondition-checking in the Domain, not in the Action.
  • The Domain can return a payload indicating “not modified” as appropriate, and the Responder can read that payload status to send back a 304.

You can read the whole thing here. Thanks for the great question Mathieu!

Efficient use of mysqli_result::$num_rows

I frequently see this pattern in legacy applications using mysqli:

$result = $mysqli->query("
    SELECT *
    FROM table_name
    WHERE conditions = 'whatever'
");
if ($result && $result->num_rows > 0) {
    return;
}

The developer’s intent here is to see if there are any rows at all in the database that match a certain condition. He does so by issuing a query, then asking the result object how many rows it has. The developer doesn’t actually want any data from the result, and doesn’t care about the actual row-count itself; this is just a check to see if at least one row exists in the database.

This is a poor conservation of resources. The database does the work needed to select all the columns for all the rows matching the conditions, allocates memory for them, and returns them. But the developer discards all that immediately.

To accomplish the same end, it is less resource-intensive and just as effective to query for a single column and limit the results to a single row:

$result = $mysqli->query("
    SELECT col_name
    FROM table_name
    WHERE conditions = 'whatever'
    LIMIT 1
");
if ($result && $result->num_rows > 0) {
    return;
}

Now the database only does the work needed for a single column and a single row.

(As a side note, I find it interesting that I have not seen this pattern at all in projects using PDO. I’m not sure why this would be. Perhaps there is some originating example code for mysqli somewhere that has gained a life of its own through copying and reuse.)

UPDATE: Perhaps a better way to conserve resources, courtesy of Reddit user marcjschmidt, is to use a COUNT() in the query, then fetch the count of rows, something more like this …

$result = $mysqli->query("
    SELECT COUNT(*)
    FROM table_name
    WHERE conditions = 'whatever'
");
if ($result && $result->fetch_array()[0] > 0) {
    return;
}

… thereby avoiding the use of mysqli_result::$num_rows completely.

UPDATE 2: Lots of commentary in the above Reddit thread. To summarize this blog post: selecting all columns of all rows, then examining $num_rows, and then discarding the result set, is a terrible way of determining if there are any matching rows at all in the database. It is trivially easy to something much better, whether by using a LIMIT 1 and $num_rows, or some form of COUNT(), or perhaps some other approach.

PHP-PDS: Interview on Voices of the ElePHPant

My good friend Cal Evans interviewed me about the PHP-PDS initiative and its first offering, pds/skeleton last week; here it is for you to enjoy.

(Via https://voicesoftheelephpant.com/2016/12/20/interview-paul-m-jones/.)

While we’re here, you might want to check out his virtual user group Nomad PHP, and the series of one-day virtual training conferences at DayCamp4Developers, because you’ll probably learn something useful there.

PECL Request Extension: Beta 1 Released!

I am happy to announce that the PECL extension for server-side request and response objects has reached beta status! (Documentation here.)

This release adds four new read-only properties to the ServerRequest object:

  • $forwarded is an array representation of $_SERVER['HTTP_FORWARDED'] (cf. the Forwarded HTTP Extension),

  • $forwardedFor is an array computed from treating $_SERVER['HTTP_X_FORWARDED_FOR'] as comma-separated values.

  • $forwardedHost is the $_SERVER['HTTP_X_FORWARDED_HOST'] value, and

  • $forwardedProto is $_SERVER['HTTP_X_FORWARDED_PROTO'] value.

If you use trusted proxies a lot in your work, you can use those values to compute the “real” client IP address in your application.

This completes the intended initial functionality of the extension. You should install it and try it out, because it might be make your work easier.

Package Development Standards: “pds/skeleton” Now Open For Review!

The new Package Development Standards initiative is proud to present its first publication, pds/skeleton (and the related research) for public review. If you are a package author, you are invited to post your comments and criticisms of the publication as issues on the relevant Github repository.

The pds/skeleton publication describes a set of standard top-level PHP package directories and files. If you are an author of more than three packages on Packagist, chances are you already follow the standard! That’s because PDS initiative researches the PHP package ecosystem to recognize commonly adopted development practices. (See this list of over than 47,000 packages; if your package is there, it has been tentatively noted as already compliant.)

The public review period will last at least 2 weeks, perhaps longer. If there are no substantial errors revealed during that time, the standard will advance to “beta” status for adoption, and then to “stable” if no further errors are discovered in use.

Meanwhile, if you review the publication and determine that your package follows it, you should add pds/skeleton to your Composer “require-dev” block, because that will help the initiative track formal adoption rates.

The Iron Law of (Open Source) Bureaucracy

Something to watch out for in open-source projects: Pournelle’s “Iron Law of Bureaucracy.” In the organization around your project, you will find two kinds of people:

  1. Those who concentrate on the goals of the organization. These are generally coders, devops, and infrastructure folks.
  2. Those who concentrate on the organization itself. These are generally managers of people within the organization, and controllers of how & when the organization communicates within itself and outside itself.

Pournelle’s Iron Law asserts that group 2 will always end up in control of the organization.

Police your membership accordingly.

PHP file_get_contents() HTTPS/SSL error on Mac OS

Problem on Mac OS Sierra (10.12.x): Using file_get_contents() for an https resource fails with error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed.

This solution appears to be legit, although I am not primarily a security guy, so real security folks should chime in if they seem something untoward:

% sudo mkdir -p /usr/local/libressl/etc/ssl/certs
% sudo curl -o /usr/local/libressl/etc/ssl/cert.pem https://curl.haxx.se/ca/cacert.pem

This creates the default_cert_file directories and imports the latest certificate store from curl.haxx.se.

Source: PHP on macOS Sierra can’t access SSL data – andrewyager.com

Independent Packages and Subtree Splits

You’ll sometimes see a PHP package hosted in a Github repository with the heading or subtitle “[READ ONLY] Subtree Split”. This indicates that the package is actually copied from another codebase (usually a framework) and is not intended to be worked on separately from that other codebase.

As such, a “subtree split” is not necessarily a sign of an independent package. Using a subtree split to publish a package says to me that the authors’ concentration is on the framework-as-a-whole, not on the package in-and-of- itself.

For example, Symfony does subtree splits for its components, as does Laravel for its Illuminate components. Those packages from the frameworks are not developed independently; they only move forward as part of the whole framework.

In these cases, you often end up with composer.json in the framework origin directories, which is not something I generally expect. Further, the framework subdirectories may have their own src/tests/docs/etc. directories. They are there so that the subtree split can have them available at their own top level, but in the origin framework, it is again something I find unexpected.

I say: if you’re going to advertise independent packages, actually write them independently. Let them be their own thing. Aura has done it that way since its beginning, and Zend Framework converted to that approach in version 3. Then you can compose the truly independent packages into a framework, instead of subtree-splitting your framework into pseudo-independent packages that are still bound to the origin framework development and release process.

PHP ssh2.sftp opendir/readdir fix

This bug https://bugs.php.net/bug.php?id=73597 related to the PECL ssh2 extension bit us yesterday, so this post is a public service announcement that will (hopefully) save you from writing your own workaround like I almost did.

Problem: PHP 5.6.28 (and apparently 7.0.13) introduced a security fix to URL parsing, that caused the string interpolation of the $sftp resource handle to no-longer be recognized as a valid URL. In turn, that causes opendir(), readdir(), etc. to fail when you use an $sftp resource in the path string, after an upgrade to one of those PHP versions.

Solution: Instead of using "ssh2.sftp://$sftp" as a stream path, convert $sftp to an integer like so: "ssh2.sftp://" . intval($sftp) . "/". Then it will work just fine.

Thanks to the people who fixed the URL parsing security flaw, thanks to the people who wrote the PECL ssh2 extension, and thanks to the people who provided the fix.