RFC: ServerRequest and ServerResponse

The RFC is at https://wiki.php.net/rfc/request_response.

The message opening discussion on Internals is at http://news.php.net/php.internals/97461.

The extension itself is available at https://pecl.php.net/package/request, with documentation at https://gitlab.com/pmjones/ext-request.

(Many thanks to John Boehr for doing the actual heavy lifting of writing the C code.)


Nearly every PHP framework and library-collection since 2000 has had classes to encapsulate the “request” and “response” elements of a PHP application. A handful of examples include:

There are many others. They all do essentially the same things:

  • Copy the $_GET, $_POST, $_SERVER, etc. superglobals into a “request” object. Some make them available through a method that standardizes the logic to get a default value when a key is not present, a la return (isset($_GET[$key]) ? $_GET[$key] : $defaultValue).

  • Add convenience methods to the “request” object so that you can determine the HTTP method, the values of various headers, and so on.

  • The “response” object is a place to hold headers, cookies, status, and content, so they can all be inspected and modified before sending, and to make testing easier. (This is because the header(), setcookie(), etc. functions in PHP are not especially amenable to inspection, modification, and testing – at least, not without being wrapped somehow.)

  • The “response” object often has some convenience methods to send JSON content, send files for download, and so on.

Why do framework and library-collection authors write these request and response objects? Because PHP, even though it is a web-centric programming language, and even though it provides all sorts of classes for all sorts of functionality, it has never had classes for server-side requests and responses. This RFC helps to improve this situation in PHP 7 and later.

PDS “skeleton” Standard Now In Beta

I am proud to announce the immediate availability of pds/skeleton 1.0.0-beta1 for inclusion in your PHP package.

The pds/skeleton standard defines specific names for specific kinds of root-level directories in PHP packages. If you use all the directory and file names in the standard, your package will look something like this:

bin/            # executable files
config/         # configuration files
docs/           # documentation
public/         # web server files
resources/      # other resource files
src/            # PHP source code
tests/          # tests
CHANGELOG       # change notes
CONTRIBUTING    # contribution guidelines
LICENSE         # licensing and copyright
README          # read-me-first file

Believe it or not, roughly 78 thousand packages on Packagist, from tens of thousands of vendors, already appear compliant with pds/skeleton.

To formally indicate that your package attempts to comply to pds/skeleton, add it to your package manifest as a development requirement. For example, if you use Composer, you can do something like the following:

"require-dev": {
    "pds/skeleton": "~1.0"
}

(Alternatively, mention in your README that the package attempts to comply with pds/skeleton.)

This standard will remain in beta for at least two weeks, to give ample time for additional public review. If you find errors or omissions, please be sure to open an issue on it.


As a side note, Producer will end up supporting pds/skeleton and validating against it by default. This will mean a BC break for Producer, so look for a 3.x version soon!

Efficient use of mysqli_result::$num_rows

I frequently see this pattern in legacy applications using mysqli:

$result = $mysqli->query("
    SELECT *
    FROM table_name
    WHERE conditions = 'whatever'
");
if ($result && $result->num_rows > 0) {
    return;
}

The developer’s intent here is to see if there are any rows at all in the database that match a certain condition. He does so by issuing a query, then asking the result object how many rows it has. The developer doesn’t actually want any data from the result, and doesn’t care about the actual row-count itself; this is just a check to see if at least one row exists in the database.

This is a poor conservation of resources. The database does the work needed to select all the columns for all the rows matching the conditions, allocates memory for them, and returns them. But the developer discards all that immediately.

To accomplish the same end, it is less resource-intensive and just as effective to query for a single column and limit the results to a single row:

$result = $mysqli->query("
    SELECT col_name
    FROM table_name
    WHERE conditions = 'whatever'
    LIMIT 1
");
if ($result && $result->num_rows > 0) {
    return;
}

Now the database only does the work needed for a single column and a single row.

(As a side note, I find it interesting that I have not seen this pattern at all in projects using PDO. I’m not sure why this would be. Perhaps there is some originating example code for mysqli somewhere that has gained a life of its own through copying and reuse.)

UPDATE: Perhaps a better way to conserve resources, courtesy of Reddit user marcjschmidt, is to use a COUNT() in the query, then fetch the count of rows, something more like this …

$result = $mysqli->query("
    SELECT COUNT(*)
    FROM table_name
    WHERE conditions = 'whatever'
");
if ($result && $result->fetch_array()[0] > 0) {
    return;
}

… thereby avoiding the use of mysqli_result::$num_rows completely.

UPDATE 2: Lots of commentary in the above Reddit thread. To summarize this blog post: selecting all columns of all rows, then examining $num_rows, and then discarding the result set, is a terrible way of determining if there are any matching rows at all in the database. It is trivially easy to something much better, whether by using a LIMIT 1 and $num_rows, or some form of COUNT(), or perhaps some other approach.

PHP-PDS: Interview on Voices of the ElePHPant

My good friend Cal Evans interviewed me about the PHP-PDS initiative and its first offering, pds/skeleton last week; here it is for you to enjoy.

(Via https://voicesoftheelephpant.com/2016/12/20/interview-paul-m-jones/.)

While we’re here, you might want to check out his virtual user group Nomad PHP, and the series of one-day virtual training conferences at DayCamp4Developers, because you’ll probably learn something useful there.

PECL Request Extension: Beta 1 Released!

I am happy to announce that the PECL extension for server-side request and response objects has reached beta status! (Documentation here.)

This release adds four new read-only properties to the ServerRequest object:

  • $forwarded is an array representation of $_SERVER['HTTP_FORWARDED'] (cf. the Forwarded HTTP Extension),

  • $forwardedFor is an array computed from treating $_SERVER['HTTP_X_FORWARDED_FOR'] as comma-separated values.

  • $forwardedHost is the $_SERVER['HTTP_X_FORWARDED_HOST'] value, and

  • $forwardedProto is $_SERVER['HTTP_X_FORWARDED_PROTO'] value.

If you use trusted proxies a lot in your work, you can use those values to compute the “real” client IP address in your application.

This completes the intended initial functionality of the extension. You should install it and try it out, because it might be make your work easier.

Package Development Standards: “pds/skeleton” Now Open For Review!

The new Package Development Standards initiative is proud to present its first publication, pds/skeleton (and the related research) for public review. If you are a package author, you are invited to post your comments and criticisms of the publication as issues on the relevant Github repository.

The pds/skeleton publication describes a set of standard top-level PHP package directories and files. If you are an author of more than three packages on Packagist, chances are you already follow the standard! That’s because PDS initiative researches the PHP package ecosystem to recognize commonly adopted development practices. (See this list of over than 47,000 packages; if your package is there, it has been tentatively noted as already compliant.)

The public review period will last at least 2 weeks, perhaps longer. If there are no substantial errors revealed during that time, the standard will advance to “beta” status for adoption, and then to “stable” if no further errors are discovered in use.

Meanwhile, if you review the publication and determine that your package follows it, you should add pds/skeleton to your Composer “require-dev” block, because that will help the initiative track formal adoption rates.

PHP file_get_contents() HTTPS/SSL error on Mac OS

Problem on Mac OS Sierra (10.12.x): Using file_get_contents() for an https resource fails with error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed.

This solution appears to be legit, although I am not primarily a security guy, so real security folks should chime in if they seem something untoward:

% sudo mkdir -p /usr/local/libressl/etc/ssl/certs
% sudo curl -o /usr/local/libressl/etc/ssl/cert.pem https://curl.haxx.se/ca/cacert.pem

This creates the default_cert_file directories and imports the latest certificate store from curl.haxx.se.

Source: PHP on macOS Sierra can’t access SSL data – andrewyager.com

Independent Packages and Subtree Splits

You’ll sometimes see a PHP package hosted in a Github repository with the heading or subtitle “[READ ONLY] Subtree Split”. This indicates that the package is actually copied from another codebase (usually a framework) and is not intended to be worked on separately from that other codebase.

As such, a “subtree split” is not necessarily a sign of an independent package. Using a subtree split to publish a package says to me that the authors’ concentration is on the framework-as-a-whole, not on the package in-and-of- itself.

For example, Symfony does subtree splits for its components, as does Laravel for its Illuminate components. Those packages from the frameworks are not developed independently; they only move forward as part of the whole framework.

In these cases, you often end up with composer.json in the framework origin directories, which is not something I generally expect. Further, the framework subdirectories may have their own src/tests/docs/etc. directories. They are there so that the subtree split can have them available at their own top level, but in the origin framework, it is again something I find unexpected.

I say: if you’re going to advertise independent packages, actually write them independently. Let them be their own thing. Aura has done it that way since its beginning, and Zend Framework converted to that approach in version 3. Then you can compose the truly independent packages into a framework, instead of subtree-splitting your framework into pseudo-independent packages that are still bound to the origin framework development and release process.

PHP ssh2.sftp opendir/readdir fix

This bug https://bugs.php.net/bug.php?id=73597 related to the PECL ssh2 extension bit us yesterday, so this post is a public service announcement that will (hopefully) save you from writing your own workaround like I almost did.

Problem: PHP 5.6.28 (and apparently 7.0.13) introduced a security fix to URL parsing, that caused the string interpolation of the $sftp resource handle to no-longer be recognized as a valid URL. In turn, that causes opendir(), readdir(), etc. to fail when you use an $sftp resource in the path string, after an upgrade to one of those PHP versions.

Solution: Instead of using "ssh2.sftp://$sftp" as a stream path, convert $sftp to an integer like so: "ssh2.sftp://" . intval($sftp) . "/". Then it will work just fine.

Thanks to the people who fixed the URL parsing security flaw, thanks to the people who wrote the PECL ssh2 extension, and thanks to the people who provided the fix.

Conserving On The Wrong Resource

(This is a blog post I’ve had in a “write-me!” folder for years; I am now publishing it in its unfinished form to get it off my mind.)

Programmers are acutely aware of their limited resources: CPU, memory, storage, requests-per-second, screen space, line length, and so on. If programmers are aware of a resource constraint, they conserve on it; sometimes they call this “optimizing.”

Are you aware of a loop? Then you try to “optimize” it, or conserve on it, e.g. by not using a count() in the loop declaration.

Are you aware of the number of lines? You might try to conserve on the number of lines, perhaps by removing blank lines (especially in docblocks).

Are you aware of aware of the size of an array, or the number of objects in the system, and so on? You try to conserve on those, to reduce their number.

But this is sometimes a trap, because you conserve or optimize on a low value
resource you are aware of, at the cost of not conserving on a high-value resource you are not aware of. For example, optimizing a for loop by not using count() in its declaration will have little impact if the loop is executing queries against an SQL database.

Lesson: conserve on high-value resources that you may not be aware of, not low-value ones that are obvious but of little consequence.