Efficient use of mysqli_result::$num_rows

I frequently see this pattern in legacy applications using mysqli:

$result = $mysqli->query("
    SELECT *
    FROM table_name
    WHERE conditions = 'whatever'
");
if ($result && $result->num_rows > 0) {
    return;
}

The developer’s intent here is to see if there are any rows at all in the database that match a certain condition. He does so by issuing a query, then asking the result object how many rows it has. The developer doesn’t actually want any data from the result, and doesn’t care about the actual row-count itself; this is just a check to see if at least one row exists in the database.

This is a poor conservation of resources. The database does the work needed to select all the columns for all the rows matching the conditions, allocates memory for them, and returns them. But the developer discards all that immediately.

To accomplish the same end, it is less resource-intensive and just as effective to query for a single column and limit the results to a single row:

$result = $mysqli->query("
    SELECT col_name
    FROM table_name
    WHERE conditions = 'whatever'
    LIMIT 1
");
if ($result && $result->num_rows > 0) {
    return;
}

Now the database only does the work needed for a single column and a single row.

(As a side note, I find it interesting that I have not seen this pattern at all in projects using PDO. I’m not sure why this would be. Perhaps there is some originating example code for mysqli somewhere that has gained a life of its own through copying and reuse.)

UPDATE: Perhaps a better way to conserve resources, courtesy of Reddit user marcjschmidt, is to use a COUNT() in the query, then fetch the count of rows, something more like this …

$result = $mysqli->query("
    SELECT COUNT(*)
    FROM table_name
    WHERE conditions = 'whatever'
");
if ($result && $result->fetch_array()[0] > 0) {
    return;
}

… thereby avoiding the use of mysqli_result::$num_rows completely.

UPDATE 2: Lots of commentary in the above Reddit thread. To summarize this blog post: selecting all columns of all rows, then examining $num_rows, and then discarding the result set, is a terrible way of determining if there are any matching rows at all in the database. It is trivially easy to something much better, whether by using a LIMIT 1 and $num_rows, or some form of COUNT(), or perhaps some other approach.

PHP-PDS: Interview on Voices of the ElePHPant

My good friend Cal Evans interviewed me about the PHP-PDS initiative and its first offering, pds/skeleton last week; here it is for you to enjoy.

(Via https://voicesoftheelephpant.com/2016/12/20/interview-paul-m-jones/.)

While we’re here, you might want to check out his virtual user group Nomad PHP, and the series of one-day virtual training conferences at DayCamp4Developers, because you’ll probably learn something useful there.

PECL Request Extension: Beta 1 Released!

I am happy to announce that the PECL extension for server-side request and response objects has reached beta status! (Documentation here.)

This release adds four new read-only properties to the ServerRequest object:

  • $forwarded is an array representation of $_SERVER['HTTP_FORWARDED'] (cf. the Forwarded HTTP Extension),

  • $forwardedFor is an array computed from treating $_SERVER['HTTP_X_FORWARDED_FOR'] as comma-separated values.

  • $forwardedHost is the $_SERVER['HTTP_X_FORWARDED_HOST'] value, and

  • $forwardedProto is $_SERVER['HTTP_X_FORWARDED_PROTO'] value.

If you use trusted proxies a lot in your work, you can use those values to compute the “real” client IP address in your application.

This completes the intended initial functionality of the extension. You should install it and try it out, because it might be make your work easier.

Package Development Standards: “pds/skeleton” Now Open For Review!

The new Package Development Standards initiative is proud to present its first publication, pds/skeleton (and the related research) for public review. If you are a package author, you are invited to post your comments and criticisms of the publication as issues on the relevant Github repository.

The pds/skeleton publication describes a set of standard top-level PHP package directories and files. If you are an author of more than three packages on Packagist, chances are you already follow the standard! That’s because PDS initiative researches the PHP package ecosystem to recognize commonly adopted development practices. (See this list of over than 47,000 packages; if your package is there, it has been tentatively noted as already compliant.)

The public review period will last at least 2 weeks, perhaps longer. If there are no substantial errors revealed during that time, the standard will advance to “beta” status for adoption, and then to “stable” if no further errors are discovered in use.

Meanwhile, if you review the publication and determine that your package follows it, you should add pds/skeleton to your Composer “require-dev” block, because that will help the initiative track formal adoption rates.

The Iron Law of (Open Source) Bureaucracy

Something to watch out for in open-source projects: Pournelle’s “Iron Law of Bureaucracy.” In the organization around your project, you will find two kinds of people:

  1. Those who concentrate on the goals of the organization. These are generally coders, devops, and infrastructure folks.
  2. Those who concentrate on the organization itself. These are generally managers of people within the organization, and controllers of how & when the organization communicates within itself and outside itself.

Pournelle’s Iron Law asserts that group 2 will always end up in control of the organization.

Police your membership accordingly.

PHP file_get_contents() HTTPS/SSL error on Mac OS

Problem on Mac OS Sierra (10.12.x): Using file_get_contents() for an https resource fails with error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed.

This solution appears to be legit, although I am not primarily a security guy, so real security folks should chime in if they seem something untoward:

% sudo mkdir -p /usr/local/libressl/etc/ssl/certs
% sudo curl -o /usr/local/libressl/etc/ssl/cert.pem https://curl.haxx.se/ca/cacert.pem

This creates the default_cert_file directories and imports the latest certificate store from curl.haxx.se.

Source: PHP on macOS Sierra can’t access SSL data – andrewyager.com

Independent Packages and Subtree Splits

You’ll sometimes see a PHP package hosted in a Github repository with the heading or subtitle “[READ ONLY] Subtree Split”. This indicates that the package is actually copied from another codebase (usually a framework) and is not intended to be worked on separately from that other codebase.

As such, a “subtree split” is not necessarily a sign of an independent package. Using a subtree split to publish a package says to me that the authors’ concentration is on the framework-as-a-whole, not on the package in-and-of- itself.

For example, Symfony does subtree splits for its components, as does Laravel for its Illuminate components. Those packages from the frameworks are not developed independently; they only move forward as part of the whole framework.

In these cases, you often end up with composer.json in the framework origin directories, which is not something I generally expect. Further, the framework subdirectories may have their own src/tests/docs/etc. directories. They are there so that the subtree split can have them available at their own top level, but in the origin framework, it is again something I find unexpected.

I say: if you’re going to advertise independent packages, actually write them independently. Let them be their own thing. Aura has done it that way since its beginning, and Zend Framework converted to that approach in version 3. Then you can compose the truly independent packages into a framework, instead of subtree-splitting your framework into pseudo-independent packages that are still bound to the origin framework development and release process.

PHP ssh2.sftp opendir/readdir fix

This bug https://bugs.php.net/bug.php?id=73597 related to the PECL ssh2 extension bit us yesterday, so this post is a public service announcement that will (hopefully) save you from writing your own workaround like I almost did.

Problem: PHP 5.6.28 (and apparently 7.0.13) introduced a security fix to URL parsing, that caused the string interpolation of the $sftp resource handle to no-longer be recognized as a valid URL. In turn, that causes opendir(), readdir(), etc. to fail when you use an $sftp resource in the path string, after an upgrade to one of those PHP versions.

Solution: Instead of using "ssh2.sftp://$sftp" as a stream path, convert $sftp to an integer like so: "ssh2.sftp://" . intval($sftp) . "/". Then it will work just fine.

Thanks to the people who fixed the URL parsing security flaw, thanks to the people who wrote the PECL ssh2 extension, and thanks to the people who provided the fix.

Conserving On The Wrong Resource

(This is a blog post I’ve had in a “write-me!” folder for years; I am now publishing it in its unfinished form to get it off my mind.)

Programmers are acutely aware of their limited resources: CPU, memory, storage, requests-per-second, screen space, line length, and so on. If programmers are aware of a resource constraint, they conserve on it; sometimes they call this “optimizing.”

Are you aware of a loop? Then you try to “optimize” it, or conserve on it, e.g. by not using a count() in the loop declaration.

Are you aware of the number of lines? You might try to conserve on the number of lines, perhaps by removing blank lines (especially in docblocks).

Are you aware of aware of the size of an array, or the number of objects in the system, and so on? You try to conserve on those, to reduce their number.

But this is sometimes a trap, because you conserve or optimize on a low value
resource you are aware of, at the cost of not conserving on a high-value resource you are not aware of. For example, optimizing a for loop by not using count() in its declaration will have little impact if the loop is executing queries against an SQL database.

Lesson: conserve on high-value resources that you may not be aware of, not low-value ones that are obvious but of little consequence.

The PHP 7 “Request” Extension

tl;dr: The new request extension provides server-side request and response objects for PHP 7. Use it as a replacement object for request superglobals and response functions. An equivalent PHP 5 version is available in userland as pmjones/request.


You’re tired of dealing with the $_GET, $_POST, etc. superglobals in your PHP 7 application. You wish $_FILES was easer to deal with. You’d prefer to wrap them all in an object to pass around to your class methods, so they’d be easier to test. And as long as they’re all in an object, it might be nice to have parsed representations of the various incoming headers. Preferably, you’d like for it to be read-only, so that the various libraries you use can’t modify superglobal state out from under you.

Likewise, seeing the “Cannot modify header information – headers already sent” warning is getting on your nerves. You’d like to get away from using header(), setcookie(), and the rest. You know they’re hard to inspect, and they’re hard to test. What would be great is to have all that response work wrapped in another object that you can pass around, and inspect or modify it before sending the response back to the client.

You could maybe adopt a framework, but why do that for your custom project? Just a pair of server-side request and response objects would make your life so much easer. Why can’t there be set of internal PHP classes for that?

Well, now there is. You can install the request extension from John Boehr and myself to get ServerRequest and ServerReponse objects as if PHP itself provided them.

ServerRequest

After you install the extension, you can issue $request = new ServerRequest(), and then:

Instead of ...                          ... use:
--------------------------------------- ---------------------------------------
$_COOKIE                                $request->cookie
$_ENV                                   $request->env
$_GET                                   $request->get
$_FILES                                 $request->files
$_POST                                  $request->post
$_SERVER                                $request->server
$_SERVER['REQUEST_METHOD']              $request->method
$_SERVER['HTTP_HEADER_NAME']            $request->headers['header-name']
file_get_contents('php://input')        $request->content
$_SERVER['HTTP_CONTENT_LENGTH']         $request->contentLength
$_SERVER['HTTP_CONTENT_MD5']            $request->contentMd5
$_SERVER['PHP_AUTH_PW']                 $request->authPw
$_SERVER['PHP_AUTH_TYPE']               $request->authType
$_SERVER['PHP_AUTH_USER']               $request->authUser

Likewise:

Instead of parsing ...                  ... use:
--------------------------------------- ---------------------------------------
isset($_SERVER['key'])                  $request->server['key'] ?? 'default'
  ? $_SERVER['key']                     (good for all superglobals)
  : 'default';
$_FILES to look more like $_POST        $request->uploads
$_SERVER['HTTP_CONTENT_TYPE']           $request->contentType and
                                        $request->contentCharset
$_SERVER['HTTP_ACCEPT']                 $request->accept
$_SERVER['HTTP_ACCEPT_CHARSET']         $request->acceptCharset
$_SERVER['HTTP_ACCEPT_ENCODING']        $request->acceptEncoding
$_SERVER['HTTP_ACCEPT_LANGUAGE']        $request->acceptLanguage
$_SERVER['HTTP_X_HTTP_METHOD_OVERRIDE'] $request->method
$_SERVER['PHP_AUTH_DIGEST']             $request->authDigest
$_SERVER['HTTP_X_REQUESTED_WITH']       $request->xhr
  == 'XmlHttpRequest'

Those properties are all read-only, so there’s no chance of them being changed without you knowing. (There are some true immutables for application-related values as well; see the documentation.)

ServerResponse

For responses, you can issue $response = new ServerResponse(), and then:

Instead of ...                          ... use:
--------------------------------------- ---------------------------------------
header('Foo: bar', true);               $response->setHeader('Foo', 'bar');
header('Foo: bar', false);              $response->addHeader('Foo', 'bar');
setcookie('foo', 'bar');                $response->setCookie('foo', 'bar');
setrawcookie('foo', 'bar');             $response->setRawCookie('foo', 'bar');
echo $content;                          $response->setContent($content);

You can inspect the $response object, and then call $response->send() to
send it to the client.

Working with JSON?

// instead of ...
header('Content-Type: application/json')
echo json_encode($value);

// .. use:
$response->setContentJson($value);

Sending a $file for download?

// instead of ...
header('Content-Type: application/octet-stream');
header('Content-Transfer-Encoding: binary');
header(
    'Content-Disposition: attachment;filename="'
    . rawurlencode(basename($file)) . '"'
);
$fh = fopen($file, 'rb+');
fpasthru($fh);

// use:
$fh = fopen($file, 'rb+');
$response->setContentDownload($fh, basename($file));

Building a complex header? Pass an array instead of a string:

$response->setHeader('Cache-Control', [
    'public',
    'max-age' => '123',
    's-maxage' => '456',
    'no-cache',
]); // Cache-Control: public, max-age=123, s-maxage=456, no-cache

$response->setHeader('X-Whatever', [
    'foo',
    'bar' => [
        'baz' => 'dib',
        'zim',
        'gir' => 'irk',
    ],
    'qux' => 'quux',
]); // X-Whatever: foo, bar;baz=dib;zim;gir=irk, qux=quux

Find Out More

You can read the rest of the documentation at https://gitlab.com/pmjones/ext-request to discover more convenient functionality. And if you’re stuck on PHP 5.x for now, the extension has a userland version installable via Composer as pmjones/request.

Try out the request extension today, because a pair of server-side request and response objects will make your life a lot easier.