PHP-PDS: Interview on Voices of the ElePHPant

My good friend Cal Evans interviewed me about the PHP-PDS initiative and its first offering, pds/skeleton last week; here it is for you to enjoy.

(Via https://voicesoftheelephpant.com/2016/12/20/interview-paul-m-jones/.)

While we’re here, you might want to check out his virtual user group Nomad PHP, and the series of one-day virtual training conferences at DayCamp4Developers, because you’ll probably learn something useful there.

PECL Request Extension: Beta 1 Released!

I am happy to announce that the PECL extension for server-side request and response objects has reached beta status! (Documentation here.)

This release adds four new read-only properties to the ServerRequest object:

  • $forwarded is an array representation of $_SERVER['HTTP_FORWARDED'] (cf. the Forwarded HTTP Extension),

  • $forwardedFor is an array computed from treating $_SERVER['HTTP_X_FORWARDED_FOR'] as comma-separated values.

  • $forwardedHost is the $_SERVER['HTTP_X_FORWARDED_HOST'] value, and

  • $forwardedProto is $_SERVER['HTTP_X_FORWARDED_PROTO'] value.

If you use trusted proxies a lot in your work, you can use those values to compute the “real” client IP address in your application.

This completes the intended initial functionality of the extension. You should install it and try it out, because it might be make your work easier.

Package Development Standards: “pds/skeleton” Now Open For Review!

The new Package Development Standards initiative is proud to present its first publication, pds/skeleton (and the related research) for public review. If you are a package author, you are invited to post your comments and criticisms of the publication as issues on the relevant Github repository.

The pds/skeleton publication describes a set of standard top-level PHP package directories and files. If you are an author of more than three packages on Packagist, chances are you already follow the standard! That’s because PDS initiative researches the PHP package ecosystem to recognize commonly adopted development practices. (See this list of over than 47,000 packages; if your package is there, it has been tentatively noted as already compliant.)

The public review period will last at least 2 weeks, perhaps longer. If there are no substantial errors revealed during that time, the standard will advance to “beta” status for adoption, and then to “stable” if no further errors are discovered in use.

Meanwhile, if you review the publication and determine that your package follows it, you should add pds/skeleton to your Composer “require-dev” block, because that will help the initiative track formal adoption rates.

The Iron Law of (Open Source) Bureaucracy

Something to watch out for in open-source projects: Pournelle’s “Iron Law of Bureaucracy.” In the organization around your project, you will find two kinds of people:

  1. Those who concentrate on the goals of the organization. These are generally coders, devops, and infrastructure folks.
  2. Those who concentrate on the organization itself. These are generally managers of people within the organization, and controllers of how & when the organization communicates within itself and outside itself.

Pournelle’s Iron Law asserts that group 2 will always end up in control of the organization.

Police your membership accordingly.

PHP file_get_contents() HTTPS/SSL error on Mac OS

Problem on Mac OS Sierra (10.12.x): Using file_get_contents() for an https resource fails with error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed.

This solution appears to be legit, although I am not primarily a security guy, so real security folks should chime in if they seem something untoward:

% sudo mkdir -p /usr/local/libressl/etc/ssl/certs
% sudo curl -o /usr/local/libressl/etc/ssl/cert.pem https://curl.haxx.se/ca/cacert.pem

This creates the default_cert_file directories and imports the latest certificate store from curl.haxx.se.

Source: PHP on macOS Sierra can’t access SSL data – andrewyager.com

Independent Packages and Subtree Splits

You’ll sometimes see a PHP package hosted in a Github repository with the heading or subtitle “[READ ONLY] Subtree Split”. This indicates that the package is actually copied from another codebase (usually a framework) and is not intended to be worked on separately from that other codebase.

As such, a “subtree split” is not necessarily a sign of an independent package. Using a subtree split to publish a package says to me that the authors’ concentration is on the framework-as-a-whole, not on the package in-and-of- itself.

For example, Symfony does subtree splits for its components, as does Laravel for its Illuminate components. Those packages from the frameworks are not developed independently; they only move forward as part of the whole framework.

In these cases, you often end up with composer.json in the framework origin directories, which is not something I generally expect. Further, the framework subdirectories may have their own src/tests/docs/etc. directories. They are there so that the subtree split can have them available at their own top level, but in the origin framework, it is again something I find unexpected.

I say: if you’re going to advertise independent packages, actually write them independently. Let them be their own thing. Aura has done it that way since its beginning, and Zend Framework converted to that approach in version 3. Then you can compose the truly independent packages into a framework, instead of subtree-splitting your framework into pseudo-independent packages that are still bound to the origin framework development and release process.

PHP ssh2.sftp opendir/readdir fix

This bug https://bugs.php.net/bug.php?id=73597 related to the PECL ssh2 extension bit us yesterday, so this post is a public service announcement that will (hopefully) save you from writing your own workaround like I almost did.

Problem: PHP 5.6.28 (and apparently 7.0.13) introduced a security fix to URL parsing, that caused the string interpolation of the $sftp resource handle to no-longer be recognized as a valid URL. In turn, that causes opendir(), readdir(), etc. to fail when you use an $sftp resource in the path string, after an upgrade to one of those PHP versions.

Solution: Instead of using "ssh2.sftp://$sftp" as a stream path, convert $sftp to an integer like so: "ssh2.sftp://" . intval($sftp) . "/". Then it will work just fine.

Thanks to the people who fixed the URL parsing security flaw, thanks to the people who wrote the PECL ssh2 extension, and thanks to the people who provided the fix.

Conserving On The Wrong Resource

(This is a blog post I’ve had in a “write-me!” folder for years; I am now publishing it in its unfinished form to get it off my mind.)

Programmers are acutely aware of their limited resources: CPU, memory, storage, requests-per-second, screen space, line length, and so on. If programmers are aware of a resource constraint, they conserve on it; sometimes they call this “optimizing.”

Are you aware of a loop? Then you try to “optimize” it, or conserve on it, e.g. by not using a count() in the loop declaration.

Are you aware of the number of lines? You might try to conserve on the number of lines, perhaps by removing blank lines (especially in docblocks).

Are you aware of aware of the size of an array, or the number of objects in the system, and so on? You try to conserve on those, to reduce their number.

But this is sometimes a trap, because you conserve or optimize on a low value
resource you are aware of, at the cost of not conserving on a high-value resource you are not aware of. For example, optimizing a for loop by not using count() in its declaration will have little impact if the loop is executing queries against an SQL database.

Lesson: conserve on high-value resources that you may not be aware of, not low-value ones that are obvious but of little consequence.

The PHP 7 “Request” Extension

tl;dr: The new request extension provides server-side request and response objects for PHP 7. Use it as a replacement object for request superglobals and response functions. An equivalent PHP 5 version is available in userland as pmjones/request.


You’re tired of dealing with the $_GET, $_POST, etc. superglobals in your PHP 7 application. You wish $_FILES was easer to deal with. You’d prefer to wrap them all in an object to pass around to your class methods, so they’d be easier to test. And as long as they’re all in an object, it might be nice to have parsed representations of the various incoming headers. Preferably, you’d like for it to be read-only, so that the various libraries you use can’t modify superglobal state out from under you.

Likewise, seeing the “Cannot modify header information – headers already sent” warning is getting on your nerves. You’d like to get away from using header(), setcookie(), and the rest. You know they’re hard to inspect, and they’re hard to test. What would be great is to have all that response work wrapped in another object that you can pass around, and inspect or modify it before sending the response back to the client.

You could maybe adopt a framework, but why do that for your custom project? Just a pair of server-side request and response objects would make your life so much easer. Why can’t there be set of internal PHP classes for that?

Well, now there is. You can install the request extension from John Boehr and myself to get ServerRequest and ServerReponse objects as if PHP itself provided them.

ServerRequest

After you install the extension, you can issue $request = new ServerRequest(), and then:

Instead of ...                          ... use:
--------------------------------------- ---------------------------------------
$_COOKIE                                $request->cookie
$_ENV                                   $request->env
$_GET                                   $request->get
$_FILES                                 $request->files
$_POST                                  $request->post
$_SERVER                                $request->server
$_SERVER['REQUEST_METHOD']              $request->method
$_SERVER['HTTP_HEADER_NAME']            $request->headers['header-name']
file_get_contents('php://input')        $request->content
$_SERVER['HTTP_CONTENT_LENGTH']         $request->contentLength
$_SERVER['HTTP_CONTENT_MD5']            $request->contentMd5
$_SERVER['PHP_AUTH_PW']                 $request->authPw
$_SERVER['PHP_AUTH_TYPE']               $request->authType
$_SERVER['PHP_AUTH_USER']               $request->authUser

Likewise:

Instead of parsing ...                  ... use:
--------------------------------------- ---------------------------------------
isset($_SERVER['key'])                  $request->server['key'] ?? 'default'
  ? $_SERVER['key']                     (good for all superglobals)
  : 'default';
$_FILES to look more like $_POST        $request->uploads
$_SERVER['HTTP_CONTENT_TYPE']           $request->contentType and
                                        $request->contentCharset
$_SERVER['HTTP_ACCEPT']                 $request->accept
$_SERVER['HTTP_ACCEPT_CHARSET']         $request->acceptCharset
$_SERVER['HTTP_ACCEPT_ENCODING']        $request->acceptEncoding
$_SERVER['HTTP_ACCEPT_LANGUAGE']        $request->acceptLanguage
$_SERVER['HTTP_X_HTTP_METHOD_OVERRIDE'] $request->method
$_SERVER['PHP_AUTH_DIGEST']             $request->authDigest
$_SERVER['HTTP_X_REQUESTED_WITH']       $request->xhr
  == 'XmlHttpRequest'

Those properties are all read-only, so there’s no chance of them being changed without you knowing. (There are some true immutables for application-related values as well; see the documentation.)

ServerResponse

For responses, you can issue $response = new ServerResponse(), and then:

Instead of ...                          ... use:
--------------------------------------- ---------------------------------------
header('Foo: bar', true);               $response->setHeader('Foo', 'bar');
header('Foo: bar', false);              $response->addHeader('Foo', 'bar');
setcookie('foo', 'bar');                $response->setCookie('foo', 'bar');
setrawcookie('foo', 'bar');             $response->setRawCookie('foo', 'bar');
echo $content;                          $response->setContent($content);

You can inspect the $response object, and then call $response->send() to
send it to the client.

Working with JSON?

// instead of ...
header('Content-Type: application/json')
echo json_encode($value);

// .. use:
$response->setContentJson($value);

Sending a $file for download?

// instead of ...
header('Content-Type: application/octet-stream');
header('Content-Transfer-Encoding: binary');
header(
    'Content-Disposition: attachment;filename="'
    . rawurlencode(basename($file)) . '"'
);
$fh = fopen($file, 'rb+');
fpasthru($fh);

// use:
$fh = fopen($file, 'rb+');
$response->setContentDownload($fh, basename($file));

Building a complex header? Pass an array instead of a string:

$response->setHeader('Cache-Control', [
    'public',
    'max-age' => '123',
    's-maxage' => '456',
    'no-cache',
]); // Cache-Control: public, max-age=123, s-maxage=456, no-cache

$response->setHeader('X-Whatever', [
    'foo',
    'bar' => [
        'baz' => 'dib',
        'zim',
        'gir' => 'irk',
    ],
    'qux' => 'quux',
]); // X-Whatever: foo, bar;baz=dib;zim;gir=irk, qux=quux

Find Out More

You can read the rest of the documentation at https://gitlab.com/pmjones/ext-request to discover more convenient functionality. And if you’re stuck on PHP 5.x for now, the extension has a userland version installable via Composer as pmjones/request.

Try out the request extension today, because a pair of server-side request and response objects will make your life a lot easier.

Avoiding Quasi-Immutable Objects in PHP

tl;dr: Immutability in PHP is most practical when the object properties are scalars or nulls. Using streams, objects, or arrays as properties makes it very difficult, sometimes impossible, to preserve immutablity.


One of the tactics in Domain Driven Design is to use Value Objects. A Value Object has no identifier attached to it; only the combination of the values of its properties gives it any identification. If you change any of the properties in any way, the modification must return an entirely new instance of the Value Object.

This kind of behavior means the Value Object is “immutable.” That is, the particular instance is not allowed to change, though you can get back a new instance with modified values. The code for an immutable object looks something like this:

<?php
class ImmutableFoo
{
    protected $bar;

    public function __construct($bar)
    {
        $this->bar = $bar;
    }

    public function getBar()
    {
        return $this->bar;
    }

    public function withBar($newBar)
    {
        $clone = clone $this;
        $clone->bar = $newBar;
        return $clone;
    }
}
?>

(Note how $bar is accessible only through a method, not as a public property.)

When you create an ImmutableFoo instance, you cannot change the value of $bar after instantiation. Instead, you can only get back a new instance with the new value of $bar by calling withBar():

<?php
$foo = new ImmutableFoo('a');
$newFoo = $foo->withBar('b');

echo $foo->getBar(); // 'a'
echo $newFoo->getBar(); // 'b'
var_dump($foo === $newFoo); // (bool) false
?>

With this approach, you are guaranteed that one place in the code cannot change the $foo object at a distance from any other place in the code. Anything that ever gets that instance of $foo knows that its properties will always be the same no matter what.

The immutability approach can be powerful in Domain Driven Design and elsewhere. It works very easily in PHP with scalar values and nulls. That’s because PHP returns those by copy, not by reference.

However, enforcing immutability in PHP is difficult when the immutable object properties are non-scalar (i.e., when they are streams, objects, or arrays). With non-scalars, your object might seem immutable at first, but mutablity reveals itself later. These objects will be “quasi-“, not truly, immutable.

Streams as Immutable Properties

If a stream or similar resource has been opened in a writable (or appendable) mode, and is used as an immutable property, it should be obvious that object immutability is not preserved. For example:

<?php
file_put_contents('/tmp/bar.txt', 'baz');

$foo = new ImmutableFoo(fopen('/tmp/bar.txt', 'w+'));
$bar = $foo->getBar();
fpassthru($bar); // 'baz'

rewind($bar);
fwrite($bar, 'dib');
rewind($bar);

fpassthru($foo->getBar()); // 'dib'
?>

As you can see, the effective property value has changed, meaning immutability has been compromised.

One way around this might be to make sure that immutable objects themselves check that stream resources are always-and-only in read-only mode. However, even that is not a certain solution, because the resource pointer might be moved by reading operations in different parts of the application code. In turn, that means reading from the stream may yield different results at different times, making the value appear mutable.

As such, it appears that only “read-only” streams can be used as immutable properties, and then only if the immutable object restores the stream, its pointers, and all of its meta-data to their initial state every time the stream is accessed.

Objects as Immutable Properties

Because PHP returns objects as references, rather than as copies, using an object as a property value compromises the immutability of the parent object. For example:

$foo = new ImmutableFoo((object) ['baz' => 'dib']);
$bar = $foo->getBar();
echo $bar->baz; // 'dib'

$bar->baz = 'zim';
echo $foo->getBar()->baz; // 'zim'

As you can see, the value of $bar has changed in the $foo instance. Any other code using $foo will see those changes as well. This means immutability has not been preserved.

One way around this is to make sure that all objects used as immutable properties are themselves immutable.

Another way around this is to make sure that getter methods clone any object properties they return. However, it will have to be a recursively deep clone, covering all of the cloned object’s properties (and all of their properties, etc.). That’s to make sure that all object properties down the line are also cloned; otherwise, immutability is again compromised at some point.

Arrays as Immutable Properties

Unlike objects, PHP returns arrays as copies by default. However, if an immutable object property is an array, mutable objects in that array compromise the parent object’s immutability. For example:

$foo = new ImmutableFoo([
    0 => (object) ['baz' => 'dib'],
]);

$bar = $foo->getBar();
echo $bar[0]->baz;

$bar[0]->baz = 'zim';
echo $foo->getBar()[0]->baz; // 'zim'

Because the array holds an object, and because PHP returns objects by reference, the contents of the array have now changed. That means $foo has effectively changed as well. Again, immutability has not been preserved.

Likewise, if the array holds a reference to a stream resource, we see the problems described about streams above.

The only way around this is for the immutable object to recursively scan through array properties to make sure that they contain only immutable values. This is probably not practical in most situations, which means that arrays are probably not suitable as immutable values.

Settable Undefined Public Properties

Finally, PHP allows you to set values on undefined properties, as if they were public. This means it is possible to add mutable properties to an immutable object:

$foo = new ImmutableFoo('bar');

// there is no $zim property, so PHP
// creates it as if it were public

$foo->zim = 'gir';
echo $foo->zim; // 'gir'

$foo->zim = 'irk';
echo $foo->zim; // 'irk'

Immutability of the object is once again compromised. The only way around this is to impelement __set() to prevent setting of undefined properties.

Further, it might be wise to implement __unset() to warn that properties cannot be unset.

Conclusion

If you want to build a truly immutable object in PHP, it appears the best approach is the following:

  • Default to using only scalars and nulls as properties.
  • Avoid streams as properties; if a property must be a stream, make sure that it is read-only, and that its state is restored each time it is used.
  • Avoid objects as properties; if a property must be an object, make sure that object is itself immutable.
  • Especially avoid arrays as properties; use only with extreme caution and care.
  • Implement __set() to disallow setting of undefined properties.
  • Possibly implement __unset() to warn that the object is immutable.

Overall, it seems like immutability is easiest with only scalars and nulls. Anything else, and you have a lot more opportunity for error.

Remember, though, there’s nothing wrong with fully- or partially-mutable objects, as long as they are advertised as such. What you want to avoid are quasi-immutable objects: ones that advertise, but do not deliver, true immutability.

(For some further reading, check out At What Point Do Immutable Classes Become A Burden?)

Update: