Memory Leaks With Objects in PHP 5


One of the nice things about using a scripting language is that it automates garbage collection for you. You don't have to worry about releasing memory when you're done with your variables. As each variable passes out of scope, PHP frees that memory for you. If you want to, you can free the memory yourself using unset(), but usually you don't have to.

But there is at least one circumstance in which PHP will not free memory for you when you call unset(). Cf. http://bugs.php.net/bug.php?id=33595.

Problem

If you have two objects in circular reference, such as in a parent-child relationship, calling unset() on the parent object will not free the memory used by either object, because the child still holds a reference to the parent. (Nor is that memory reclaimed by garbage collection later in the script; it stays allocated until the script ends.)

Confusing? Here's a script you can run to see it in action:

<?php
class Foo {
    function __construct()
    {
        $this->bar = new Bar($this);
    }
}

class Bar {
    function __construct($foo = null)
    {
        $this->foo = $foo;
    }
}

while (true) {
    $foo = new Foo();
    unset($foo);
    echo number_format(memory_get_usage()) . "\n";
}
?>

Run that and watch your memory usage climb and climb, until memory runs out.

...
33,551,616
33,551,976
33,552,336
33,552,696
PHP Fatal error:  Allowed memory size of 33554432 bytes exhausted
(tried to allocate 16 bytes) in memleak.php on line 17

For most PHP developers this behavior is not likely to be a problem. However, if you use a lot of objects in parent-child relationships over a long-running script, you can run out of memory pretty quickly, especially if those objects are relatively large. I discovered this myself while testing some ORM-related pieces for Solar and it took me a couple days to figure it out -- hence this blog post. ;-)

Userland Solution

The bugs.php.net link above presents a solution, albeit inelegant and tedious. The "fix" is to call a destructor method before unsetting the object. The destructor method should clear out any internal parent object references, which will free the memory that would otherwise leak.

The "fixed" script looks like this:

<?php
class Foo {
    function __construct()
    {
        $this->bar = new Bar($this);
    }
    function __destruct()
    {
        unset($this->bar);
    }
}

class Bar {
    function __construct($foo = null)
    {
        $this->foo = $foo;
    }
}

while (true) {
    $foo = new Foo();
    $foo->__destruct();
    unset($foo);
    echo number_format(memory_get_usage()) . "\n";
}
?>

Note the new Foo::__destruct() method and the call to $foo->__destruct() before unsetting. Now the script will run forever, showing you the amount of memory it uses (which never gets any bigger, thank goodness).
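If you would rather not watch an endless loop, here is a rough, bounded sketch of the same comparison. It is not from the bug report: the release() method, the measureGrowth() helper, and the iteration count are names and values I made up for illustration. The guarded gc_disable() call is an assumption that you may be running a newer PHP with a cycle collector, which would otherwise hide the leak.

```php
<?php
class Foo
{
    public $bar;

    public function __construct()
    {
        $this->bar = new Bar($this);
    }

    // Hypothetical explicit cleanup: drop the child so the cycle is broken.
    public function release()
    {
        $this->bar = null;
    }
}

class Bar
{
    public $foo;

    public function __construct($foo = null)
    {
        $this->foo = $foo;
    }
}

// Hypothetical helper: returns how many bytes memory usage grew over the loop.
function measureGrowth($iterations, $release)
{
    $start = memory_get_usage();
    for ($i = 0; $i < $iterations; $i++) {
        $foo = new Foo();
        if ($release) {
            $foo->release();
        }
        unset($foo);
    }
    return memory_get_usage() - $start;
}

// Assumption: on runtimes that have a cycle collector, switch it off so the
// behavior described in this post is reproduced.
if (function_exists('gc_disable')) {
    gc_disable();
}

$leaky = measureGrowth(10000, false);
$clean = measureGrowth(10000, true);

echo "without cleanup: " . number_format($leaky) . " bytes\n";
echo "with cleanup:    " . number_format($clean) . " bytes\n";
```

The first figure should be noticeably larger than the second; the exact numbers depend on your PHP version and build.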

PHP Internals Solution?

Why does the memory leak occur? I am not wise in the ways of PHP internals, but it comes down to reference counts. PHP frees an object only when its refcount drops to zero. Here the Foo object has two references: the $foo variable, and the $foo property inside the Bar object. Calling unset($foo) removes the first, but the second remains, so the refcount never reaches zero and PHP thinks the object is still needed. The Bar object, in turn, is kept alive by the $bar property of Foo. I am sure to be displaying my ignorance here, but the general idea is the same: a refcount never reaches zero, so some memory never gets released.
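One way to see the refcount behavior without digging into internals is to watch when a destructor fires. This is only a sketch: Tracker, ParentNode, and ChildNode are hypothetical names invented for the example, and the guarded gc_disable() call assumes you might be on a newer PHP that has a cycle collector.

```php
<?php
// Hypothetical counter so we can observe destructor calls.
class Tracker
{
    public static $destroyed = 0;
}

class ParentNode
{
    public $child;

    public function __construct()
    {
        $this->child = new ChildNode($this);
    }

    public function __destruct()
    {
        Tracker::$destroyed++;
    }
}

class ChildNode
{
    public $parent;

    public function __construct($parent)
    {
        $this->parent = $parent;
    }
}

// Assumption: turn off the cycle collector where one exists.
if (function_exists('gc_disable')) {
    gc_disable();
}

// Case 1: cycle intact. The child still references the parent, so the
// parent is not destroyed when the variable is unset.
$p = new ParentNode();
unset($p);
$afterCycle = Tracker::$destroyed;

// Case 2: break the cycle first. Now unset() removes the last reference
// and the destructor runs immediately.
$p = new ParentNode();
$p->child = null;
unset($p);
$afterBreak = Tracker::$destroyed;

echo $afterCycle . "\n"; // 0
echo $afterBreak . "\n"; // 1
```

The object left over from the first case is only cleaned up when the script shuts down.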

I get from the bugs.php.net link above that modifying the garbage-collection process to check for this kind of issue would be a performance-killer, and what little I know about refcounts makes me think this is true.

Instead of changing the garbage-collection process, would it make sense to have unset() do some extra processing on the variable to look for internal objects and unset them too? (Or perhaps to call __destruct() on objects being unset?) Maybe a PHP internals person can comment here or elsewhere on the practicality and/or wisdom of such changes.
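Until something like that exists in the engine, the idea can be approximated in userland. The following is a sketch only: unset_with_destruct() is a hypothetical helper I made up, not a PHP function, and the static $freed counter exists purely to show that the child really is released.

```php
<?php
// Hypothetical helper: call the object's destructor (if it has one), then
// drop the caller's reference. The argument is taken by reference so the
// caller's variable is cleared too.
function unset_with_destruct(&$object)
{
    if (is_object($object) && method_exists($object, '__destruct')) {
        $object->__destruct();
    }
    $object = null;
}

class Foo
{
    public $bar;
    public static $freed = 0; // illustration only

    public function __construct()
    {
        $this->bar = new Bar($this);
    }

    public function __destruct()
    {
        // Break the cycle by letting go of the child.
        $this->bar = null;
    }
}

class Bar
{
    public $foo;

    public function __construct($foo = null)
    {
        $this->foo = $foo;
    }

    public function __destruct()
    {
        Foo::$freed++;
    }
}

$foo = new Foo();
unset_with_destruct($foo);

echo Foo::$freed . "\n"; // 1
```

Note that PHP will still call Foo::__destruct() a second time when the object is actually destroyed, so a destructor used this way has to be safe to run twice.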

UPDATE: Martin Fjordvald notes in the comments that a patch from David Wang for garbage collection does exist and is under consideration. (Did I say "patch"? More like "whole sheet of cloth" -- it's huge. See the CVS checkout instructions at the end of that email.) Problem is that it didn't garner many votes for inclusion in 5.3. A nice compromise solution might be to have the unset() function call the __destruct() method of objects sent to it; that would seem intuitively appropriate here.
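For readers on PHP 5.3 or later: those versions do ship a cycle collector, exposed through gc_enable() and gc_collect_cycles(). A minimal sketch of using it on the classes from this post follows; the loop count of 1000 is an arbitrary choice for illustration, and this will not run on PHP 5.2 or earlier, where those functions do not exist.

```php
<?php
class Foo
{
    public $bar;

    public function __construct()
    {
        $this->bar = new Bar($this);
    }
}

class Bar
{
    public $foo;

    public function __construct($foo = null)
    {
        $this->foo = $foo;
    }
}

gc_enable();

// Create and abandon a batch of Foo/Bar cycles without any manual cleanup.
for ($i = 0; $i < 1000; $i++) {
    $foo = new Foo();
    unset($foo);
}

$before = memory_get_usage();
$collected = gc_collect_cycles(); // ask the collector to run now
$after = memory_get_usage();

echo "collected: " . $collected . "\n";
echo "before:    " . number_format($before) . " bytes\n";
echo "after:     " . number_format($after) . " bytes\n";
```

Memory usage should drop after the explicit collection, with no destructor workaround in the classes themselves.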
