Line Length, Volume, and Density

Update: This entry seems to be getting a lot of new attention; welcome! The lessons of line length, volume, and density, along with lots of other good design principles, are applied to the Solar Framework for PHP 5. Be sure to give it a look if you’re interested in well-designed PHP code.

When it comes to coding style, there are various ideas about how you should write the individual lines of code. The usual argument is about “how long should a line of code be?” There’s more to it than that, though. Developers should also take into account line volume (“number of lines”) and line density (“instructions per line”).

Line Length

The PEAR style guide says lines should be no longer than 75-85 characters. Some developers think this is because we need to support terminals where lines may not wrap properly, or because some developer screens may not be big enough to show more than that without having to scroll sideways, or because it’s tradition, and so on. These reasons may even be accurate in some sense. However, I see the 75-character rule as recognizing a cognitive limitation, not a requirement that can change with available technology.

How many words per line can a person scan, and still be able to grasp the content of the line in the context of the surrounding lines? Printing and publishing typographers figured out a long time ago that most people can read no more than 10 to 12 words per line before they have trouble differentiating lines from each other. (A “word” is counted as five characters on average.) Even allowing for a 25% to 50% increase, that brings us up to 15 words. Times 5 characters per word, that means 75 characters on a line.

So the style guide limitation on line length is not exactly arbitrary. It is about the developer’s ability to effectively scan and comprehend strings of text, not about the technical considerations of terminals and text-editors.
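
As a rough illustration of the 75-character guideline, here is a minimal sketch of a checker that reports over-long lines. The function name findLongLines() and the default limit are assumptions made for this example; it is not part of PEAR or any other tool, and it assumes PHP 7 or later and single-byte (ASCII) source text.

```php
<?php
// Hypothetical helper for illustration only; not part of PEAR.
// Returns an array mapping 1-based line numbers to line lengths
// for every line longer than $limit characters.
function findLongLines(string $source, int $limit = 75): array
{
    $long = [];

    // Split on any of the common line endings.
    $lines = preg_split('/\r\n|\r|\n/', $source);

    foreach ($lines as $index => $line) {
        // strlen() counts bytes, so this assumes single-byte text.
        $length = strlen($line);
        if ($length > $limit) {
            // Report 1-based line numbers, as editors do.
            $long[$index + 1] = $length;
        }
    }

    return $long;
}

// Usage: only the second line of this sample exceeds the limit.
$source = <<<'PHP'
$foo = Zim::getVal('foo');
list($foo, $bar, $baz) = array(Zim::getVal('foo'), Dib::getVal('bar'), Gir::getVal('baz', Gir::DOOM));
PHP;

print_r(findLongLines($source));
```

A check like this only measures length. It says nothing about whether a line is easy to comprehend, which is where volume and density come in.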

Line Volume and Density

Some developers believe you should put as much code as possible on a single line, to reduce line-count. They say this makes the code read more like a “sentence”. In doing so, these developers trade line “volume” for line “density” (or line “complexity”).

Increasing the density of a line tends to make it less readable. Lines of code are generally lists of statements, not natural-language prose. If you put a lot of instructions on a single line of code, that tends to make it harder for other developers to decipher the logical flow.

Examine the following:

list($foo, $bar, $baz) = array(Zim::getVal('foo'), Dib::getVal('bar'), Gir::getVal('baz', Gir::DOOM));

(Yes, I have actually seen code like this. Only the identifier names have been changed.)

Now compare that to the following equivalent code:

$foo = Zim::getVal('foo');
$bar = Dib::getVal('bar');
$baz = Gir::getVal('baz', Gir::DOOM);

When I showed this rewrite to the initial developer, his complaint was: “But it’s more lines!”.

Increasing line volume (“more lines”) and reducing line density does three things:

  1. It reduces line length to make the code more readable.

  2. Making it more readable makes the intent of the code more clear. The logical flow is easier to comprehend.

  3. In this particular case, it may be faster than the original one-liner, because it drops the list() and array() calls. True devotees of the Zend Engine will be able to say for certain if this translates into faster bytecode execution. (I am not a fan of speed for its own sake, but in this case it would be good gravy over the meat of the above two points.)

In reducing line density, you don’t have to make one line correlate with a single statement (although usually that’s a good idea). Here’s another way to rewrite the original example, this time as a single statement across multiple lines:

list($foo, $bar, $baz) = array(
    Zim::getVal('foo'),
    Dib::getVal('bar'),
    Gir::getVal('baz', Gir::DOOM)
);

I find this less readable than the initial rewrite, but the principle is the same: more lines, but shorter, to improve readability.
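
Line density is harder to measure than line length, but a crude proxy can be sketched. The example below counts opening parentheses per line using PHP's built-in tokenizer, on the assumption that each one roughly marks a call or language construct. The function name parenDensityByLine() and the choice of parentheses as the measure are assumptions made for illustration, not an established metric.

```php
<?php
// Hypothetical helper for illustration only. Counts "(" tokens on each
// line as a rough stand-in for "instructions per line".
// Returns an array mapping 1-based line numbers to counts.
function parenDensityByLine(string $code): array
{
    $density = [];
    $line = 1;

    foreach (token_get_all($code) as $token) {
        if (is_array($token)) {
            // Named tokens carry their starting line number. Advance
            // past any newlines inside the token (whitespace, comments)
            // so $line is where the next token begins.
            $line = $token[2] + substr_count($token[1], "\n");
            continue;
        }

        // Punctuation arrives as a bare one-character string.
        if ($token === '(') {
            $density[$line] = ($density[$line] ?? 0) + 1;
        }
    }

    return $density;
}

// Usage: the one-liner packs every parenthesis onto a single line.
$oneLiner = <<<'PHP'
<?php
list($foo, $bar, $baz) = array(Zim::getVal('foo'), Dib::getVal('bar'), Gir::getVal('baz', Gir::DOOM));
PHP;

print_r(parenDensityByLine($oneLiner));
```

The tokenizer only lexes the source, so the classes named in the sample do not need to exist. Running the same function over either rewrite should show the same total spread across more lines, which is the trade of density for volume described above.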

Balancing Considerations

If shorter lines are better, does that mean lines should be as short as technically possible?

$foo
=
Zim::getVal(
'foo'
);

$bar
=
Dib::getVal(
'bar'
);

$baz
=
Gir::getVal(
'baz'
,
Gir::DOOM
);

It looks like the answer is “no”. The line-volume vs. line-density argument is about readability and comprehension. The above example, while absurd, helps to show that overly-short lines are as difficult to read as over-long ones.

Developers with good style balance all the considerations of line length, volume, and density. That means they write lines of code no more than about 75 characters long, but not so short as to increase line volume without need. They also show attention to line density for reasons related to cognition and comprehension, not merely technical syntax.

18 Responses to “Line Length, Volume, and Density”

  1. Helgi Þormar Þorbjörnsson Says:

    I couldn’t agree more with this. There are a number of projects where I’ve come in at late stages (open source and corporate alike) and beefed up their coding standard, namely this part: line lengths and the flow of the code. Amazingly, everyone started to understand the code in a jiffy, where in the past it was considered a PITA to explore forgotten code.

    This rule + return errors early, i.e. not doing

    if (!isError()) {
        do all the fun stuff;
    } else {
        return error;
    }

    do:

    if (isError()) {
        return error;
    }

    do fun stuff;

    It’s just mind blowing how much more understandable the code becomes if you just follow the 75-85 char rule and return errors early rule

    Good post!

  2. Rob Young Says:

    Nice post, well said. Helgi, that’s very similar to what I was told early in my career. The smallest side of a conditional block goes first so that your eye doesn’t have to travel over a large amount of code to find the other side of the condition. So rather than

    if (condition) {
        lots and
        lots and
        lots of
        code
    } else {
        a little code
    }

    do

    if (condition) {
        a little code
    } else {
        lots and
        lots and
        lots of
        code
    }

  3. Nigel James Says:

    Great post Paul. I think you hit the right balance and for good reasons.

    Whitespace and more lines make for readable code so while you can do this:
    for($i = 0, $max = sizeof($myarray); $i < $max; ++$i) {}

    I find this more readable:
    $max = sizeof($myarray);
    for($i = 0; $i < $max; ++$i) {}

    Personally I dislike having to scroll way to the right to find the end of a line of code.
    Cheers,
    Nigel

  4. PHPDeveloper.org Says:

    Paul Jones’ Blog: Line Length, Volume, and Density…

    In a new blog post, Paul Jones looks at three aspects ……

  5. Brian Says:

    I’ve been reading Steve McConnell’s “Code Complete” and he really has a lot to say about this. The longer I’ve been working in the software world, the more I’m convinced that readability (and therefore maintainability, etc) are far more important than speed or “cleverness.” This is especially true if you’re having to work with multiple languages.

  6. blog.ekini.net Says:

    i feel guilty about writing “dense code”…. thanks for the reminder. although i can understand my code - comments, etc… others might say otherwise if they read my code.. again, thanks…

  7. Paul Jones’ Blog: Line Length, Volume, and Density | Cole Design Studios Says:

    [...] a new blog post, Paul Jones looks at three aspects of coding style - line length, volume and density - and how [...]

  8. Lunghezza, volume e densità: questione di stile : php5blog.it Says:

    [...] On Paul M. Jones’s blog I read an interesting point for reflection on the style of program…, in the strict sense of the code produced, which discusses the parameters and metrics useful for defining the style in question. To his ideas I add a few notes of my own drawn from personal experience, and I look forward to your comments as well. [...]

  9. Speed Vs Readability Says:

    Hi,

    This is a good post in terms of ‘Readability and Maintainability’. But it gives no idea of speed-related issues. In high-traffic applications, code refactoring is done to improve performance/speed when the CPU is hitting the roof (post production).

    Here is sample:

    if(isset($name)){
        echo 'Mr'.$name;
    }else{
        echo 'Name missing';
    }

    Five lines of code - More volume and less density
    ———–
    The same is written as:

    echo isset($name)?'Mr'.$name:'Name Missing';

    Only one line of code - More density and less volume.

    Which is good in terms of ‘speed’?
    Which is good in terms of ‘Readability and Maintenance’?

  10. PHP Weekly Reader - May 16th 2008 : phpaddiction Says:

    [...] Jones has a nice write up about Line Length, Volume, and Density and how it affects code readability and maintainability, along the same lines I found this article [...]

  11. nate Says:

    Well Jonesy, I guess we each have our own cognitive limitations. ;-)

  12. Pat Says:

    One good article “About Coding Styles” is at
    http://www.tamk.fi/~jaalto/course/coding-style

  13. Jyot Vakharia Says:

    Hi,
    I have been dedicated to PHP development for a while. As far as the coding standards are concerned, I am pretty happy that there exists some standard which regulates how the code appears. Nevertheless, I am a bit skeptical about the effect that following the standard has on the performance of the script. Adding spaces instead of tabs only makes the code longer. Though I might be wrong, when it comes to loading the same script a thousand times, it should affect the performance of the system. Correct me if I am wrong though.

    Thanks

    Jyot Vakharia

  14. pmjones Says:

    Hi Jyot –

    The addition of whitespace (and comments!) does **not** affect performance significantly. (Yes, that came as a surprise to me, too, when I first heard it, but it is true.)

    Even better, when you use a bytecode cache, the whitespace and comments are not retained in the bytecode, so they have zero performance impact at that point.

  15. Coding standard, coding style | Kristian Lunde Says:

    [...] http://paul-m-jones.com/?p=276 [...]

  16. Izkata Says:

    How could whitespace and comments affect code? It’s all stripped out before it gets compiled, and doesn’t exist in the program itself.

  17. Anthony Gentile Says:

    Re: Speed vs Readability:

    The ternary is actually more opcodes than the if else block. Ternaries are nice for that one liner, but make it much more difficult to understand what is going on…especially when they are nested (please dear lord no). Honestly…if you are doing super duper millisecond optimizations…you are probably not spending your time very well. Unless you are working on a heart monitor or some other critical application that requires some unreadable cleverness(comment it)…go with readability.

  18. Marcus Bointon Says:

    Another factor that contributes to the poor readability of long lines is that many editors have utterly useless soft line wrapping, giving you a non-choice of trashed indenting or horizontal scrolling (emphasised by wasteful single-window UIs). Eclipse and Netbeans fail completely, and even the almighty vim makes a mess of it (though there’s probably a patch…). BBEdit does it beautifully; lines can be as long as you like, and indenting is preserved dynamically. This is especially useful when you’re dealing with files that need long lines, such as CSV or SQL dumps, but you want to do without the pain of horizontal scrolling. Another fly in the ointment is the annoyingly persistent IE6, which doesn’t handle whitespace in HTML properly, necessitating long lines on occasion.

    I find it completely bizarre that such a basic feature of text editing is missing in so many editors. The tabs vs spaces thing falls into the same camp; 1 indent = 1 tab makes total sense, and using multiple spaces doesn’t, yet we’re stuck with it because editors are randomly crap and lowest common denominator prevails.

    This doesn’t necessarily have anything to do with deliberately restricting line lengths for the sake of readability - sometimes I find that splitting lines unexpectedly makes things harder to read - especially if peculiarly indented. There’s clearly a balance to be struck for readability (the point of this article), but it should be for the benefit of people, not their possibly inept tools.

    Beyond that, I also find it strange that more editors can’t enforce/apply coding standards automatically. It could save a lot of time wasted on formatting that could be spent coding.
