Paul M. Jones | The Miserable Mathematics of the Man-Month

We've all heard of Fred Brooks' law regarding the mythical man-month. (If you have not heard of it, stop reading this and go read The Mythical Man-Month.) The rule is this:

Adding manpower to a late software project makes it later.

He concludes his essay with these words:

The number of months of a project depends on its sequential constraints. The maximum number of men depends on the number of independent subtasks. From these two quantities one can derive schedules using fewer men and more months. (The only risk is product obsolescence.) One cannot, however, get workable schedules using more men and fewer months.

Once a development project has started, and appears to be taking longer than desired, adding more developers will make the project even later. Brooks explains that there are two reasons for this:

1. It takes at least one developer, often more, to brief, train, educate, and otherwise inculcate the new developers into the project. That developer is unproductive during this time, and so are the new developers.
2. The amount of communication between everyone on the project grows quadratically with team size (n developers share n(n-1)/2 possible links), but the amount of productivity only increases linearly.

But exactly how bad is the falloff in productivity? "Adding people will make it more late", but just how much later will it be? Is there a way to predict the schedule effects of adding developers?

Point 1 (the training of the new developers by one or more existing developers) is relatively easy to model. If you add one new developer, and it takes one existing developer a week to train him, then you just added at least a week to the schedule when neither of them is productive. The existing developer got no productive work done, and neither did the new one.

After that, the developers have to make up the lost time. To discover the effect of point 2 above (the communication costs involved once a new developer is in place), I plugged some numbers into a spreadsheet. The communication costs are quite dramatic.

Below is a table representing a one-developer project, along with the communication costs and resulting schedule compression when adding developers. I'll give some narrative about the table afterwards, and then close with some very strong caveats.

Do not interpret this as science; at best, it is an initial guide to expectations, subject to further revision as I explore the topic more deeply.

working devs	comm. links	dev. + comm	prod. rate	prod. output	time factor
1	0	1	1.00	1.00	1.00
2	1	3	0.67	1.33	0.75
3	3	6	0.50	1.50	0.67
4	6	10	0.40	1.60	0.63
5	10	15	0.33	1.67	0.60
6	15	21	0.29	1.71	0.58
7	21	28	0.25	1.75	0.57
8	28	36	0.22	1.78	0.56
9	36	45	0.20	1.80	0.56
10	45	55	0.18	1.82	0.55

In the first row, we have the baseline case. One developer doesn't have to talk with anyone else, so there are no communication costs. By default, his productivity rate is 100% of whatever he would normally do, so his output is also at 100%. If the one-man project is estimated to take 10 weeks, then it takes 10 weeks (a time factor of 100%).

Now, let's say we add another developer. The first thought is that the project should now take half as long (5 weeks). But! Although we now have two developers, they have to spend time talking to each other and coordinating their activity. That means we have 2 "units" of development happening, but an additional "unit" of communication. Some of the combined effort of the two developers is spent talking to each other. Instead of getting 100% more productive output (two developers instead of one), they are together only about 33% more productive. That means the remaining project time won't be cut to 50%; instead, it will be reduced to 75% at best. (And to boot, if we spent a week training the new developer, it means we need to add that week to the schedule as well.)

It only gets worse from there. 3 developers means 3 lines of communication (i.e., each one has to talk to the other two developers). That will cut the remaining project duration down to 67% of the previously scheduled time. Adding the third developer only gained us 8 percentage points of duration savings over having two developers. At this point we have tripled the cost of development, but only save 33% of remaining project time, not counting the time lost in training the new developers.
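The row-by-row arithmetic described so far can be sketched in Python. This is only a sketch: the column formulas are inferred from the values in the table (one unit of effort per developer plus one unit per communication link), and the function name `table_row` and its dictionary keys are my own labels, not anything from the original spreadsheet.

```python
def table_row(devs: int) -> dict:
    """Return one row of the table for a team of `devs` developers.

    Assumption: every pair of developers forms one communication link,
    and each link costs as much effort as one developer's work.
    """
    if devs < 1:
        raise ValueError("need at least one developer")
    links = devs * (devs - 1) // 2   # "comm. links": one per pair of developers
    units = devs + links             # "dev. + comm": total effort units
    rate = devs / units              # "prod. rate": share of effort spent developing
    output = devs * rate             # "prod. output": combined productive output
    time_factor = 1 / output         # "time factor": fraction of the one-developer schedule
    return {
        "devs": devs,
        "links": links,
        "units": units,
        "rate": rate,
        "output": output,
        "time_factor": time_factor,
    }


if __name__ == "__main__":
    for n in range(1, 11):
        row = table_row(n)
        print(
            f"{row['devs']}\t{row['links']}\t{row['units']}\t"
            f"{row['rate']:.3f}\t{row['output']:.3f}\t{row['time_factor']:.3f}"
        )
```

The script prints three decimal places rather than two, so values may differ from the table in the last displayed digit.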

4 developers is 6 lines of communication, and a time factor of 63% (i.e., only 4 percentage points better than 3 developers). 5 developers is 10 lines of communication, for a time factor of 60%. No matter how many developers we add to the single-developer programming job, it looks like we will never cut the remaining project time in half.
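The "never cut in half" observation can be checked with a little algebra. The derivation below assumes, as the table appears to, that each communication link costs one unit of effort; the symbols n (developers), L (links), r (productivity rate), P (output), and T (time factor) are my own notation.

```latex
\begin{align*}
L(n) &= \frac{n(n-1)}{2} \\
r(n) &= \frac{n}{n + L(n)} = \frac{n}{\tfrac{1}{2}\,n(n+1)} = \frac{2}{n+1} \\
P(n) &= n \, r(n) = \frac{2n}{n+1} \\
T(n) &= \frac{1}{P(n)} = \frac{n+1}{2n} = \frac{1}{2} + \frac{1}{2n}
\end{align*}
```

Under these assumptions T(n) is always greater than 1/2 for any finite n and only approaches 1/2 as n grows, which matches the table: 0.75 for two developers, 0.55 for ten.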

I find this rather depressing.

Now, some very strong caveats:

1. We presume the work being performed in the situation modeled above has to be done sequentially. If the work can be done concurrently or in parallel without additional planning, training, or communication, we can probably ignore the diminishing returns indicated by the table, since the developers don't have to talk to each other.
2. The amount of time spent in communication is assumed to be equal to the amount of time spent developing work product. We can game this a little bit by changing the "communication" column to some percentage of the number of communication links. This would show much better numbers, but my intuition tells me that it would not map to reality; it is my experience that just as much time is spent coordinating (and then regaining "flow") as is spent developing work product.
3. Each developer is presumed to be roughly as productive as each other developer. However, even the most productive developers will eventually be overwhelmed by the volume of communications, and if they spend time training new developers, their productivity drops to zero for that period.
4. The developer(s) being added are presumed beforehand to be familiar with the project, its history, and its idiosyncrasies. If the developers are not already so familiar, it means that one or more of the developers already working on the project has to stop working and spend time bringing the new developer(s) up to speed. That makes the productivity reduction even more dramatic, and is a key factor in making the late project later.
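Two of these caveats are easy to experiment with in code. The sketch below is only an illustration under stated assumptions: `comm_weight` is a hypothetical knob for the "some percentage of the number of communication links" idea (1.0 reproduces the table), and `training_weeks` is a hypothetical flat amount of time added to the schedule while nobody is productive. Neither name comes from the original spreadsheet.

```python
def time_factor(devs: int, comm_weight: float = 1.0) -> float:
    """Fraction of the one-developer schedule needed by `devs` developers.

    Assumption: each pairwise link costs `comm_weight` units of effort,
    where one unit is one developer's worth of work.
    """
    if devs < 1:
        raise ValueError("need at least one developer")
    links = devs * (devs - 1) / 2
    units = devs + comm_weight * links   # development plus weighted communication
    output = devs * (devs / units)       # productive output of the whole team
    return 1 / output


def remaining_weeks(
    one_dev_weeks: float,
    devs: int,
    comm_weight: float = 1.0,
    training_weeks: float = 0.0,
) -> float:
    """Estimated remaining schedule, in weeks.

    Assumption: training is a flat delay during which no work gets done,
    after which the team works at the rate implied by `time_factor`.
    """
    return training_weeks + one_dev_weeks * time_factor(devs, comm_weight)


if __name__ == "__main__":
    # 10 one-developer weeks remaining, a second developer added.
    print(remaining_weeks(10, 2))                      # table assumptions, no training
    print(remaining_weeks(10, 2, training_weeks=1))    # plus one week of training
    print(remaining_weeks(10, 2, comm_weight=0.5))     # communication at half cost
```

With the table's assumptions and a week of training, the 10-week job comes out at 8.5 weeks rather than the hoped-for 5. Lowering `comm_weight` does improve the numbers, which is exactly the "gaming" described in caveat 2, so treat any value below 1.0 with suspicion.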

Finally, some conclusions:

I think caveat #4 above is really important in relation to Brooks' Law. It makes me think that the above table probably describes the best case scenario when adding developers; i.e., if nobody has to train the new developer, then you don't lose the training time, but you still don't get twice as much productivity from adding a second developer.
Caveat #4 also implies to me that, if we want to compress the development schedule, the time to add developers is at the beginning of a project, not later on, so that they all learn the project as they go. But even then, it won't result in enough schedule compression to warrant adding more than one or two extra developers to a single-developer project.

I said earlier, and I'll say again now: This is not science. It is the beginning of an exploration of how to reliably manage resources and make schedule predictions. If you have questions, concerns, insights, alternative analysis, or experiences you would like to share on this topic, please leave a comment below.
