Geoff Greer's site: The Cost of Features

06 Mar 2013

Fair warning: this is a car analogy.

Here’s a graph showing the weight of some car models over the years:

It’s taken a while, but these cars have gained a lot of weight. This trend doesn’t just apply to cars. Boeing’s 737-100 first flew in 1967 and weighed 30 tons empty. A modern 737-900 is 44 tons. Even fighter planes gain weight. The Spitfire MkIA weighed 1,953 kg empty. Seven years later, the Mk24 weighed 3,247 kg.

Is this weight increase a bad thing? Usually not. Designers don’t add weight without a reason. Cars are much safer than they used to be. They’re more comfortable. They have more features: air-conditioning, power steering, automatic transmissions, airbags. Likewise, a modern 737 can fly farther while carrying more passengers and cargo. A heavier Spitfire can carry more armament and fuel for a bigger engine. It’s a trade-off: more weight for more features.

Why am I talking about airframes and cars?

These graphs show line counts for some popular open source projects. Here’s httpd:

Node.js (this one was a bit of an outlier):

and Cassandra:

These examples aren’t cherry-picked. I ran the numbers on other code bases, but showing them all would take up a lot of space and increase the page load time. The data is pretty clear: software projects get heavier over time. This shouldn’t surprise many developers. Zawinski’s Law was coined decades ago.
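The post doesn't say how the line counts were gathered, so here is one rough way to get similar numbers for a repository you have checked out. It is a sketch, not the method used for the graphs above. It assumes the history is in git and sums the added and deleted counts that `git log --numstat` reports for each commit. The function names (`git_numstat`, `net_lines_by_year`) and the `@` date marker are made up for this example.

```python
import subprocess


def git_numstat(repo_path):
    """Return `git log --numstat` output for repo_path, oldest commit first.

    Each commit is introduced by a marker line such as "@2012-05-17"
    (the "@" prefix is an arbitrary choice for this sketch).
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--reverse", "--numstat",
         "--date=short", "--pretty=format:@%ad"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def net_lines_by_year(numstat_text):
    """Map each year to the running net line count at that year's last commit."""
    totals = {}
    running = 0
    year = None
    for line in numstat_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("@"):
            year = line[1:5]  # "@2012-05-17" -> "2012"
            continue
        parts = line.split("\t")
        if year is None or len(parts) != 3:
            continue
        added, deleted, _path = parts
        if added == "-" or deleted == "-":
            continue  # git prints "-" for binary files; skip them
        running += int(added) - int(deleted)
        totals[year] = running
    return totals


if __name__ == "__main__":
    sample = (
        "@2011-03-01\n10\t0\ta.c\n\n"
        "@2011-06-01\n5\t2\ta.c\n-\t-\tlogo.png\n\n"
        "@2012-01-15\n20\t3\tb.c\n"
    )
    print(net_lines_by_year(sample))
```

Calling `net_lines_by_year(git_numstat("path/to/repo"))` gives a year-by-year total you can plot. The number counts every text file, including docs and tests, and ignores merge subtleties, so treat it as a trend line rather than an exact size.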

But why do cars, airframes, and software grow? Naively, we should expect at least some of them to shrink over time.

I don’t know the answer, but I do have some hypotheses. Engineering involves trade-offs. All else equal, more code means more bugs. Of course, all else isn’t equal. More features means more code. Improving performance through the use of complex cache hierarchies, indexes, and efficient data structures requires more code. Making software reliable, distributed, and scalable takes boatloads of code. Until recently, rounded corners on a web page required comical amounts of code.

So we have some good reasons for adding code, but that’s not enough to cause software bloat. We could prevent bloat by removing about as much code as we add over time. Why don’t we do that? Several reasons come to mind:

It can break dependent software. While the feature on the chopping block may not be popular with users, other popular software might depend on it.
Disgruntled users. Removing a feature used by only 1% of users guarantees a deluge of hate mail. 10,000 users means 100 angry emails. At the same time, it causes the other 99% to wonder if their pet feature is next.
Removing things isn’t fun. If your code is clean, it’s trivial and boring. If your code is ugly, it’s a giant pain. Either way, building something new is more enjoyable.
It doesn’t impress colleagues. “I removed some old code” is rarely said with pride in a stand-up. People don’t brag about removing features or old code.

So removing code is a tough bullet to bite. But then how do we stop software projects from collapsing under their own weight? The best answer I can come up with is: You can’t. The battle is lost as soon as you type git init. Why?

Again, we can look to cars. Car companies have massive budgets for designing cars, but they don’t reduce weight. They have a different strategy: as a model increases in weight, capabilities, and price, manufacturers introduce a smaller model. The Honda Civic used to be Honda’s smallest car in the US market. A decade ago, they introduced the Fit. Volkswagen’s Polo filled the role of the old Golf.

A similar process happens to software. As a piece of software becomes more complicated and bloated, newer projects come along to fill the old niche. Firefox replaced Netscape and Internet Explorer. On Debian and Ubuntu, dash replaced bash as the default /bin/sh. Nginx, Node.js, and others have encroached on the territory of httpd.

I said before that engineering involves trade-offs: more code for features, performance, and scalability. So what about making the same trade-offs in the opposite direction? Going back to cars: What do you get if you remove features to save weight? Even with modern emissions and safety requirements, it’s possible to build a car with the weight of a compact car from 30 years ago. The result is the Lotus Elise. It’s not a car for everyone, but it’s spectacular in its niche.

Are there software projects that follow the same pattern? Do people use modern languages, libraries, and techniques to build a small piece of software that does one thing well? This is the Unix philosophy, but when I try to think of examples, not much comes to mind. Even Ag has grown to incorporate more features (although this hasn’t made it slower).

This post is pretty long and I don’t have any overarching conclusion to wrap up with, so I’ll just end it here. I know it’s bad form, but this isn’t getting published in The New Yorker. Just a reminder: This rambling is not even close to scientific. I would delve into the topic more, but I’m rather busy with work these days. On the bright side, never before have I been this productive, as my GitHub stats show.