Earlier today, after changing jobs, a friend of mine posted the following unsolicited review of Outlook on Facebook:
Not surprisingly, this statement elicited a number of positive grunts of approval from our community. Somewhere amid its good intentions, Microsoft lost sight of the prize: useful office software.
Google, in contrast, realized that beyond a basic threshold, most of the market doesn’t like the visual clutter in Microsoft’s software. That clutter not only distracts from the experience, but also makes it hard to find the 20% of the features that users actually care about.
The 80/20 Rule Applied To Software Development
As it’s popularly known, the 80/20 rule claims that you get 80% of your results from 20% of your efforts. Many people hear about it, consider it obvious, and then never really think about how it might apply to their lives. Originally noticed by the Italian economist Vilfredo Pareto, and hence also known as the Pareto distribution, the 80/20 rule was most recently popularized by Richard Koch in his book The 80/20 Principle. Forgetting about the 80/20 rule is dangerous, if not reckless.
Schools have embedded “normal distribution thinking” into our everyday life by focusing on averages. People like averages. If you know the average of a distribution, you can interpret any particular data point relative to that average. If I am overweight for someone who is 180 cm tall, that label only makes sense because there is a well-known average weight for people of that height. Having been labelled overweight, I can then choose to do something to reduce my weight. Knowing the average is a clear prerequisite: you compare to an average to evaluate “how we are doing”.
In contrast, power law distributions (the geeky formal name for an 80/20 distribution) are largely overlooked. Listen to Richard Koch for an overview of where this applies:
In all of these cases, using an average to understand the above distributions is misleading at best. A small number of observations, i.e. the “20%”, skews the mean so that it is no longer representative. Moreover, trying to improve the average is extremely counterproductive compared to focusing directly on the outliers.
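To make that concrete, here is a tiny Python sketch (with made-up numbers) showing how the mean of a heavy-tailed, 80/20-style distribution gets dragged around by a handful of outliers while telling you very little about a typical observation:

```python
import random
import statistics

random.seed(42)

# 1,000 "customer revenue" values drawn from a heavy-tailed Pareto
# distribution (shape ~1.16 gives roughly an 80/20 split).
revenues = sorted((random.paretovariate(1.16) for _ in range(1000)), reverse=True)

mean = statistics.mean(revenues)
median = statistics.median(revenues)
top_20_share = sum(revenues[:200]) / sum(revenues)

print(f"mean:   {mean:.2f}")    # dragged upward by a handful of huge values
print(f"median: {median:.2f}")  # what a 'typical' customer actually looks like
print(f"top 20% of customers bring in {top_20_share:.0%} of the revenue")
```

Improving the “average customer” here is chasing a phantom; almost all the leverage sits in the top slice.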
Why Don’t We See This Intuitively?
The problem is that the 80/20 rule is counterintuitive to how we think, or at least to how we are trained to think. Our schooling implicitly teaches us to assume an underlying Gaussian distribution, thanks to the implications of the Central Limit Theorem. A Gaussian normal distribution can be summarized by its mean and standard deviation. Many of us are quite comfortable using an average to understand a set of data, and we often use it to help make decisions. As a result of this Gaussian bias, we tend to over-focus on averages.
Apparently, one explanation for a shift from a normal distribution to a power law distribution is scarcity of resources. Based on simulations at kottke.org, it seems that increasing competition for scarce resources gives rise to feedback loops. These feedback loops, in turn, cause a select few, who are very in tune with these signals, to gain access to the resources faster and more effectively than the rest. The normal curve increasingly morphs towards a power law distribution, with a select few standing out significantly from a large gray mass of everyone else. The outliers become significantly different from the remaining 80%. The mean rapidly becomes meaningless.
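Here is a toy simulation in the same spirit (to be clear, not the kottke.org model itself, just an illustrative sketch): everyone starts out equal, but whoever already holds resources is slightly more likely to win the next scarce unit, and the 80/20 shape falls out of the feedback loop on its own.

```python
import random

random.seed(1)

agents = [1] * 100          # 100 agents, each starting with 1 unit
for _ in range(10_000):     # 10,000 scarce units handed out one at a time
    # Feedback loop: the winner is chosen with probability proportional
    # to what they already hold ("the rich get richer").
    winner = random.choices(range(len(agents)), weights=agents)[0]
    agents[winner] += 1

agents.sort(reverse=True)
top_20_share = sum(agents[:20]) / sum(agents)
print(f"top 20% of agents hold {top_20_share:.0%} of all resources")
```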
Thinking in terms of averages when they are not appropriate creates waste in the “lean manufacturing” sense, and we are completely blind to it. How many decisions are made based on a comparison to an average, when actually only looking at the extreme values makes any sense at all?
Users Care About 20% of Your Painstakingly Crafted Features
80/20 distributions also exist in the value software provides. Anecdotal evidence, based on anonymous usage data collection, suggests that most users of software only use about 20% of its features. Paul Kedrosky notes, “By some estimates, the average Microsoft Word user regularly employs about 10 percent of that product’s myriad features.” These are typically the “must-have” features for that particular product; the product must contain them in order to be functionally usable.
On the right, you see a pretty chart from the Standish Group, via James Manning, on Lean Software Development. If you only look at the features used “Always” and “Often”, they add up to exactly 20%. This would mean that the remaining features are not really that important to users; in particular, they probably won’t affect how much value users perceive in the software.
Assuming that each feature takes an equivalent amount of development time, you could spend 20% of the time you actually spent on a particular piece of software, and your users would still reap 80% of the benefit of using it. If the fully developed project merely broke even, then building and releasing only the first 20% would have generated revenue of roughly four times the cost of creating it, i.e. a profit of about 300% of that cost. If anything, we should be removing features rather than adding more to existing software, or creating a new product with a completely different set of “must-haves”.
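The back-of-the-envelope arithmetic behind that claim, assuming (unrealistically) uniform feature cost and an exactly breakeven full project, looks something like this:

```python
# Toy numbers: a project that costs 100 and earns 100 when 100% of the
# features are built (breakeven).
full_cost = 100.0
full_revenue = 100.0

partial_cost = 0.20 * full_cost        # build only the top 20% of features
partial_revenue = 0.80 * full_revenue  # ...which deliver ~80% of the value

print(f"cost:    {partial_cost}")                                  # 20.0
print(f"revenue: {partial_revenue}")                               # 80.0
print(f"revenue / cost: {partial_revenue / partial_cost:.0f}x")    # 4x the cost
print(f"profit:  {partial_revenue - partial_cost}")                # 60.0, i.e. 300% of cost
```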
A really good example of this is the comparison between Microsoft Word and Google Docs’ word processor. Jensen Harris, of the Microsoft Office User Experience team, notes:
Top 5 Most-Used Commands in Microsoft Word 2003
- Paste
- Save
- Copy
- Undo
- Bold
Together, these five commands account for around 32% of the total command use in Word 2003. Paste itself accounts for more than 11% of all commands used, and has more than twice as much usage as the #2 entry on the list, Save. Paste is also far-and-away the number one command in Excel and PowerPoint, accounting for 15% and 12% of total command use, respectively. Beyond the top 10 commands or so, however, the curve flattens out considerably. [Emphasis mine]
There is a clear power law relationship in how heavily people use the features of a word processor. The top 10 features give users most of the value. Beyond those, I would assume Print follows not far behind, as a word processor needs to be able to print out the text prepared in it. Even Microsofties jokingly admit that Word 5.1 had all the features their users ever needed.
In contrast, you could argue that Google focused only on the most important word-processing features when creating Google Docs. While Word and Excel power users may turn up their noses at such featureless software, for what it’s meant to do, Google Docs is good enough.
Each Bug Tells You Where 80% of Your Other Bugs Are
The following video looks at the 80/20 rule as it applies to bugs and testing, and arguably the ideas in it could massively increase the effectiveness of both your automated and manual testing:
Bugs occur in clusters. The existence of a bug tells you that the code still needs refactoring and rework. In addition to fixing the bug, you can look into improving the code’s structure. A clearer set of interfaces or abstract classes, capturing the essential concepts of the code, will help isolate it. You remove extraneous lines of code. You also separate each part of the code into something more self-contained. Even though a refactoring doesn’t change the code’s behaviour, it’s likely you will discover problems with the code as you refactor.
By refactoring to interfaces, you are erecting virtual “walls” which contain existing bugs. The big problem developers face, particularly in legacy code bases (read: spaghetti code), is coupling. Too many global variables and methods, and too much complexity, all result in unintended consequences in the code. It doesn’t make a difference how many more stars you got than your peers when first learning arithmetic: spaghetti code is complex. Pull one strand of spaghetti code, and half of what’s on your plate moves. And bugs come out of it.
If there is a bug in one feature, there will probably be more. There could be many reasons. Maybe the feature was rushed to hit a release deadline. Maybe the original developer and BA didn’t get along. Maybe the original tester had to go on extended leave, due to being infected with a rare tropical disease that covered her left nostril in puce spots. It doesn’t really make a difference why.
Clearly delineating scope boundaries, especially scoping down, making class members private, and only exposing easy-to-use public interfaces, will go a long way toward reducing complexity. You’ll most likely find a few bugs or unhandled edge cases under the floorboards too.
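As a purely hypothetical sketch (the names and discount rules here are invented for illustration) of what “scoping down” can look like: internal state and helpers are private, and callers see exactly one easy-to-use public method, so any bug in the discount logic is walled off behind it.

```python
from dataclasses import dataclass


@dataclass
class LineItem:
    price: float
    quantity: int


class Invoice:
    def __init__(self, items: list[LineItem], customer_tier: str):
        self._items = items                  # private: nobody pokes at these directly
        self._customer_tier = customer_tier

    def calculate_total(self) -> float:
        """The only public entry point; everything else is an implementation detail."""
        subtotal = sum(i.price * i.quantity for i in self._items)
        return round(subtotal * (1 - self._discount_rate()), 2)

    def _discount_rate(self) -> float:
        # Private helper: if the discount rules turn out to be buggy,
        # the bug is contained here rather than spread across callers.
        return {"gold": 0.10, "silver": 0.05}.get(self._customer_tier, 0.0)


print(Invoice([LineItem(9.99, 3)], "gold").calculate_total())  # 26.97
```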
This approach puts TDD in an atypical light. If you start with a test, you may never accumulate enough bugs to know where the rest of the bugs are. Ideally, automated regression tests capture the functional requirements as test cases. But if you do too much testing up front, you’ll lose an important indicator of where the problems in the software lie.
Your Features Spend 80% of Your Development Time Waiting in Queues Caused by One Bottleneck
Mary Poppendieck notes:
Typically, less than 20% of the total time is spent doing work on the request; for 80+% of the time the [feature] request is waiting in some queue. For starters, driving down this queue-time will let us deliver software much faster without compromising quality.
From a business performance perspective, waiting time is the biggest source of waste. A local optimization, applied at the constraint, can have a massive impact on the whole business. This will only happen, though, if the optimization is applied to the system bottleneck. As a result, true global optimizations come only from identifying and resolving the slowest part of the system, viewed at global scope, while ignoring the rest.
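A minimal sketch, with invented numbers, of why only the bottleneck matters: the throughput of the whole pipeline is capped by its slowest step, so a local optimization anywhere else changes nothing.

```python
# Hypothetical pipeline capacities, in features per week.
stages = {"analysis": 10, "development": 8, "testing": 3, "deployment": 12}

def throughput(capacities: dict[str, float]) -> float:
    # The system can only go as fast as its slowest stage.
    return min(capacities.values())

print(throughput(stages))   # 3 features/week: testing is the constraint

stages["development"] = 80   # heroic local optimization somewhere else...
print(throughput(stages))    # ...still 3 features/week

stages["development"] = 8
stages["testing"] = 6        # elevate the actual constraint
print(throughput(stages))    # 6 features/week: the whole system speeds up
```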
This concept is best explained in Goldratt’s novel The Goal, summarized here, with an analogy to a marching scout troop of boys:
The constraint was identified as Herbie, the slowest moving boy in the troop. He was easily identified by the queue of boys behind him and the growing space (starved queue) between Herbie and the boys in front of him. Alex Rogo tried to make the best of Herbie’s capability by encouraging Herbie and by keeping breaks short. This “exploit” was not enough to meet the goal. [...] The troop was subordinated to Herbie’s walking speed. This kept the group together, but it wasn’t enough to meet the goal. Alex then did a cause and effect analysis to elevate the constraint and chose the root cause “heavy backpack”. He then solved the problem by distributing Herbie’s supplies among the faster hikers. Alex was vigilant to again evaluate the hike for a new constraint. Luckily Herbie remained the constraint but his increased capacity was sufficient to meet the goal.
Helping Herbie go faster had the most significant effect on the whole troop’s speed. Since Herbie was the most significant global constraint, no other optimization could have had a positive effect; making the fastest scouts run ahead would still not make the whole group go faster.
When Alex Rogo applied similar logic at his manufacturing plant, a constraint which caused a wait time of one hour at one step of the process, estimated to cost the company $2,100 per hour, actually cost over $1,000,000, merely because it happened to occur at the biggest bottleneck in the company. Of course, this is fictional, but the story captures the essence of the problem. The wait caused by that specific machine, the NCX-10, caused pile-ups and delays downstream, and the total impact was massive. The global impact of all other forms of inefficiency, including employees and machines standing idle, was dwarfed by the financial impact of that one machine’s idle time.
So What Does This Mean?
From the above, it’s clear that the 80/20 rule in software is about getting rid of the non-essential, and to a lesser extent about doing the 20% really, really well. Get rid of clutter: on your screen, in your models, in your code. Your software will be more fit for purpose. Customers will be happier. You will earn more money.
The real story, though, is about scarcity of needed features. We, as software people, are so good at creating lots of mediocre features that it feels like there is more “to go around”, more “abundance”. At this stage of the game, the real skill is being honest with yourself, being really good at prioritizing, and being really good at cutting out less important features, so that no one notices.
Then you really get abundance.
Related links
- The Goal: A Process of Ongoing Improvement by Eli Goldratt
- What Makes Agile special on the key paradoxes that make Agile work
- Knowing your options’ value helps you choose the best sequential path
- Features vs. cycle time