This is one of the most common questions I receive when introducing forecasting. Don’t we need to know the size of the individual items to forecast accurately?

My answer: Probably not.

It depends on your development and delivery process, but often system factors account for more of the elapsed delivery time than different story sizes.

Why might story point estimation NOT be a good forecaster?

Consider commuting to work by car each day. If the road is clear of traffic, then the distance travelled is probably the major cause of travel time. At peak commute time, weather and traffic congestion are likely to influence travel time more than distance alone. For software development, if one person (or a team) could pick up a story and work on it undisturbed from start to delivery, then story point effort estimates would correlate closely with elapsed delivery time. If there are hand-offs to people with other specialist skills, dependencies on other teams, expedited production issues to solve, or other delays, then the story size estimate will diverge from elapsed delivery time.

The ratio of hands-on time to total elapsed time is called “process efficiency.” For software development this is often between 5% and 15%. That means even if we nailed the effort estimates in points, we would be accurately predicting only 5-15% of elapsed delivery time! We need to find ways to accurately forecast (or remove) the non-work time influenced by the entire system.
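To make the arithmetic concrete, here is a minimal sketch of the process efficiency calculation. The function name and the story figures (2 hands-on days, 20 elapsed days) are made up for illustration and are not taken from any real team.

```python
def process_efficiency(hands_on_days: float, elapsed_days: float) -> float:
    """Fraction of elapsed delivery time spent actively working on an item."""
    if elapsed_days <= 0:
        raise ValueError("elapsed_days must be positive")
    return hands_on_days / elapsed_days


# Hypothetical story: 2 days of hands-on work, delivered 20 days after it started.
hands_on_days = 2
elapsed_days = 20

efficiency = process_efficiency(hands_on_days, elapsed_days)
waiting_days = elapsed_days - hands_on_days

print(f"Process efficiency: {efficiency:.0%}")
print(f"Days spent waiting in the system: {waiting_days}")
```

In this made-up case a perfect effort estimate would describe only the 2 hands-on days; the other 18 days come from the system, not from the size of the story.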

This is why forecasting elapsed time requires a technique that reflects the system's delivery performance on actual delivered work. To some degree, traditional story point “velocity” does represent a pace that includes process efficiency, but it has little more predictive power than story counts alone. So, if you are looking for an easy way to improve process efficiency, reducing the time staff spend on estimation might be a good first step.

Running your own experiment

You should run your own experiment. Test in your environment whether story point estimates and velocity perform better than story count and throughput for forecasting. The experiment is pretty simple: go back three months and see which method predicts the actual outcome you know today. You can use our forecasting spreadsheets to do this.

  1. Download the forecasting spreadsheet Throughput Forecaster.xlsx
  2. Make two copies of it, call one “Velocity Forecast.xlsx” and the other “Throughput Forecast.xlsx”
  3. Pick a prior period of time, say three months. Gather the following historical data:
    1. Number of completed stories per sprint or week. A set of 6 to 12 throughput samples.
    2. Sum of story points completed per sprint or week. A set of 6 to 12 velocity samples.
  4. For each spreadsheet, enter the known starting date, the historical data for throughput or velocity, and the sum of all samples (a total of ALL completed work over this period) as the starting story count or starting story point total (in the respective spreadsheets).
  5. Check which method predicted a date closest to the known completion date.

This experiment is called backtesting. We are using a known historical outcome to confirm that our forecasting tool and technique predict something we know to have occurred.
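If you would rather script the backtest than use a spreadsheet, the idea can be sketched roughly as follows. This is only an illustration: it assumes a simple Monte Carlo resampling of the historical samples, which may differ from what the spreadsheet does internally. The function `forecast_periods`, the 85th percentile setting, and all sample values are hypothetical.

```python
import random


def forecast_periods(samples, backlog, trials=2000, percentile=0.85, seed=1):
    """Resample historical per-period samples until the backlog is exhausted.

    Returns the number of periods at the given percentile of simulated outcomes.
    """
    if backlog <= 0 or max(samples) <= 0:
        raise ValueError("need a positive backlog and at least one positive sample")
    rng = random.Random(seed)  # fixed seed so repeated runs give the same answer
    outcomes = []
    for _ in range(trials):
        remaining, periods = backlog, 0
        while remaining > 0:
            remaining -= rng.choice(samples)  # draw one historical period at random
            periods += 1
        outcomes.append(periods)
    outcomes.sort()
    index = min(len(outcomes) - 1, int(percentile * len(outcomes)))
    return outcomes[index]


# Made-up history for 12 weeks: stories completed and story points completed.
throughput = [4, 6, 5, 3, 7, 5, 6, 4, 5, 6, 4, 5]
velocity = [21, 30, 24, 13, 35, 26, 29, 18, 25, 31, 20, 23]
actual_weeks = len(throughput)  # the known outcome we are backtesting against

# Use the total of ALL completed work as the starting backlog, as in step 4.
throughput_forecast = forecast_periods(throughput, backlog=sum(throughput))
velocity_forecast = forecast_periods(velocity, backlog=sum(velocity))

print(f"Actual duration:      {actual_weeks} weeks")
print(f"Throughput forecast:  {throughput_forecast} weeks")
print(f"Velocity forecast:    {velocity_forecast} weeks")
```

Swap in your own weekly or per-sprint samples and compare each forecast with the number of periods the work actually took. The percentile controls how conservative the forecast is, so use the same value for both methods to keep the comparison fair.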

If performed correctly, both spreadsheets will be accurate. Given that, is the effort of story point estimation still worth it?