Problem in a nutshell: Historical data about delivery pace (throughput or velocity) contains work that isn’t part of the original backlog, which makes it a poor predictor of the future delivery pace of planned work. Should we capture just planned items, or defects and un-planned work as well? Capture it ALL, but be selective about which data you use when forecasting.
Historical data is important for forecasting. We need to capture how the entire system has operated historically to avoid being misled by our own cognitive and wishful biases about how the future might unfold. The issue arises in how we choose to use that data when not all of it is applicable. Defect work and un-planned work are common reasons to omit some data when forecasting certain measures.
The most common question I get asked about using historical data is “should we count defects or interrupt (non-feature) work?” My answer is always yes: capture every bit of data you can if it’s passively attainable (meaning little or no extra effort). This does not mean you should use it all when forecasting.
When capturing item data, these are the fields I consider a must:
- The date the item was committed to being worked on. This date matters more than the date the item was created in any tool; it might be the date it was pulled into a committed queue, or the date it was put into a sprint backlog.
- The date the item was delivered (ideal) or completed ready for delivery (ok, and most common).
- Whether the item was planned or un-planned work. This varies. My most common categories are planned, un-planned, defect.
These three inputs are the ones our free Throughput and Cycle Time Calculator uses. You can download it here (Excel spreadsheet, no macros).
This spreadsheet lets you customize which values in the original system map to planned and un-planned work in the spreadsheet. It doesn’t yet handle un-planned and defect as separate categories, so it is not perfect; it will be updated to do both.
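If you are classifying items outside the spreadsheet, the same mapping idea can be sketched in a few lines of Python. This is only an illustration: the issue-type names in `CATEGORY_MAP` and the `classify` helper are hypothetical, and treating unknown types as un-planned is an assumption you may want to change.

```python
# Hypothetical issue-type names; substitute whatever your tracking tool exports.
CATEGORY_MAP = {
    "story": "planned",
    "task": "planned",
    "bug": "defect",
    "incident": "un-planned",
    "support": "un-planned",
}


def classify(issue_type: str, default: str = "un-planned") -> str:
    """Map a raw issue type to planned / un-planned / defect.

    Matching ignores case and surrounding whitespace. Types missing from
    CATEGORY_MAP fall back to `default` (an assumption, not a rule).
    """
    return CATEGORY_MAP.get(issue_type.strip().lower(), default)
```

Keeping the mapping in one place makes it easy for the team to review and challenge how work gets classified.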
Different types of items help estimate and forecast different things. Here is how we suggest using the data from our spreadsheet:
To forecast how long it will take to deliver the rest of the feature or project backlog: Use the Planned item throughput adjusted by the observed split rate.
To estimate the planned item split rate: Use the original planned backlog count divided by the completed planned backlog count (see our article on this here).
To estimate defect rates: Use the Defect items throughput rate. To calculate how many defects per planned item, divide the defect throughput by the planned throughput.
To estimate the interrupt or un-planned work rate or cycle times: Use the un-planned items throughput and cycle time.
A different throughput (items per week/sprint) shown for planned and un-planned work. Forecasting planned work? Use the planned throughput!
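The per-category rates described above can be sketched in Python, assuming each item carries the three fields listed earlier. The `Item` class, the function names, and the sample data are all made up for illustration; this is not how the spreadsheet is implemented, and the split-rate adjustment is deliberately left out.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    committed: date  # date the team committed to working on the item
    completed: date  # date delivered (or completed and ready for delivery)
    category: str    # "planned", "un-planned", or "defect"


def weekly_throughput(items, category):
    """Count completed items per (ISO year, ISO week) for one category."""
    counts = Counter()
    for item in items:
        if item.category == category:
            iso = item.completed.isocalendar()
            counts[(iso[0], iso[1])] += 1
    return dict(counts)


def average_throughput(items, category, weeks):
    """Mean items completed per week over an observation window of `weeks` weeks."""
    return sum(1 for item in items if item.category == category) / weeks


def defects_per_planned_item(items, weeks):
    """Defect throughput divided by planned throughput."""
    planned = average_throughput(items, "planned", weeks)
    if planned == 0:
        raise ValueError("no planned items completed in the window")
    return average_throughput(items, "defect", weeks) / planned


def cycle_times(items, category):
    """Days from commitment to completion for each item in one category."""
    return [(item.completed - item.committed).days
            for item in items if item.category == category]


# Made-up sample covering two weeks of completed work.
sample = [
    Item(date(2024, 1, 1), date(2024, 1, 4), "planned"),
    Item(date(2024, 1, 2), date(2024, 1, 9), "planned"),
    Item(date(2024, 1, 3), date(2024, 1, 10), "planned"),
    Item(date(2024, 1, 8), date(2024, 1, 12), "planned"),
    Item(date(2024, 1, 5), date(2024, 1, 8), "defect"),
    Item(date(2024, 1, 9), date(2024, 1, 11), "defect"),
    Item(date(2024, 1, 10), date(2024, 1, 11), "un-planned"),
]

print(average_throughput(sample, "planned", weeks=2))  # 2.0 planned items per week
print(defects_per_planned_item(sample, weeks=2))       # 0.5 defects per planned item
print(cycle_times(sample, "un-planned"))               # [1]
```

Note that all seven items were captured, but only the four planned ones feed the planned throughput; the others answer the defect and interrupt questions.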
It’s important to understand how different items are ordered and prioritized by the teams through intentional policy. Making sure everyone knows why (and when) some types of work should be accelerated keeps the right work finishing earlier. Keep collection as light as possible, and get the team invested in proper classification and sign-off of the key dates by making the data accessible for process improvement and discussion. A good outcome of these discussions is a proposed change in policy to keep the most important items completing when needed.
Get the Throughput and Cycle Time Calculator spreadsheet we used in this article; you can download it here (Excel spreadsheet, no macros).
(It’s free and uses no macros, so it’s generally safe. Please consider answering the survey and sending us anonymous data so that we have a wider set of data to share with researchers.)