Multiple Feature Cut Line Forecast Spreadsheet

Posted by in Featured, Reference, Tools |

Forecast what features will make a release deadline and experiment with scope and start order. Do this early in the planning process to make better decisions about feature scope and start order by showing what will make a target delivery date with a simple tick (yes) or cross (no) for a set of desired features. Often referred to as the cut-line, this spreadsheet shows what will make the cut and what won’t. Often seeing what won’t make a release helps everyone look harder at whether the order matches business need.

Although we have kept this spreadsheet simple on the input side, it performs a Monte Carlo simulation for each feature (1,000 hypothetical deliveries) just like the Single Feature Forecaster spreadsheet. This spreadsheet uses the completion date for each feature as the start date for the next. It then shows which features will be delivered on or before a target date.

Download this spreadsheet from here:  Multiple Feature Cut Line Forecaster: I want to forecast multiple features at once

Multiple feature forecast inputs and results. Tick = yes, Cross = No hope

Multiple feature forecast inputs and results. Tick = yes, Cross = No hope


The Forecast Inputs

1. Start date: The calendar date where work will actually begin for the first feature. This shouldn’t be the “planned” start date, it should be the date when you are sure work will start. Look at other project that might be delayed or other reasons key staff might be unavailable to start when deciding what this date should be. We find the start date is often overlooked as an easy estimate to make, its not. Its one of the few that involve thinking about company wide factors rather than just the project or feature work and the people in the room

2. Target Date – This is the calendar date that decides whether to show a tick or cross. If the likelihood of a feature delivery date (set by 3. Likelihood) is on or before this date, that feature is said to make the deliverable and adorned with a green tick-mark. Otherwise it will be given a red-cross mark. This should be the date that code is tested, packaged and ready to enter any final deployment pipeline. Often this date is before the product is released to market, its a hand-off date to operations.

4. Split Rate – We find that a big cause of forecast accuracy is using a delivery rate (throughput or velocity) based on work that has been split into multiple parts when started. It makes the team appear to be delivering faster than it is. Our most commonly seen settings (and suggested if no actual data exists) is a low estimate of 1 (not split) and a high estimate of 3 (a backlog item is split into three parts for delivery). Check if the throughput and velocity estimate or data includes defects. If it does, bump these estimates up to model the most common defect rate rate range so each item embeds its allocation of defects. For example, if every piece of work get one defect, start with a low estimate of 2 (1 for the work + 1 for a defect).

5. Throughput Estimate or Data – The throughput estimate is where you express the delivery pace of the team. This is the rate that  work is delivered in the same units estimated for each feature. Velocity in points or throughput (count of items completed over time) can be used, the only requirement is that the same units be used for item estimates. You also have the choice here of whether to use a low and high bound guess, or to use historical data (where samples are placed in the Throughput worksheet. Our advice is to start with a wide estimate (spread out low and high value). What is important is the actual values of throughput and velocity found out through delivery fall within this estimate, not to have a narrow estimate. When real data starts to flow in from the team, switch over to using it instead of the guess. Normally about 11 samples are needed before i’m comfortable changing to use data. Try both, and see when the forecasts begin to agree, then use the data samples.

7. The Features and Estimates – This is the part of the spreadsheet where you describe the feature (by name) and get a low and high guess of the amount of work for each feature. Similar guidance here to entering the throughput estimate, start wide and narrow as real data and lessons allows. Remember to consider the split rate in input 4. Don’t double count splitting or defect inputs. Choose either to account for scope creep and defects in these estimates or through the splitting estimates – be consistent and explicit! The first column, start order is for convenience. Sometimes when some features are going to miss a release date, experimenting by changing the start order gets the “agreed” features the best chance of making the cut. Its provided here so that immediate results are shown during discussions and meetings (we used to cut and paste rows, but it was slow and error prone, so we added the ability to control order by column A).

8. Month Adjustments – We found some major holiday events, seasons, or company conferences made significant differences to throughput rate. These inputs allow the throughput or velocity estimates to be adjusted depending on which month work falls. The spreadsheet multiplies its random pick for delivery pace by the multiplier you enter for each month if work falls in that month. Its fine tuning mostly. In Europe we find summer pace slows. In the US we find less fluctuation. One key insight we see if a companies vacation policy matters. For companies that have a “use it or lose it” policy for vacation time, the last month (often December) is a fraction of other months for throughput. Ask and look for reasons one month might fluctuate. Re-orgs, team hire and ramp up, conventions and conferences are some of our common reasons for using these adjustments.

The Forecast Results

Screenshot 2016-03-30 11.24.35

The Forecast results are shown to the right of each feature input. Its designed to be simple enough your manager can understand it. We show the start date because its important that any date we show in the table assumes this is the start date; miss that date, then the forecast is incorrect.

Given the start order in column A for each feature, a green tick means on or before the Target date (given the likelihood percentage confidence entered as input 3, 85% in this case). A red cross means it will miss by at least a week (or 2 weeks depending on the time interval entered in the throughput input as part of input 5.). An orange exclamation mark means within one throughput period (on the fence). We encourage our stakeholders to look at what is going to miss and offer to trade something that is going to be delivered. This experimentation is easy by changing the start order in the column A adjacent to each feature. The start orders must be non-duplicated, ascending order from 1 to 10 (errors are shown if you get this wrong).


This tools aims to help early and vigorous discussion about scope and start order of multiple features competing for the same delivery team. Use it to help others see likely outcomes and make hard decisions earlier. Although we have more complex forecasting tools, this is often the right tool for early exploratory discussions about what resources are needed and what set of features will make the cut given teams historical pace of delivery.

Download this spreadsheet from here:  Multiple Feature Cut Line Forecaster: I want to forecast multiple features at once

Love to hear feedback on the type of discussions it causes and suggestions for improvements:


Read More

Monte Carlo Forecasting Exercises

Posted by in Featured, Forecasting, Tools |

Monte Carlo forecasting is simpler than most people believe. To prove this we teach the basics using paper tutorials and dice. We make these tutorials publicly available for you to use to teach others within your company (or at home with your kids).  We welcome all feedback on what works and what doesn’t.

Basic Monte Carlo Forecasting Tutorial

Get the Basic Monte Carlo Forecasting Tutorial here.

“Discover what Monte Carlo forecasting is by performing it by hand. This exercise simulates completing a project many times and plots the outcomes. Perform 9 more trials. Each trial involves filling all rows in a column until the remaining work count reaches zero.”

This tutorial is the simplest burn-down style Monte Carlo forecast. It simulates 10 trials starting at 20 stories being completed at a rate determined by a roll of a six-sided dice. The burndowns are plotted on a chart to demonstrate that each trial takes a different route to completion.

Basic Monte Carlo Exercise

Perform ten Monte Carlo forecast trials and then chart the results.

Basic Monte Carlo Chart Sheet

Chart the burndown results of ten Monte Carlo forecast trials

Throughput Monte Carlo Forecasting Tutorial

Get the Throughput Monte Carlo Simulation Tutorial here.

This is a more advanced tutorial. It has extensive instructions on its opening page, and a completed example as the last page. We use this in our training courses, and often play this with all staff during our forecasting engagements. It get students to perform 3 of 10 simulation trials (7 are already performed) using historical throughput samples (given to you). It walks the students through the mathematics of computing percentiles on result trials to help people understand how to interpret 85th percentile for example.

Throughput Monte Carlo Forecasting Tutorial

Throughput Monte Carlo Forecasting Tutorial

There are many teaching moments during this tutorial. The most common is why is all this “complexity” necessary. We answer that in the document –

“Historical throughput data for teams measures delivery rate for a wide portion of the development system (the wider the better). Team throughput per week accounts for delays; for example waiting time, impediments, staff availability, interruptions and un-recorded work. The impact of these delays is more significant to a forecast than the hands-on time alone. This is a reason developer estimates are unreliable when forecasting projects, they don’t account for delays and system dynamics. In a stable system (e.g. the team isn’t blown-up), throughput will be a good predictor of future delivery rate even with large item size variability.”

We will add more of these over time as we learn ourselves how to teach the methods we use and why they give superior results than dividing average rate of work delivery by an average amount of work remaining.


Read More

Forecast Agile Project or Feature Spreadsheet

Posted by in Featured, Reference, Tools |

This spreadsheet helps forecast how long a single feature or single project will take to deliver using agile (scrum or kanban) using Monte Carlo simulation. We have to balance keeping the spreadsheet light to use and easy to understand whilst capturing the most likely factors that cause feature or project blowout.

We have made our Single Feature Forecaster spreadsheet available for free for over six months, and get a lot of feedback and questions about how it works and how to use it. This post tries to answer how to use it. Future post will cover how it works in detail.

Get the spreadsheet here: Single Feature Forecaster Spreadsheet (latest version) 

(uses no macros – all formulas, requires Microsoft Excel or Google sheets (but it is slow))

Single Feature Agile Forecasting Inputs

Single feature forecaster inputs

Forecast a single feature spreadsheet inputs

1. Start date – seems relatively straightforward, but don’t be fooled. Poor choice of start date is a major factor in poor forecast accuracy. Often the planned start date is missed, staff aren’t moved from prior projects, or a final decision to begin doesn’t arrive in time. Be pragmatic. Too optimistic or pessimistic is will cause errors. Consider other projects that may run long, these might impact on the start date for this feature or project.

2. How Many Stories – this is the base amount of work to be performed for this feature or project. Its is a range estimate, and our advice is to keep it wide. Monte Carlo works best when it is more certain that the actual final range is within the estimate, so making this range narrow increases the risk of a miss. Include any item count that is included in your throughput samples of estimates. If your throughput includes defects, then estimate defects. Our advice is to NOT include defects, to just include work contributing to the features. Also, don’t include risks, these are best managed separately (in the Risk tab).

3. Story Split Rate – this range estimate wasn’t in our first release. What we encountered is the mistake (that all of our competitors make) when historical throughput data is used to forecast. Items in the backlog are often split before being delivered. If the throughput count is post-split (items are split into multiple parts or grow in scope), the forecast will be out by factors of up the three times (if stories split on average into three parts, which is common in our experience). This is a significant miss, so we added it to the spreadsheet. Our advice is to split low estimate 1, and high estimate 3 until you have more accurate data.

4. Throughput – The rate that work will be delivered. The only requirement is that it is in the same units as the story estimate, whether that be story points or count. There are two ways to give this estimate. Use a low and high range guess if no historical data is available, or historical data if it is available. If a range estimate is used, keep it wide. You are estimating the rate that the story estimate given in input 2 and 3. If that includes defects, then this estimate should as well. We find it easier to NOT include defects in either the throughput or story estimate. Start wide pay particular attention to the low estimate and be realistic and use the lowest you have seen (don’t be afraid of 0 or 1 until you get more information). Don’t be to pessimistic on the high bound. Go one higher than you expect or have seen. Remember to set whether your estimate is per week or two weeks. We like to keep data per-week. If you choose to use historical data, enter your samples in the Throughput Samples tab and choose “Data” from the drop-down list.

The Results Table

Single Feature Forecaster Results

Forecast Single Feature Results Table

Forecast results are calculated continuously. We have tried to suggest that 85th percentile and above is safe territory. What we have found traditional estimation methods relay on averages. Although not exact, the 50th percentile often matches what is expected and although (much) more realistic, the 85th percentile is a shock. Be ready to defend the process used. To help, we have added some charts that pictorially help explain how it works.

Forecast Result Charts

Simulation Run Lightning Chart

Forecast Single Feature Burndown Lightning Chart (for 50 of 1000 simulations)

The lightning burndown chart shows the first fifty simulated project burn-downs. When explaining the process used to others, explain that this spreadsheet hypothetically complete the project 1,000 times given the estimates you provide. 85th percentile will be the date that 850 of the simulations completed on or before. Go onto explain that traditional methods would give a date closer to where only 500 of these simulated runs would have completed.

Single Feature Forecaster Histogram

Forecast Single Feature Result Histogram

The other chart displays the histogram of results, up to 20 segments maximum. This is similar to the results table, but we find people just want a date and find this histogram too technical to interpret. When multiple risks are entered into the risk tab, this histogram can get very complex. We will go into much deeper detail in a future post on how to capture project risk factors.


We use this same spreadsheet when doing complex analysis. For us its a great first step to getting to a reliable and defensible forecast outcome and allows us to play with the scope and risk factors to find a close solution. We have more complex tools, but sometimes this answers the main question asked – is it possible.

Feel free to send us ideas on improving it.

Again, Get the spreadsheet here: Single Feature Forecaster Spreadsheet (latest version)

Read More

Excel Unit Testing

Posted by in Featured, Reference |

Errors in formulas in complex spreadsheets can be hard to detect. After a few compound formulas (one formula depends on the output of another), the permutations can be large to enumerate. Traditional software development uses unit testing to help initially write code by initially failing, then passing as code is written, and to confirm code continues to work after re-factoring or seemingly unrelated changes are made. Unit testing spreadsheets seems to be just as valuable.

Our Multiple Feature Cut Line Forecaster spreadsheet is complex It doesn’t use any macros (VB code), it is all formula based. Its complexity is in the detailed formulas and the cascading of one result as an input to another formula. We needed to test we got this right.

We decided to do the lightest test possible and use Excel’s built-in Scenario Manager. The Scenario Manager allows values to be changed in tandem on a worksheet. We pre-built sets of inputs and tested the calculation results against a “human tested” result.

How it works:

1. We defined scenarios that set input fields to know states
2. We defined test rows that compare the current calculated results against an expected result we determined by hand
3. We show a pass or fail for each test row only when the correct scenario is shown

excel unit tests

We were happy with the result. Our tests found errors we had missed, and we occasionally make changes that impact the current tests. Its not ideal, but the effort was light for the confidence we have in our tools. Worth the effort.

Excel Challenges:

1. Scenarios are hard-coded to a limit of 32 input changes per scenario. We often had more. To solve this some of the tests need multiple scenarios to be applied sequentially to change all the values we needed. A pain, but not a show-stopper.

2. The scenarios can’t make changes across sheets. This means that our unit tests are on the bottom of the Forecast tab. We would like it elsewhere, but this limitation means it has to just be out of sight (rows 50 onwards, at least off the first screen unless you scroll down)

3. No test runner. We have an aversion to macro’s. This means we need scenarios to be shown using the Scenario Manager by hand. Changing the scenario will cause some tests to fail and others to pass. This looked messy. Our fix was to target the pass/fail indicator against a scenario. To achieve this each scenario sets a field value (1, 2, 3, 10, etc). Each test row indicated which scenarios that row targets. This way, we avoid showing an alarming fail when testing another scenario.


Read More

Single Feature Forecast Spreadsheet

Posted by in Featured, Reference, Tools |

We often build tools that help forecast or teach the concepts behind our statistical methods. Turning our internal tools into public property takes some time and tuning. The spreadsheet performs a Monte Carlo simulation to generate a delivery date forecast for a single feature. There are no macros, everything in this spreadsheet is based on formulas.

Features –

  • Given a start date, story count range estimate and throughput/velocity it gives date and probability forecasts
  • Throughput can be story count based or velocity based
  • If historical throughput/velocity data is available, it can be used instead of a simple range estimate
  • If risks are known, they can be used to forecast their impact – totally optional
  • Charts so you can see the analysis visually – just for teaching
  • NO MACROS – No security issues to have to worry about. We only do what Excel does in formulas and don’t share or send any data externally.

Get it here –

Throughput Forecaster.xlsx

(See all of our free tools here:

How it works –

Given the feature count estimate range, 500 hypothetical completions of this feature are simulated. These simulated trials are used to compute how likely a given delivery date is against others. This analysis correctly computes how the uncertainty in feature size performs against the uncertainty of delivery rate (historically or estimated).

We will be posting hands-on labs about its application in the future. For now, we just ant to make our internal tools available for use by the industry so that we can improve on the simplistic and flawed current methods of forecasting that plague the software development industry.


Read More

SimML Reference Documentation

Posted by in Featured, Reference |

OK, its about time but we have a draft document of our SimML specification. We have tried multiple formats and often struggled to keep the documentation in sync with our rapid releases. We do releases at least every month, and it was difficult to maintain.

We want your feedback on this format:

The latest is ALWAYS publically available on our GitHub resource account. And its part of our definition of done for all work that this is updated. Feel free to keep us honest!

Know that this is a work in progress. We have the model setup section documentated, but not the execution instructions. We use that section mainly internally, so its unlikely you edit it by hand, but we WILL give you that documentation over the next few months anyway.


Read More

Forecasting Error: Not Accounting For Scope Increase

Posted by in Featured, Reference |

Initial estimates of the amount of work for a project or idea lacks detail. Attempting to forecast using historical rates would be in error if:

1. The granularity of work breakdown differs from historical samples.

2. The project isn’t completely fleshed out as to what features are required

3. The project ignores system and operational issues. New environments, changes in current environments, security or performance audits and requirements.

If every project needed to be completely understood with every detail, then forecasting would take too long. The rate of failure in delivery of waterfall projects howed that even attempting to completely design and understand projects doesn’t improve delivery likelihood.

Tracking the increase in scope and the causes for previous projects allows projects at an idea level to forecast with some certainty of likely scope increase. The recommended technique is to keep clear records of the amount of total scope for each project, categorized by work-item type. Some categories we recommend are:

1. Split (straight split of known work)

2. Discovered Scope (scope found only after deling into the detail)

3. New Requirement (nothing to do with the original ideas, added features)

4. Adopted work (work the team took on but isn’t actually part of a project)

By tagging each backlog item with these tags, a growth-rate from an original amount of work can be computed. This adjustment can be applied to new ideas when quantitatively forecasted. These metrics are also good to put targets around. None of these items are bad by default, its just good to know where the scope increases are coming from and to manage/consider them when forecasting proposals.


Read More