Should I capture defect and un-planned work item data?


Problem in a nutshell: Historical data about delivery pace (throughput or velocity) contains work that isn’t part of the original backlog, making it a poor predictor of the future delivery pace of planned work. Should we capture just planned items, or defects and un-planned work as well? Yes, capture it ALL, but be selective about what data you use when forecasting.

Historical data is important for forecasting. We need to capture how the entire system has operated historically to avoid being misled by our own cognitive and wishful biases about how the future might unfold. The issue arises in how we choose to use that data when not all of it is applicable. Defect work and un-planned work are common reasons some data should be omitted when forecasting some measures.

A common issue, and the most common question I get asked when using historical data, is “should we count defects or interrupt (non-feature) work?” My answer is always yes: you should capture every bit of data you can if it’s passively attainable (meaning little or no extra effort). This does not mean you should use it all when forecasting.

When capturing item data, here are the ones I think are a must –

  1. The date the item was committed to being worked on. This is a more important date than the date it was created in any tool; it might be the date it was pulled into a committed queue, or the date it was put into a sprint backlog.
  2. The date the item was delivered (ideal) or completed ready for delivery (ok, and most common).
  3. Whether the item was planned or un-planned work. This varies; my most common categories are planned, un-planned, and defect.

These inputs are the three our free Throughput and Cycle Time Calculator uses, which you can download here (Excel spreadsheet, no macros).

The data inputs used in our throughput and cycle-time calculator.


This spreadsheet allows you to customize which values in the original system map to planned and un-planned work. It doesn’t yet handle both un-planned and defect categories at once, so it’s not perfect; it will be updated to do both.

The settings tab allows customizing which data categories are planned and unplanned.


Different types of items help estimate and forecast different things. Here is how we suggest using the data from our spreadsheet:

To forecast how long to deliver the rest of the feature or project backlog: Use the Planned item throughput adjusted by the observed split rate.

To estimate the planned item split rate: Divide the planned completed backlog count by the original planned backlog count (see our article on this here).

To estimate defect rates: Use the Defect items throughput rate. To calculate how many defects per planned item, divide the defect throughput by the planned throughput.

To estimate the interrupt or un-planned work rate or cycle times: Use the un-planned items throughput and cycle time.
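As a sketch of the guidance above, category-specific throughput samples and a defect rate can be computed from item records. The records, week numbers and category names here are made up for illustration; your tool’s export will differ.

```python
from collections import Counter

# Hypothetical item records: (week_completed, category), where category is one
# of the three the article suggests capturing: "planned", "unplanned", "defect".
items = [
    (1, "planned"), (1, "planned"), (1, "defect"),
    (2, "planned"), (2, "unplanned"), (2, "planned"),
    (3, "planned"), (3, "defect"), (3, "planned"), (3, "planned"),
]

def weekly_throughput(items, category):
    """Completed-item count per week for a single category."""
    counts = Counter(week for week, cat in items if cat == category)
    last_week = max(week for week, _ in items)
    # Include zero-throughput weeks so the samples aren't biased upward.
    return [counts.get(w, 0) for w in range(1, last_week + 1)]

planned = weekly_throughput(items, "planned")   # use ONLY these to forecast planned work
defects = weekly_throughput(items, "defect")

# Defects per planned item = defect throughput / planned throughput.
defect_rate = sum(defects) / sum(planned)
print(planned, defects, defect_rate)
```

The point of separating the samples is that a forecast of planned work draws only from `planned`, while `defect_rate` answers the “how many defects per planned item” question separately.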

A different throughput (items per week/sprint) shown for planned and un-planned work. Forecasting planned work? Use the planned throughput!


Wanting to give a percentile SLA for unplanned work? Use the un-planned cycle time; it may be longer or shorter than for planned items, never the same!


Knowing what percentage of your work is planned or un-planned helps see where work is coming from over time. Confirming the mix is what you expected helps manage the delivery system.


It’s important to understand how different items are ordered and prioritized by the teams through intentional policy. Making sure everyone knows why (and when) some types of work should be accelerated keeps the right work finishing earlier. Keep collection as light as possible, and make the team invested in proper classification and sign-off of the key dates by making the data accessible for process improvement and discussion. A good outcome of these discussions is a proposed change in policy to keep the most important items completing when needed.

Get the Throughput and Cycle Time Calculator spreadsheet we used in this article, which you can download here (Excel spreadsheet, no macros).

(It’s free and uses no macros, so it’s generally safe. Please consider answering the survey and sending us anonymous data so that we have a wider set of data to share with researchers.)


Dealing with the Close-out phase – When delivery is held hostage by a few un-finished items


Problem in a nutshell: During development of a feature or project, work can often (although not always) be delivered faster with more people or teams. At some point in a project, most often near the very end, a few defects or missing features remain unfinished to a releasable quality. These items block the release, but CANNOT be accelerated by adding more resources or cutting scope (they are part of the minimum viable set of features). To forecast the delivery date of this phase of a project or feature, you must consider the longest individual item forecast, not the backlog-driven probabilistic forecast.

Near the end of a delivery cycle, work is blocked by a few remaining items, either defects or agreed necessary completion of features. Much of the delivery system will appear idle at this time, but these items don’t take less time with added people. They are staffed to the necessary level and all viable ideas for accelerating them have been implemented. I call this phase of a delivery “Close-out.” It’s where how much work remains matters less than how long the longest single remaining item will take.

I find probabilistic forecasts (eight out of ten of mine) reliably predict a completion date EXCLUDING this phase. When I’m wrong, I’m wrong because I’ve not adequately managed or modeled this part of a project or feature delivery. My only mitigation is to minimize or avoid this phase by ensuring done means done, and by putting effort into excluding risky outcomes earlier in the project. It’s a significant issue, especially for large installed applications and user bases where regular releases still aren’t as common as they should be.

How to tell you are entering the Close-out Phase

The first step in understanding the close-out phase is recognizing when you are entering it. I look for a point where work-in-progress for the entire team or organization decreases below its 75th percentile. A key point of close-out is that the number of people who can work on the remaining items is at saturation, so the remaining people can’t help and sit idle. In reality, it’s not that simple to see. Idle people aren’t tolerated (although they should be, but that’s another blog post); they move onto the next release or do technical debt removal or needed refactoring. This is healthy, but not pertinent to the current deliverable. I exclude these and look for the actual work in progress at an item or person level for the deliverable held hostage. When this is down 25% from its highest, I assume that if 1 in 4 people are doing other stuff, the system is saying we are time-based, not scale-based, in throughput. I use this 1-in-4 signal to manage and mitigate future close-out risk.
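The WIP signal described above can be sketched in a few lines. The daily WIP counts here are hypothetical; the 25%-below-peak threshold is the one from the text.

```python
# Daily work-in-progress counts for the deliverable held hostage (illustrative).
wip_by_day = [8, 9, 10, 12, 12, 11, 9, 8, 6, 4, 3]

def closeout_start(wip_by_day, drop=0.25):
    """Return the first day index where WIP falls below (1 - drop) of the
    highest WIP seen so far, or None if it never does."""
    peak = 0
    for day, wip in enumerate(wip_by_day):
        peak = max(peak, wip)
        if wip < peak * (1 - drop):
            return day     # WIP is 25%+ below its peak: close-out has begun
    return None

print(closeout_start(wip_by_day))
```

From that day onward, the forecast should switch from a backlog-driven model to the longest-single-item view the article describes.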

How to manage the Close-out phase

I capture a list of all of the remaining items blocking release. Using the ¼ of the people NOT directly involved in the close-out items but still allocated to the project, I give them the mission of brainstorming items that may still crop up (to avoid worsening the problem), and of being available on request for immediate code reviews and testing of the close-out items. I also collect ideas as to how these items may have been detected earlier to minimize this impact. Ideally, this brainstorming should be constantly performed by all team members.

Close-out item causes and mitigation

Forecasting this phase in advance is difficult; the best process minimizes or avoids these close-out items by looking for and dealing with them earlier. When there is no data for the proposed project, I look at other projects this team or organization has done.

I particularly look for external players who can block a release. My most common reasons are –

  1. External parties have a final go or no-go decision, and they give late feedback
    • Know who has the final go and no-go decision – if it’s external, incentivize them to help earlier
    • Get sign-off as early as possible; give these people previews and hand-hold them
    • Learn how they will test and decide the system is “good,” for example, get them to supply you with THEIR test data and cases during development
  2. Performance testing is performed at the very end and the feature fails required performance.
    • Agree what performance impact is acceptable for the new feature or project relative to the prior release
    • Build enough of the system to do performance testing early, and retrofit changes earlier if it fails
  3. Edge cases: browser versions (Internet Explorer in the old days), operating system updates (Android and iOS particularly)
    • Learn what browsers and operating systems are essential. Ask sales and marketing what the top 10 customers use and make sure you do these!
  4. Localization and Internationalization
    • Late changes to text mean late responses to language translations. Get FINAL text signed off early.
    • Don’t wait until the end of the project for someone to check German number and date formats!
  5. Defects
    • Keep the defect count low. Keep things at a known point of stability, and encourage that point to be releasable (thanks Michael Tardiff for this terminology)
    • Rigorously get defects discovered. Schedule bug bash days to shake out defects early. Celebrate finding defects, and then get them fixed. The biggest risk is that you just think you are at a known stable and releasable point.

Sitting down with a team early, looking for items such as these, and avoiding them is the only way to minimize the final impact and allow forecasting to be reliable. Do brainstorming sessions specifically on these items often by asking the simple question “What could stand between us and release of this product or feature tomorrow?” – it’s my standard stand-up question, and the first question I have for anybody I meet in the hallway.



Calculating Work Split Rate – Ground-Speed vs Air-Speed


Problem in a nutshell: Work is often split before being developed by a team. The count of work completed is post-split; the count of work still in the backlog is pre-split. If the rate of completion (or velocity) doesn’t account for this split rate, 2x or 3x errors often occur in any forecast using this data. Put simply, forecasts not accounting for split rate predict completion early by 2x or 3x – a major miss. (The actual error will depend on how granular your original backlog is, but we see split rate averages of 2.5.)

I see this issue in just about every tool vendor’s attempt at Monte Carlo forecasting using data. They use the throughput rate computed as the count of items completed, and run a Monte Carlo forecast based on the count of items remaining in the backlog. This is a serious error. The two counts are essentially different units of measure.

Definition: Split Rate – The ratio by which backlog work grows whilst being developed. Expressed as a low-estimate to high-estimate range. For example, 2 means that for each item in the original backlog, there are two in the completed item list as counted throughput.

Rough analogy: Pilots are aware that their speed over the ground is impacted by head-wind (or tail-wind). They can be flying at 100 knots indicated air-speed, and with no headwind they would be traveling 100 knots over the ground. But if they were traveling into a 25 knot headwind, they would only travel 75 knots over the ground even though their instruments say 100 knots air-speed. If you are a passenger on this plane, hope they added 25% more fuel. Work items splitting in projects suffer the same issue. We often need to use more fuel (do more) in order to deliver all of our planned work (I’m inventing “delivery speed” to represent this). Our air-speed might read high because we split items into two or three before working on them (we record 2 units of air-speed for every 1 unit of delivery speed), or we have defect rework (we make no delivery progress, but we count effort as air-speed). To be able to forecast a project, we need to know the ratio of air-speed to delivery progress. We call this the work item split rate.

Estimating the Split Rate Distribution

Splitting is anything that makes a single item in the backlog become two or more items in the completed list. It won’t be the same for every item. Some items won’t be split at all, and some will be split many times. For reliable forecasting, it’s just as important to estimate the range of split rate as it is the range of story counts. Using an average will cause the same Flaw of Averages errors for split rate as for any other random input into a model, so our tools use a probability range, most often a simple uniform distribution between a low and high estimate.
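A minimal sketch of a split-rate-aware Monte Carlo forecast, with illustrative throughput samples, backlog size and split-rate range (none of these numbers come from our spreadsheet). Sampling the split rate per trial from a uniform range, rather than plugging in its average, is what avoids the Flaw of Averages.

```python
import random

random.seed(7)  # seeded only so the sketch is repeatable

throughput_samples = [3, 5, 2, 4, 6, 3, 4]   # completed (post-split) items/week
backlog = 40                                  # remaining (pre-split) items
split_low, split_high = 1.0, 3.0              # uniform split-rate range

def weeks_to_finish(trials=10_000):
    results = []
    for _ in range(trials):
        # Sample one split rate per trial: the pre-split backlog grows
        # into this many post-split items before it is all delivered.
        remaining = backlog * random.uniform(split_low, split_high)
        weeks = 0
        while remaining > 0:
            remaining -= random.choice(throughput_samples)
            weeks += 1
        results.append(weeks)
    return sorted(results)

results = weeks_to_finish()
p85 = results[int(0.85 * len(results)) - 1]
print(p85)   # 85th percentile forecast, in weeks
```

Running the same simulation without the split-rate multiplier gives the 2x–3x-early forecasts the problem statement warns about.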

Consider what factors will cause splitting in your context. Here is my list of “Things that cause splitting or scope growth” –

  1. Defects. Often a defect is reported while doing the work. If defects are counted in throughput or velocity, then they need to be compensated for by increasing the split rate. My suggestion is to NOT count defects in throughput or velocity records. They still occur, but our rate of progress is then pure “delivered value work.”
  2. Splitting. Work, when looked at by the team, gets split into multiple smaller parts. This won’t happen for small items in the backlog, but it will happen with the bigger items. The distribution of higher split rates should correlate in some way with work item size estimates (mostly). I’m not saying estimate, but a quick Small, Medium or Large might help identify how many items might be split.
  3. Discovered Work. By doing work, new work is often discovered. This growth will often be higher at the beginning of a project and taper off later. Don’t be alarmed if it’s huge to begin with. My rule of thumb is that it will decrease by half at the halfway point of the project, and halve again by the last third. I measure for the first half of the project by recording what the initial backlog count was at the start of the project, and again at the halfway point (half the original scope completed). The ratio will be equal to (count in the completed list – defects) / original count.

At the start of a project where no data exists, you still need an estimate to do planning. My rule of thumb without any data is a uniform range of 1 to 3. Here is my process to confirm this is close enough until real data flows –

  1. One is my go-to low bound. I know that some items will not be split at all. If it’s a major investment, I get the team to estimate 10 items from the backlog at random. I expect 3 to 4 of them to be 1-point stories or “small.” If it’s fewer than that (zero, one or two), I set my low-bound split rate estimate to 2.
  2. Three is my go-to high bound. Again, if the investment is major, I look at the 10 random estimates I got the team to do and expect that 3 to 4 items will be 8+ points or “large.” If more are large, I bump up my high split rate to 4 or 5. Often this has significant forecast impact, so I need to confirm: I pick a few of the “large” stories, get the team to split them, and take the average of the test splits.

If done well, this takes less than one hour to get a solid grounding of the likely split rate. Start with the base rate of 1 to 3, and adjust based on your context.

When real data starts flowing, pay particular attention to splits. Look to make your estimate range match actuals. Beware of the early outliers though. Sometimes the early work is the most technically risky, and high splits are common. Exclude anything that happens just once in every ten items.

Also beware of the “things” that make splitting look bigger than it is –

  1. Some work is abandoned. After the team splits the work, a story or two are left at very low priority and shipping occurs without them. I’ve seen this as high as one item for every piece of original backlog. If I see this, I drop 1 from the high estimate.
  2. Minor defects where the risk of fix is higher than leaving them in place.
  3. Carried-over stories. Sometimes work is closed at the end of a sprint and a new story created. This is an important thing to consider properly, but make sure you realize it’s happening. If forecasting using cycle time rather than throughput, this is already factored into the model, so counting these would double-count.

Split rate is a complex beast to consider. We have included it as an input to our free spreadsheet forecasting tools (Single Feature Forecaster: I want to forecast how long a single feature may take, and Multiple Feature Cut Line Forecaster: I want to forecast multiple features at once) and see that many vendors in the industry haven’t considered this issue, to the peril of their customers.




Time to Productive: Estimating New Staff On-boarding & Training Impact


Problem in a nutshell: New employees aren’t immediately trained to do their work in your context independently. It takes time for them to ramp up towards independence. Until then, they absorb other team members’ time and have a negative impact on team performance overall. Estimating when new team members will be a net positive on project delivery performance is an important part of planning team size and how fast progress will proceed after they acquire the necessary knowledge.

I often find that teams and organizations don’t understand how fast new hires will be absorbed into a team. Without a sense of time to productive, it’s hard to predict how team size relates to estimated delivery impact.

The only certainties in new hires are –

  1. It’s unlikely (unless it’s really simple and common work) that new hires hit the ground with an immediately net positive impact on the team
  2. It’s likely that during ramp-up, the senior teacher-level members of the team will be adversely impacted
  3. It’s likely that the time to ramp up a new hire will be different for each skillset required

This post explains how to estimate and research the questions surrounding staff skill risk and on-boarding effort.

Skill Capability Surveys and Team Planning

The general process to determine risk and readiness for a team to deliver is –

  1. Capture a list of specific skills a team needs to perform
  2. Survey the current staff to determine what skill level (see next section) they have for each skill
  3. Solve immediate risks where there are zero or just one person who can do a skill. Train or hire urgently.
  4. Balance the demand and supply of people who can do that skill
  5. Plan how teams can be split and grown into more teams to increase growth

It’s important to know what skills are needed for a team or organization. A skill is any particular knowledge (or access) people need to do their jobs. Coding is one skill, but this might have some particular sub-skills: Java, Angular.js, CSS, for example. If people need a different set of tools and knowledge, that is a specific skill. It’s also important to know how many people can mentor and teach others, and for this we need to look at the skill level of our current staff.

Staff Skill Levels – Not how good they are, how well they teach others!

I stay away from how good someone is at a particular job from a performance perspective. I want to know how good they are at teaching others. Some people seem concerned with productivity and look for coders who are 10x the productivity of others, but I’ve not seen this be significant from a team throughput perspective. Even if one part of a development and delivery process gets done 10x “faster,” it’s unlikely to yield a significant impact on total system throughput, as work hits a constraint and sits idle (rushing into a waiting room). Forecasting involves taking a wider systems view and looking for gaps in skillsets that will become system constraints. It’s more important from a forecasting perspective to look for gaps and constraints rather than “excellence”; the constraints are where improvement opportunities are aplenty, and removing them is exponentially powerful (rather than 10x being a linear improvement).

I capture three levels of skill ability that help plan teams and see risks. Capture how many people you have for every required skill –

  1. I can teach others (Teachers)
  2. I can do independently, and know when to ask questions (Do’ers)
  3. I’ve got the ability to do the work if taught (Learners)

Thinking of teams in this way helps ensure that you aren’t constrained by people who can do the work independently (Do’ers) or that you have an absence of people who can mentor, teach and help team members when that skill is needed (Teachers). New hires enter the team at level 3, Learners. We need to ensure we have a teacher available to bring them up to Do’er level, and we need to make sure we never have just a single Do’er, to avoid a single point of failure that brings the system to no flow if that person leaves, goes on vacation, or is busy doing other work.
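A hypothetical capability matrix and the two risk checks above might look like this in code; the skills and head-counts are invented for illustration.

```python
# Skill -> head-count at each of the three levels described above.
matrix = {
    "Java":       {"teachers": 2, "doers": 3, "learners": 1},
    "Angular.js": {"teachers": 0, "doers": 2, "learners": 2},
    "CSS":        {"teachers": 1, "doers": 1, "learners": 0},
}

def skill_risks(matrix):
    """Flag skills with no Teacher (no one to ramp up new hires) or
    fewer than two Do'ers (a single point of failure)."""
    risks = []
    for skill, levels in matrix.items():
        if levels["teachers"] == 0:
            risks.append((skill, "no teacher to on-board new hires"))
        if levels["doers"] < 2:
            risks.append((skill, "single point of failure: fewer than two Do'ers"))
    return risks

print(skill_risks(matrix))
```

Here Angular.js has Do’ers but no Teacher, so new hires for that skill can’t be ramped up in-team, and CSS rests on a single Do’er.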

Time to Productive & Maximum Teach Load

Each skillset wanted in a new hire will take time to develop. Someone will be needed to mentor and teach the new employee how to do that skill (or more commonly, how that skill is done “here”). During this training time, expect the teacher-level employee’s productivity to decline. This is a careful trade, and the subject of the legendary “Mythical Man Month,” where Fred Brooks explains clearly why adding people to a late project makes it later. Absorbing the current team’s time to get new members up to an independent level takes effort.

What I do is keep two estimates up to date for each skillset I hire for –

  1. Time to Productive: How long from hire to the independent ability to work mostly alone. This isn’t the time to replace the teacher-level members; it’s the time until they know when to ask a question!
  2. Maximum Teach Load: How many new team members can be on-boarded at the same time. Going above this adversely impacts the time to productive. Most often this number is one. When growing a team, it is often faster to stagger the hiring so that the teachers aren’t saturated. Ideally, teaching should take < 50% of their time.

I use these estimates when forecasting. I don’t assume any benefit from new team hires until “Date Hired + Time to Productive,” and I might even degrade a team’s throughput estimates or historical performance during these on-boarding periods if the Maximum Teach Load is exceeded. I try to get the teams to slow down hiring first though – ramp up hiring, don’t do it in one shot. I sometimes even loan the new hire to another team and get them to teach the new team member if that team has available teacher-level staff. This can accelerate the hiring plan.
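That forecasting adjustment can be sketched with assumed numbers: the 8-week time to productive, the 80% ramp-up drag on team throughput, and the post-ramp uplift below are all illustrative, not recommendations.

```python
import datetime as dt

TIME_TO_PRODUCTIVE = dt.timedelta(weeks=8)   # assumed ramp-up for this skillset
RAMP_UP_DRAG = 0.8                           # assumed: team runs at 80% while teaching
BASE_THROUGHPUT = 5.0                        # items/week before the hire

hire_date = dt.date(2024, 1, 8)

def throughput_on(week_start):
    """Throughput estimate to use for the week starting on `week_start`."""
    productive_from = hire_date + TIME_TO_PRODUCTIVE
    if week_start < hire_date:
        return BASE_THROUGHPUT                    # no hire yet
    if week_start < productive_from:
        return BASE_THROUGHPUT * RAMP_UP_DRAG     # ramping: teacher time absorbed
    return BASE_THROUGHPUT + 1.0                  # assumed uplift once independent

print(throughput_on(dt.date(2024, 1, 1)))   # before hire
print(throughput_on(dt.date(2024, 2, 5)))   # ramping up
print(throughput_on(dt.date(2024, 4, 1)))   # productive
```

The key property is the dip before the gain: during ramp-up the team forecast should be worse than before the hire, not better.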

Capability Matrix Tool

Capability Matrix: I want to analyze skillset risk of a team

Our capability matrix spreadsheet helps analyze and survey what skills are on hand, and how many independent-level people and teachers are available. It’s a totally configurable survey form that lets current team members self-assess their level of expertise on each skillset, and also documents who is willing to “also” learn another skill to a higher level of expertise. It’s NOT the goal to have everyone at a “Teacher” level – it’s the goal to know where you have gaps in skillset coverage, and to plan your options to solve these gaps.

The capability matrix works for me at a multi-team organization level where I’m looking at what team does what. I first confirm that we don’t have a team constraint, where there is just one team who will do all of one type of work, or no team doing one type of required skill/work. Once I stabilize this, I move onto the team-level skillsets and formulate a stabilization plan. This plan is what skills we need to develop internally or hire, and more importantly, a timeline that allows the organization and teams to absorb this ramp-up. Too fast and we depress delivery capability too deeply; too slow and we suffer a high risk of failure (too few people who can do one type of job or skill). It’s a balancing act.

Get the spreadsheet here: Capability Matrix: I want to analyze skillset risk of a team

staff capability survey

Staff capability survey. Old-school survey that’s quick to capture the necessary details.

Staff capability analysis

The main capability survey data is captured in this worksheet. For each team member or external team, a level of current expertise is captured.

Staff capability guidance

A guidance worksheet shows our prioritization and tips for solving staff skillset risk.


Nailing Down a Start Date


Project start dates are assumed to be an easy input. I often see too little thinking go into this date. It’s actually one of the most important inputs, because all other values are based on this date. A day off here makes the forecast at least a day wrong, an error that can’t be recovered without creating team stress. The most common reasons this date turns out incorrect (different than planned) are –

  1. It’s a date based on the “planned” finish of another project, and that project goes long
  2. The team intended to start the new work is still performing release activities or stabilization of a production tool
  3. People are exhausted after a hard slog on a prior project
  4. The team doesn’t exist yet, or lacks the full set of required skills
  5. They don’t have the tools or environments to begin development
  6. The project isn’t approved to go ahead yet
  7. The team isn’t trained in the skills it needs

This is my shortlist. These reasons have little to do with this project or feature, and a lot to do with what else is happening inside the company. To get this date right, you have to boldly ask difficult questions of other teams. You have to look at how similar projects have started and assume you will fare no better without taking a different approach.

Estimate this date to the best of your knowledge, checking whether the following conditions are met (make your own list and tick them off as you go) –

  1. Do I have a core team with the skills required?
  2. How long will it take to interview, hire and onboard new staff?
  3. Do we have physical seating space for the team?
  4. How long does it take to get access to critical IT resources (database, websites, tools)?
  5. What software and tools are needed for them to do their job, do we have it on order?
  6. What other projects or production issues might my team be “borrowed for”?

You get the idea. Look for things that will delay the team hitting its delivery stride. Either set the date past that estimate, or assume a slower throughput in the early days. But most importantly – don’t just use the “desirable” start date for forecasting.



Monte Carlo Forecasting Exercises


Monte Carlo forecasting is simpler than most people believe. To prove this we teach the basics using paper tutorials and dice. We make these tutorials publicly available for you to use to teach others within your company (or at home with your kids).  We welcome all feedback on what works and what doesn’t.

Basic Monte Carlo Forecasting Tutorial

Get the Basic Monte Carlo Forecasting Tutorial here.

“Discover what Monte Carlo forecasting is by performing it by hand. This exercise simulates completing a project many times and plots the outcomes. Perform 9 more trials. Each trial involves filling all rows in a column until the remaining work count reaches zero.”

This tutorial is the simplest burn-down style Monte Carlo forecast. It simulates 10 trials, starting with 20 stories being completed at a rate determined by the roll of a six-sided die. The burndowns are plotted on a chart to demonstrate that each trial takes a different route to completion.
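The same paper exercise can be reproduced in a few lines of code (seeded only so the run is repeatable):

```python
import random

random.seed(1)

def trial(stories=20):
    """One burn-down trial: 20 stories, each sprint completes a die roll's worth."""
    burndown = [stories]
    while burndown[-1] > 0:
        burndown.append(max(0, burndown[-1] - random.randint(1, 6)))
    return burndown

trials = [trial() for _ in range(10)]          # the tutorial's 10 trials
durations = [len(t) - 1 for t in trials]       # sprints to reach zero
print(durations)
```

Plotting the ten `trials` lists reproduces the tutorial’s chart: every burndown reaches zero by a different route and in a different number of sprints.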

Basic Monte Carlo Exercise

Perform ten Monte Carlo forecast trials and then chart the results.

Basic Monte Carlo Chart Sheet

Chart the burndown results of ten Monte Carlo forecast trials

Throughput Monte Carlo Forecasting Tutorial

Get the Throughput Monte Carlo Simulation Tutorial here.

This is a more advanced tutorial. It has extensive instructions on its opening page, and a completed example as the last page. We use this in our training courses, and often play this with all staff during our forecasting engagements. It gets students to perform 3 of 10 simulation trials (7 are already performed) using historical throughput samples (given to you). It walks the students through the mathematics of computing percentiles on trial results to help people understand how to interpret the 85th percentile, for example.
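The percentile step works like this in code; the historical throughput samples and backlog size are illustrative, not from the tutorial handout.

```python
import random

random.seed(3)

historical_throughput = [2, 4, 3, 5, 1, 4]   # items/sprint (illustrative samples)
remaining = 30                                # items left in the backlog

def one_trial():
    """Burn down the backlog using randomly drawn historical throughput samples."""
    sprints, left = 0, remaining
    while left > 0:
        left -= random.choice(historical_throughput)
        sprints += 1
    return sprints

durations = sorted(one_trial() for _ in range(1000))
# The Nth percentile: N% of simulated trials finished in this many sprints or fewer.
p50 = durations[int(0.50 * len(durations)) - 1]
p85 = durations[int(0.85 * len(durations)) - 1]
print(p50, p85)
```

Quoting the 85th percentile rather than the 50th is what turns the simulation into a commitment with a stated chance of being met.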

Throughput Monte Carlo Forecasting Tutorial


There are many teaching moments during this tutorial. The most common is why all this “complexity” is necessary. We answer that in the document –

“Historical throughput data for teams measures delivery rate for a wide portion of the development system (the wider the better). Team throughput per week accounts for delays; for example waiting time, impediments, staff availability, interruptions and un-recorded work. The impact of these delays is more significant to a forecast than the hands-on time alone. This is a reason developer estimates are unreliable when forecasting projects, they don’t account for delays and system dynamics. In a stable system (e.g. the team isn’t blown-up), throughput will be a good predictor of future delivery rate even with large item size variability.”

We will add more of these over time as we learn how to teach the methods we use and why they give superior results to simply dividing an average amount of work remaining by an average rate of work delivery.

