How data teams can get better at estimating lead times

To get better at estimating lead times, you need to change how you provide your estimates: identify the key risks and figure out how to test them early.

Estimating how long it will take to deliver an analysis is often very difficult, especially when projects are complex, novel and ad hoc.

I cover this topic in my book, but in short: the total time it takes to complete a task is made up of both the task duration and the lead time. A two-day analysis that waits three weeks for data access, for example, takes over three weeks to deliver, not two days.

An analyst reached out and asked if I had any tips for getting better at estimating lead times, given that longer-term projects can run into all sorts of roadblocks. Getting access to the data, data quality issues, weird coding bugs, interpreting results – just a few of the unknowns that can crop up.

The usual way to provide estimates (and the way people tend to want them) is to give an on-the-spot timeline for how long something will take. Maybe you’re able to ask a few more questions, but you’re still left with a lot of uncertainty.

The way to change this is to think about the key assumptions you’re being asked to make and how you can test them early. For example, let’s say you’re asked to do an analysis using a data source you’ve never worked with, and you’ll likely need a modelling technique you’re unfamiliar with.

Before kicking off a “one month” analysis (that turns into a six-month project), it’s worth taking a couple of days to assess the data quality and research the modelling technique.

Is the data standardized, documented and up to date? Is the modelling technique widely used, with plenty of reference material, elements you’re already familiar with, and loads of Stack Overflow solutions and examples?
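On the data side, even a quick throwaway script can answer most of these questions in an afternoon. Here’s a minimal sketch in base R, assuming the new source can be pulled into a CSV with a `created_at` timestamp column in ISO date format – the file name and column name are purely illustrative:

```r
# A quick-and-dirty data quality spike: a sketch, not a full profiling job.
# The file name and the `created_at` column are assumptions for illustration.
df <- read.csv("new_datasource_extract.csv", stringsAsFactors = FALSE)

# Freshness: if the most recent record is very old, that's an early red flag.
latest <- max(as.Date(df$created_at), na.rm = TRUE)
message("Most recent record: ", latest, " (", Sys.Date() - latest, " days old)")

# Completeness: high missingness in a column often hints at disparate source systems.
print(round(sort(colMeans(is.na(df)), decreasing = TRUE), 2))

# Consistency: distinct counts per column help spot free text masquerading as codes.
print(vapply(df, function(x) length(unique(x)), integer(1L)))
```

None of this replaces a proper data quality assessment, but stale records, heavy missingness and suspiciously chaotic columns will all show up here long before they derail a month of work.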

If those answers come back positive, the potential roadblocks pose very little risk, and you can give a more confident estimate of a shorter time frame.

What if the opposite happens? The data quality is all over the place? It looks like the data is being generated by disparate systems? The latest records are very old? There are lots of confusing variables and no documentation?

And the modelling technique you’ve found isn’t great for the type of data you have, even once it’s cleaned? You can only find obscure references to it, and the examples are all in software you’re not at all familiar with?

The major roadblocks are now very clear, and you can more confidently estimate a much longer timeline. You can think through the steps you’ll need to take to overcome them, and what the alternatives might be, ideally in consultation with the stakeholder.

Identify the key risks, test them early. 


Keep up to date with new Data posts and/or Big Book of R updates by signing up to my newsletter. Subscribers get a free copy of Project Management Fundamentals for Data Analysts worth $12.

Once you’ve subscribed, you’ll get a follow-up email with a link to your free copy.
