I strongly agree with this, and particularly point 1. If you ask people to provide estimated ranges for answers that they are 90% confident in, people on average produce roughly 30% confidence intervals instead. Over 90% of people don't even get to 70% confidence intervals.
You can test yourself at https://blog.codinghorror.com/how-good-an-estimator-are-you/.
From link:
> Heaviest blue whale ever recorded
I don't think estimation errors regarding things outside of someone's area of familiarity say much.
You could ask a much "easier" question from the same topic area and still get terrible answers: "What percentage of blue whales are blue?" Or just "Are blue whales blue?"
Estimating something often encountered but uncounted seems like a better test. Like how many cars pass in front of my house every day. I could apply arithmetic, soft logic and intuition to that. But that would be a difficult question to grade, given it has no universal answer.
I have no familiarity with blue whales but I would guess they're 1--5 times the mass of lorries, which I guess weigh like 10--20 cars which I in turn estimate at 1.2--2 tonnes, so primitively 12--200 tonnes for a normal blue whale. This also aligns with it being at least twice as large as an elephant, something I estimate at 5 tonnes.
The question asks for the heaviest, which I think cannot be more than three times the normal weight, and probably no less than 1.3. That lands me at 15--600 tonnes using primitive arithmetic. The calculator in OP suggests 40--320.
The real value is apparently 170, but that doesn't really matter. The process of arriving at an interval that is as wide as necessary but no wider is the point.
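The chain of guesses above is interval arithmetic on positive quantities: multiply the low ends together and the high ends together. A minimal sketch, where the `Interval` class and the variable names are invented for illustration and the numbers are the ones guessed in this comment:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __mul__(self, other: "Interval") -> "Interval":
        # Only valid when both intervals are strictly positive,
        # which holds for masses and "times as heavy" factors.
        return Interval(self.lo * other.lo, self.hi * other.hi)

    def contains(self, value: float) -> bool:
        return self.lo <= value <= self.hi


# Guesses taken from the comment above (tonnes or plain ratios).
car_tonnes = Interval(1.2, 2.0)
cars_per_lorry = Interval(10, 20)
lorries_per_whale = Interval(1, 5)
heaviest_vs_typical = Interval(1.3, 3.0)

typical_whale = car_tonnes * cars_per_lorry * lorries_per_whale
heaviest_whale = typical_whale * heaviest_vs_typical

print(f"typical:  {typical_whale.lo:.0f} to {typical_whale.hi:.0f} tonnes")
print(f"heaviest: {heaviest_whale.lo:.1f} to {heaviest_whale.hi:.0f} tonnes")
print("contains 170 tonnes:", heaviest_whale.contains(170))
```

Multiplying the extremes like this assumes every guess can be at its worst at the same time, so the result is wider than a probabilistic combination would be. That is probably why the calculator in OP suggests a narrower range.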
Estimation is a skill that can be trained. It is a generic skill that does not rely on domain knowledge beyond some common sense.
I would say general knowledge in many domains may help with this as you can try and approximate to the nearest thing you know from that domain.
How you get good at being a generalist is the tricky part. My best bet is reading and doing a lot of trivia (I found crosswords somewhat effective at this, but far from efficient).
No, that has nothing to do with it. Trivia helps you narrow down an interval. It is not necessary to construct a correct interval, which can be of any width.
I am dying inside imagining someone who had to use crossword puzzles to learn how to read. There must be a better way to educate the masses!
I guess people didn't realise they are allowed to, and in fact are expected to, put very wide ranges for things they are not certain about.
So the context of the quiz is software estimation, where I assume it's an intentional parable of estimating something you haven't seen before. It's trying to demonstrate that your "5-7 days" estimate probably represents far more certainty than you intended.
For some of these, your answer could span orders of magnitude. E.g. my answer for the heaviest blue whale would probably be 5-500 tons because I don't have a good concept of things that weigh 500 tons. The important point is that I'm right around 9 times in 10, not that I had a precise estimate.
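Being "right around 9 times in 10" is something you can score directly: count how many of your intervals contain the true value. A rough sketch follows; `hit_rate` is a hypothetical helper, and every number except the blue whale row (5-500, true value 170 per this thread) is a made-up placeholder, not real quiz data.

```python
def hit_rate(intervals, truths):
    """Fraction of (low, high) intervals that contain the matching true value."""
    if len(intervals) != len(truths):
        raise ValueError("need one true value per interval")
    if not intervals:
        raise ValueError("need at least one interval")
    hits = sum(low <= truth <= high for (low, high), truth in zip(intervals, truths))
    return hits / len(intervals)


# First row: the blue whale guess from this comment.
# Remaining rows: placeholder values, purely illustrative.
my_intervals = [
    (5, 500), (10, 20), (100, 1000), (1, 3), (50, 60),
    (0, 10), (200, 400), (30, 90), (7, 8), (1000, 5000),
]
true_values = [170, 25, 450, 2, 55, 4, 390, 120, 7.5, 3000]

rate = hit_rate(my_intervals, true_values)
print(f"hit rate: {rate:.0%} (target for 90% intervals: 90%)")
```

A hit rate well below the stated confidence means the intervals were too narrow. One well above it, over many questions, means they were wider than they needed to be.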
I don't know, an estimate spanning three orders of magnitude doesn't seem useful.
To continue your example of 5-7 days, it would turn into an estimate of 5-700 days. So somewhere between a week and two years. And fair enough, whatever you're estimating will land somewhere in between. But how do I proceed from there with actual planning or budget?
> But how do I proceed from there with actual planning or budget?
You make up the number you wanted to hear in the first place that ostensibly works with the rest of the schedule. That’s why engineering estimates are so useless - it’s not that they’re inaccurate or unrealistic - it’s that if we insisted on giving them realistic estimates we’d get fired and replaced by someone else who is willing to appease management and just kick the can down the road a few more weeks.
Your question is akin to asking ‘how do I make the tail wag the dog?’
Your budget should be allocated for say 80% confidence (which the tool helpfully provides behind a switch) and your stakeholders must be on board with this. It shouldn’t be too hard to do since everyone has some experience with missed engineering deadlines. (Bezos would probably say 70% or even less.)
I mean it's no less useful than a more precise, but less certain estimate. It means you either need to do some work to improve your certainty (e.g. in the case of this quiz, allow spending more than 10 minutes or allow research) or prepare for the possibility that it's 700 days.
Edit: And by the way, given a large enough view, estimates like this can still be valuable, because when you add these estimates together the resulting probability distribution narrows considerably. E.g. at just 10 tasks of this size, you get a 95% CI of 245~460 per task. At 20, 275~430 per task.
Note that this is obviously reductive, as there's no way an estimate of 5-700 would imply a normal distribution centred at 352.5; it would be more like a logarithmic distribution where the mean is around 10 days. Additionally, this treats each task as independent, i.e. one estimate being at the high end wouldn't mean another one would be as well.
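The narrowing can be checked in a few lines. This is a minimal sketch under the same simplifications flagged above: tasks are independent and normally distributed, centred at 352.5. On top of that it assumes 5-700 is the central 90% interval of that normal, and it reports a 90% interval for the per-task average; both levels are my assumptions, as is the helper name `per_task_mean_interval`.

```python
from math import sqrt
from statistics import NormalDist


def per_task_mean_interval(lo, hi, n_tasks, input_level=0.90, output_level=0.90):
    """Central interval for the average of n_tasks independent, identical tasks.

    (lo, hi) is read as the central `input_level` interval of a normal
    distribution for a single task. This is a simplifying assumption.
    """
    if n_tasks < 1:
        raise ValueError("n_tasks must be at least 1")
    std_normal = NormalDist()
    z_in = std_normal.inv_cdf(0.5 + input_level / 2)
    z_out = std_normal.inv_cdf(0.5 + output_level / 2)
    mean = (lo + hi) / 2
    sigma = (hi - lo) / (2 * z_in)              # single-task standard deviation
    half_width = z_out * sigma / sqrt(n_tasks)  # shrinks as 1/sqrt(n)
    return mean - half_width, mean + half_width


for n in (1, 10, 20):
    low, high = per_task_mean_interval(5, 700, n)
    print(f"{n:>2} tasks: {low:.0f} to {high:.0f} days per task")
```

Under these assumptions the per-task interval comes out near 243 to 462 days for 10 tasks and near 275 to 430 for 20. Passing `output_level=0.95` widens both. Correlated tasks would narrow more slowly than the 1/sqrt(n) used here.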
It shouldn't matter how familiar you are with the question. If you're pretty familiar, give a narrow 90% credence interval. If you're unfamiliar, give a wide interval.