I have a copy of the Shorter Oxford English Dictionary from 1970, which I inherited. It is two massive volumes and is only "shorter" in comparison to the full dictionary, which is 12 volumes (more in modern editions).
My shorter OED contains 163,000 words (compared to the 600,000 words of the longer).
According to this site I know 71,000 words... Let's test that against the OED. I should have about a 43% chance of knowing a word picked at random (71,000 / 163,000).
In my totally scientific test (ha) I chose 50 words at random from the OED and discovered I knew 29 of them, for a score of 58%. That is just over two sigma from 43%, thus disproving the hypothesis.
I forgot what that was now, but it was a fun experiment.
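For anyone who wants to check the two-sigma arithmetic, here is a minimal sketch. It assumes the 50 words are independent draws and treats "I know 71,000 of the 163,000 words" as the null hypothesis; the function names are made up for illustration. It computes the normal-approximation z-score and, as a cross-check, the exact binomial upper tail.

```python
import math

def proportion_z(known, sample, p0):
    """z-score of the observed hit rate against the hypothesised rate p0."""
    p_hat = known / sample
    se = math.sqrt(p0 * (1 - p0) / sample)  # standard error under the null
    return (p_hat - p0) / se

def binom_upper_tail(known, sample, p0):
    """Exact P(X >= known) for X ~ Binomial(sample, p0)."""
    return sum(
        math.comb(sample, k) * p0**k * (1 - p0) ** (sample - k)
        for k in range(known, sample + 1)
    )

p0 = 71_000 / 163_000            # hypothesised chance of knowing a random word
z = proportion_z(29, 50, p0)     # 29 of 50 words known
tail = binom_upper_tail(29, 50, p0)
print(f"p0 = {p0:.3f}, z = {z:.2f}, exact upper tail = {tail:.3f}")
```

The z-score comes out at about 2.06, so "more than two sigma" holds, though only barely with a sample of 50.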
I also got something around 70-80k with 95/100 correct words (I don't know or use most of these words, but the later sections have a lot of words with Greek or Latin origin, which made them easy to guess). One of my wrong words was a misclick in the first section, which I think dragged down the estimate quite a lot. You may have done something similar. I assume they use a simple formula where early misses cost you a lot and late misses cost you very little.
You can't assume a Gaussian underlying distribution for word-knowing; it's known to be Zipfian. So you can't be doing ANOVAs or anything of that nature, because if you look up the variance of a Zipfian distribution, you get Nature and Reality giving you the middle finger.
I think you mean it's lognormal, at least if we're discussing native English speakers or comparing those with similar amounts of exposure to the language.
(The median English speaker almost certainly knows several thousand words, or word stems to avoid duplication. But the number who know all words in the tail is exceptionally small.)
No way is vocab size Zipfian. Word counts from a corpus follow Zipf's law, but not vocab sizes themselves.
Otherwise the most common vocab size would be equal to one.
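To make that last point concrete, a small sketch: a Zipf distribution puts probability proportional to k^(-s) on the value k, so it is strictly decreasing and its mode is always the smallest value. The exponent s = 1 and the cutoff of 1,000 are arbitrary choices for illustration, not anything measured.

```python
def zipf_pmf(n_max, s=1.0):
    """Truncated Zipf pmf on 1..n_max with exponent s (both assumed values)."""
    weights = [k ** -s for k in range(1, n_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]

pmf = zipf_pmf(1_000)
mode = pmf.index(max(pmf)) + 1   # +1 because the list starts at k = 1
print("most likely value under Zipf:", mode)
```

If vocabulary sizes were distributed this way, a vocabulary of one word would be the most common outcome, which is clearly not what we see.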
Not to mention, N=1
Neat way to validate.
Your method of sampling could be improved further, unfortunately at the expense of ease of use. If the dictionary were sorted by difficulty, you could use stratified sampling.
I comment on the related aspects here.
https://news.ycombinator.com/item?id=48599769
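A rough sketch of what the stratified version could look like, assuming the dictionary could somehow be split into difficulty bands of known size. The band sizes and hit counts below are invented purely for illustration (they only sum to the 163,000 headwords mentioned upthread), and the standard error ignores the finite population correction.

```python
import math

def stratified_estimate(strata):
    """Estimate total known words from (band_size, sampled, known) triples.

    Returns the estimated number of known words and its standard error.
    """
    total = 0.0
    variance = 0.0
    for size, sampled, known in strata:
        p = known / sampled                      # hit rate within this band
        total += size * p                        # scale up to the band size
        variance += size**2 * p * (1 - p) / sampled
    return total, math.sqrt(variance)

# Hypothetical bands: (words in band, words sampled, words known)
strata = [
    (20_000, 10, 10),   # common words
    (60_000, 20, 14),   # mid-difficulty words
    (83_000, 20, 5),    # rare words
]
estimate, std_err = stratified_estimate(strata)
print(f"estimated vocabulary: {estimate:.0f} +/- {std_err:.0f}")
```

The gain comes from spending most of the sample on the bands where the answer is uncertain; a band where you know everything (or nothing) contributes no variance, so it needs few samples.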