The bug in that paper is actually the very buffer overflow bug I was referring to. Given that Jon himself made that error in the "definitive" implementation, it seems unlikely that he would have spotted it in the 10% of implementations he considered correct. Under Google's stricter criteria, it seems likely to me that not a single person got the search implementation truly correct.
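
For anyone who has not seen it, here is a rough sketch of what I mean. I am assuming the bug in question is the well-known midpoint overflow in binary search, where `low + high` exceeds the largest 32-bit signed integer and the resulting index falls outside the array. Python integers never overflow, so the hypothetical helper `wrap_int32` below only simulates fixed-width wraparound; the function names and sample values are mine, not from the paper.

```python
def wrap_int32(x: int) -> int:
    """Simulate 32-bit two's-complement wraparound (illustrative helper)."""
    return (x + 2**31) % 2**32 - 2**31


def naive_mid(low: int, high: int) -> int:
    # The sum is formed first, so with 32-bit ints it can wrap to a negative value.
    return wrap_int32(low + high) // 2


def safe_mid(low: int, high: int) -> int:
    # high - low never exceeds high, so no intermediate value overflows
    # as long as 0 <= low <= high.
    return wrap_int32(low + wrap_int32(high - low) // 2)


if __name__ == "__main__":
    low, high = 1_500_000_000, 2_000_000_000  # both are valid 32-bit indices
    print(naive_mid(low, high))  # negative: unusable as an array index
    print(safe_mid(low, high))   # lies between low and high
```

For small arrays the two versions agree, which is presumably why the error is so easy to miss in testing.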