Hacker News

vslira 15 hours ago

Hm, that’s a multinomial classification with a very high cardinality. It’s really weird it works. I’m sure it does as the author states, but for how many authors (out of the whole web) does this work?

dmd 13 hours ago

It worked on me, and I would be shocked if my blog (dmd.3e.org) has more than a dozen readers. I am stunned.

skeledrew 12 hours ago

It's not about the readers, just the fact that there's enough of a sample that it can use, with sufficient differentiation from other content.

dmd 12 hours ago

I’ve posted on average 3 things a year.

londons_explore 7 hours ago

There are ~8 billion people. Sounds big, but it's only about 2^33. I.e., if you can find 33 independent things about the text that each halve the number of possible writers, you have narrowed it down to 1 person.

Just a couple more things and you can accommodate some of your things being mistaken/wrong/uncertain too.
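
A quick sanity check of that arithmetic, as a minimal sketch. It assumes each feature splits the remaining candidates exactly in half and that the features are independent, which real stylistic cues will only approximate. The function name `halvings_needed` is made up for illustration.

```python
import math

def halvings_needed(population: int) -> int:
    """Number of perfect 50/50 splits needed to isolate one member."""
    return math.ceil(math.log2(population))

# Rough world population figure from the comment above.
population = 8_000_000_000

bits = math.log2(population)
print(f"log2({population:,}) = {bits:.2f} bits")
print("halvings needed:", halvings_needed(population))
```

Features that split less evenly than 50/50 carry less than one bit each, so in practice more than 33 would be needed.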

kelseyfrog 15 hours ago

Sure, the cardinality is high, but the model isn't using a uniform prior. What do you suppose the values of each of the terms are in P(Text sample | Kelsey Piper) * P(Kelsey Piper) / P(Text sample)?
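
To make the role of the prior concrete, here is a toy sketch of Bayes' rule over two candidate authors. Every number and name in it (`prolific_writer`, `obscure_blogger`, the likelihoods and priors) is invented for illustration and says nothing about what an actual model has learned.

```python
def posterior(likelihoods, priors):
    """Bayes' rule over a finite set of candidate authors.

    likelihoods[a] stands for P(text | a), priors[a] for P(a).
    """
    joint = {a: likelihoods[a] * priors[a] for a in likelihoods}
    evidence = sum(joint.values())  # P(text), summed over all candidates
    return {a: joint[a] / evidence for a in joint}

# Hypothetical case: the text fits both authors equally well.
likelihoods = {"prolific_writer": 0.01, "obscure_blogger": 0.01}

uniform_prior = {"prolific_writer": 0.5, "obscure_blogger": 0.5}
skewed_prior = {"prolific_writer": 0.9, "obscure_blogger": 0.1}

post_uniform = posterior(likelihoods, uniform_prior)
post_skewed = posterior(likelihoods, skewed_prior)

print("uniform prior:", post_uniform)
print("skewed prior: ", post_skewed)
```

With identical likelihoods, the posterior simply follows the prior, so an author who is heavily represented in the training data gets a head start before any stylistic evidence is weighed.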

astrange 14 hours ago

Maybe it just says all writing is Kelsey Piper.