Hacker News

I wouldn't say it sucks. You just need to keep training it for as long as needed. You can use adversarial techniques to generate new lines of play, and you can feed the winning human strategies back into training to improve it further. Hopefully we'll find better approaches, but this one is extremely successful and far from sucking.

Sure, Go is not solved yet. But RL can keep creeping toward that asymptote for as long as we want.

The funny part is that this applies to people too. Masters don't like to play low-ranked people because they're unpredictable and the potential Elo loss is not worth the risk. (Which does raise questions about how we really rank people.)
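
To put rough numbers on that risk, here is a minimal sketch using the standard Elo expected-score formula. The ratings (2400 vs 1800) and the K-factor of 32 are assumptions picked for illustration, not anything from a real rating list, and the function names are made up for this example.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rating_change(rating_a: float, rating_b: float, score_a: float, k: float = 32.0) -> float:
    """Rating change for A after one game; score_a is 1 for a win, 0 for a loss.

    k=32 is an assumed K-factor; real federations use different values.
    """
    return k * (score_a - expected_score(rating_a, rating_b))


# Hypothetical ratings, chosen only to illustrate the asymmetry.
master, novice = 2400.0, 1800.0

gain_if_win = rating_change(master, novice, 1.0)    # small positive number
loss_if_lose = rating_change(master, novice, 0.0)   # large negative number

print(f"master wins:  {gain_if_win:+.2f}")
print(f"master loses: {loss_if_lose:+.2f}")
```

Under these assumptions the master gains about one point for a win and drops about 31 for a loss, so even a small chance of an upset makes the game a bad trade for them.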