Hacker News

I would assume it's important to know what's in that training set too, because I already get reliable generation out of "niche" languages.

Is it code with lots of SQL injection vulnerabilities, written for a different domain than your own?

It's probably not a good idea to conflate quantity with quality.