Text2SQL was 75% on bird-bench 6 months ago. Now it's 80%. Humans are still at 90+%. We're not quite there yet. I suspect text-to-sql needs a lot of intermediate state and composition of abstractions, which vanilla attention is not great at.
Text-to-SQL is solved by having good UX and a reasonable team that’s in touch with the customers’ needs.
If users have to come up with novel queries so often that text-to-SQL is warranted, that's a failure of product design.
This 1000x. I’ve sat through several vendor demos of BI tools that have a chatbot and seen my PM go all starry-eyed because you can ask it “show me top x over the last week” and get a chart back. How an empty text box is easier to use than a UI with several filter drop-downs, I’ll never understand, and I suspect that the people impressed with this stuff don’t know either.
This is exactly it. AI is sniffing out the good data models from the bad. Easy to understand? AI can understand it too! Complex business mess with endless technical debt? Not so much.
But this is precisely why we're seeing startups build insane things fast while well-established companies are still questioning whether it's even worth it.
Historically, though, there were some iffy things about the text-to-SQL datasets.
People got good results on the test datasets, but the test datasets had errors, so the high performance was actually just the models being overfitted.
I don't remember where this was identified, but it was fairly recent, though before GPT-5.