This was an obvious next step. Most current products can, at best, restrict token prediction to valid JSON or a specific JSON schema. There's no reason that this should be the only grammar available for constrained output mode.
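To make the point concrete, here is a minimal Python sketch of the general mechanism, with a toy vocabulary and a hand-enumerated grammar standing in for a real tokenizer and incremental parser: before each step, the vocabulary is filtered down to tokens that keep the output a valid prefix of the target language, and nothing about it is specific to JSON.

    from typing import Callable, List

    # Toy vocabulary; a real system would use the model's tokenizer.
    VOCAB = ["SELECT", " *", " FROM", " users", ";", "{", "}"]

    def allowed_tokens(prefix: str, is_valid_prefix: Callable[[str], bool]) -> List[str]:
        # Keep only tokens whose addition leaves a valid prefix of the
        # target language; logits for all other tokens would be masked.
        return [t for t in VOCAB if is_valid_prefix(prefix + t)]

    # The "grammar" is a toy SQL fragment, enumerated for brevity; a real
    # implementation would use an incremental parser or automaton instead.
    SENTENCES = ["SELECT * FROM users;"]

    def is_sql_prefix(s: str) -> bool:
        return any(full.startswith(s) for full in SENTENCES)

    print(allowed_tokens("SELECT", is_sql_prefix))    # [' *']
    print(allowed_tokens("SELECT *", is_sql_prefix))  # [' FROM']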
The real challenge will be to make this detect and switch languages automatically. For example, a snippet of code could include a LaTeX formula in a comment and SQL in a string literal. There are many more such cases: a regex inside a shell script, and so on.
The obvious next step after that is backtracking. It's possible to emit a token that is valid but then allows no further valid completions; in other words, the model can paint itself into a corner. To my knowledge, no current online LLM service uses any kind of backtracking; they run in append ("forwards") mode only.
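For intuition, here is a hedged sketch of what such a decoder could look like (my own illustration, not taken from any of the papers linked below): a depth-first search over the model's ranked candidate tokens that undoes a choice whenever the resulting prefix admits no valid completion.

    from typing import Callable, List, Optional

    Token = str

    def decode_with_backtracking(
        propose: Callable[[List[Token]], List[Token]],   # hypothetical: model's candidates, best first
        is_valid_prefix: Callable[[List[Token]], bool],  # grammar prefix check
        is_complete: Callable[[List[Token]], bool],      # accepts finished outputs
        max_len: int = 64,
    ) -> Optional[List[Token]]:
        # Depth-first decoding: follow the model's ranked tokens, but undo
        # a choice (backtrack) when it leads into a corner.
        def dfs(prefix: List[Token]) -> Optional[List[Token]]:
            if is_complete(prefix):
                return prefix
            if len(prefix) >= max_len:
                return None  # too long: force the caller to backtrack
            for tok in propose(prefix):
                nxt = prefix + [tok]
                if is_valid_prefix(nxt):
                    out = dfs(nxt)
                    if out is not None:
                        return out
                # invalid token or dead-end subtree: try the next candidate
            return None  # every candidate failed, so this prefix is a corner
        return dfs([])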
SRLCG: Self-Rectified Large-Scale Code Generation with Multidimensional Chain-of-Thought and Dynamic Backtracking
https://arxiv.org/abs/2504.00532
IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking
https://arxiv.org/abs/2410.07295
ROCODE: Integrating Backtracking Mechanism and Program Analysis in Large Language Models for Code Generation
https://arxiv.org/abs/2411.07112v1
Another one: SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking https://arxiv.org/abs/2306.05426
There was also an HN thread: https://news.ycombinator.com/item?id=36425375
I believe Microsoft introduced a framework that did the sort of backtracking you're suggesting. I'm not sure how much traction it got.
The backtracking idea is interesting; could diffusion maybe help? At some point it turns into SAT solving.
SAT solving, I guess, because types encode proofs?
Re detecting and switching languages: you could run several constraint systems in parallel and switch as soon as one of them rejects the input while another still accepts it.
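A minimal sketch of that idea, assuming a hypothetical ConstraintSystem interface with one instance per language (JSON, SQL, LaTeX, ...):

    from typing import List, Protocol

    class ConstraintSystem(Protocol):
        # Hypothetical interface; accepts_prefix asks whether `text` is
        # still a valid prefix of this system's language.
        name: str
        def accepts_prefix(self, text: str) -> bool: ...

    def surviving_systems(text: str, systems: List[ConstraintSystem]) -> List[ConstraintSystem]:
        # Feed the same prefix to every system in parallel; the ones that
        # still accept it stay active. A "switch" happens when the current
        # system rejects the extended prefix while another one accepts it.
        return [s for s in systems if s.accepts_prefix(text)]

    def token_allowed(text: str, token: str, systems: List[ConstraintSystem]) -> bool:
        # A token may be emitted as long as at least one system survives.
        return len(surviving_systems(text + token, systems)) > 0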
Re backtracking: a core part of this paper is ensuring a prefix property, i.e. there is always a legitimate completion and the model cannot "corner" itself!
Research is still needed into which kinds of languages and language features allow this prefix property to be ensured.
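For regular languages at least, the prefix property is mechanically checkable: it holds exactly when every automaton state reachable from the start can still reach an accepting state. A toy Python sketch of that check (my own encoding, not the paper's):

    def has_prefix_property(transitions: dict, accepting: set, start) -> bool:
        # transitions maps state -> {symbol: next_state} (a toy DFA encoding).
        # Forward pass: every state reachable from the start state.
        reachable, stack = {start}, [start]
        while stack:
            state = stack.pop()
            for nxt in transitions.get(state, {}).values():
                if nxt not in reachable:
                    reachable.add(nxt)
                    stack.append(nxt)
        # Backward pass: every state from which an accepting state is reachable.
        can_finish, frontier = set(accepting), list(accepting)
        while frontier:
            target = frontier.pop()
            for state, edges in transitions.items():
                if target in edges.values() and state not in can_finish:
                    can_finish.add(state)
                    frontier.append(state)
        # The decoder can never corner itself iff no reachable state is a dead end.
        return reachable <= can_finish

    # One-level braces: q0 --"{"--> q1 --"}"--> q2 (accepting).
    dfa = {"q0": {"{": "q1"}, "q1": {"}": "q2"}}
    print(has_prefix_property(dfa, {"q2"}, "q0"))  # True
    # Add a trap: from q1, "x" leads to a state with no way out.
    dfa["q1"]["x"] = "dead"
    print(has_prefix_property(dfa, {"q2"}, "q0"))  # False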