Hacker News

noperator 18 hours ago

Having some success while testing this model out as a replacement for GPT-5 nano in source code security review. Running on RTX 3090 (24 GB VRAM) via vLLM. It's not great on structured output (as noted in the model card) but I'm working around that in my harness.

dummydummy1234 18 hours ago

Can't you just force it to do structured output via constrained generation?

noperator 5 hours ago

Yes, I did end up figuring out a clean way to allow normal reasoning inside <think> and then force JSON _after_ the closing </think>. Example here: https://gist.github.com/noperator/6c711ab19027ea8056442df839...
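For readers who can't open the gist, here is a minimal sketch of the general two-phase idea (free reasoning first, constrained JSON after the closing tag). It is not the code from the gist. The names `think_then_json`, `generate_free`, and `generate_json` are hypothetical; the two callables stand in for whatever your inference backend offers for "decode until a stop string" and "decode under a JSON constraint".

```python
import json
from typing import Callable, Tuple

THINK_CLOSE = "</think>"


def think_then_json(
    prompt: str,
    generate_free: Callable[[str, str], str],  # hypothetical: (prompt, stop) -> text
    generate_json: Callable[[str], str],       # hypothetical: (prompt) -> JSON text
) -> Tuple[str, dict]:
    """Let the model reason freely, then constrain only what follows </think>."""
    # Phase 1: unconstrained decoding that stops at the closing think tag.
    reasoning = generate_free(prompt, THINK_CLOSE)
    # Backends differ on whether the stop string is included; normalise it.
    if not reasoning.endswith(THINK_CLOSE):
        reasoning += THINK_CLOSE
    # Phase 2: continue from prompt + reasoning with the JSON constraint on.
    raw = generate_json(prompt + reasoning)
    return reasoning, json.loads(raw)


# Stub backends so the sketch runs without a model (assumed outputs).
def fake_free(prompt: str, stop: str) -> str:
    return "<think>strcpy into a fixed-size buffer looks unsafe"


def fake_json(prompt: str) -> str:
    return '{"finding": "buffer overflow", "severity": "high"}'


reasoning, result = think_then_json("Review this function: ...", fake_free, fake_json)
```

The point of splitting the call is that the grammar never sees the reasoning tokens, so the model can think in plain text and only the final answer is forced into JSON.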

hypfer 16 hours ago

> but I'm working around that in my harness.

How?

uberex 11 hours ago

Maybe by limiting the logits to what is syntactically valid? E.g. after {"hello" the next token has to be whitespace or a colon. Any other logits get dropped.
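A toy sketch of that kind of logit masking, assuming a made-up five-token vocabulary and made-up scores. `mask_logits` is a hypothetical helper, not any library's API, and real tokenizers have multi-character tokens, so a real implementation tracks grammar state rather than a fixed allowed set.

```python
import math
from typing import Dict, Set


def mask_logits(logits: Dict[str, float], allowed: Set[str]) -> Dict[str, float]:
    """Set every token outside `allowed` to -inf so it can never be chosen."""
    return {
        tok: (score if tok in allowed else -math.inf)
        for tok, score in logits.items()
    }


# Assumed model scores for the token following {"hello"
logits = {'"': 2.5, "}": 1.9, ",": 1.0, ":": 0.7, " ": 0.2}

# After a closed object key, JSON only permits whitespace or a colon.
allowed = {":", " ", "\n", "\t"}

masked = mask_logits(logits, allowed)
unconstrained_pick = max(logits, key=logits.get)  # highest raw score
constrained_pick = max(masked, key=masked.get)    # highest score that is still valid
```

Here the raw argmax would be a quote character, which is invalid at that position, while the masked argmax is the colon. Sampling works the same way, since a -inf logit becomes zero probability after softmax.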