> A nicely aligned table with ANSI colors is for humans. An agent extracting a post ID needs JSON.

Wrong. While table formatting can confuse an LLM in some cases, natural-language output in plain text is almost always better than JSON for small amounts of data. After all, LLMs have more natural-language training data than JSON training data.

The fallacy that LLMs need machine-readable output just because they are machines is pervasive, and it reflects a real misconception about how these models work.

On the other hand, I agree that large amounts of data should be output in a machine-readable format, so that the LLM can run scripts over it for more advanced parsing.

Agreed on JSON. But, surprisingly, HTML and LaTeX perform slightly better than Markdown for more complex tables.

Check out this paper: https://arxiv.org/abs/2506.13405

I totally agree with what you're saying here, and it's really confusing to me why anyone would think JSON is a good format for LLMs. There's so much redundant text in JSON. LLMs don't need it, and in my experience, as the document gets bigger, that redundancy actually hurts the LLM.
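To make the redundancy point concrete, here's a rough sketch. The records and field names are made up for illustration, and character counts are only a crude proxy for tokens, but the shape of the comparison holds: JSON repeats every key name for every record, while a plain-text listing can state the schema once.

```python
import json

# Hypothetical records, purely for illustration.
records = [
    {"id": 101, "author": "alice", "title": "Hello world"},
    {"id": 102, "author": "bob", "title": "Second post"},
]

# JSON repeats every key name (plus braces and quotes) for every record.
as_json = json.dumps(records, indent=2)

# A plain-text rendering states the schema once, then lists only values.
lines = ["Posts (id, author, title):"]
for r in records:
    lines.append(f"- {r['id']}, {r['author']}, {r['title']}")
as_text = "\n".join(lines)

print(len(as_json), len(as_text))  # the JSON version is noticeably longer
```

The gap only widens as you add records, since the per-record key overhead is constant in the text version and repeated in the JSON one.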

I don't disagree, but I'm wondering if there's any evidence of this available.

> After all, LLMs have more natural language training data than JSON training data.

While that is true, data also doesn't usually look like natural language (e.g. a collection of financial records). And when it does (e.g. a collection of chat messages), I wonder if it's more confusing when left unstructured, even if small.

I expect most frontier models to handle these cases just fine either way, so it may largely depend on context: specifically, how much of it there is, and where the attention shakes out. Ultimately, a claim one way or the other, for something this context-dependent, would have to be backed up by a lot of testing, and would probably conclude that "in most cases, you should do this."

Yes and no. An LLM that sees a JSON structure can decide to use tools to extract and format the data as needed, whereas it cannot do the same with natural language.
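A minimal sketch of that point, with a made-up payload: given JSON, the model can hand the data to a deterministic script (or tool call) instead of re-reading prose, and the extraction is exact rather than probabilistic.

```python
import json

# Hypothetical tool output; the field names are assumptions for this sketch.
payload = '{"posts": [{"id": 42, "title": "First"}, {"id": 43, "title": "Second"}]}'

# A tiny script the model could run over the structure: parse once,
# then extract the wanted field deterministically.
data = json.loads(payload)
post_ids = [p["id"] for p in data["posts"]]
print(post_ids)  # [42, 43]
```

The same extraction from a natural-language summary would require the model to re-parse the text itself, which is where errors creep in on large outputs.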

The Unix philosophy of small, composable tools is still valid in the era of stochastic machines!

As I said, I agree with you in the case of big outputs. But for small outputs, tool calls can be reliably constructed from the natural-language version. There's no need for JSON.
