Hacker News

The best-performing format isn't markdown tables, it's markdown key/value pairs:

  ## Record 1
  
  ```
  id: 1
  name: Charlie A0
  age: 56
  city: New York
  department: Operations
  salary: 67896
  years_experience: 7
  project_count: 1
  ```

That makes sense to me: the problem with formats like CSV and regular markdown tables is that it is too easy for the model to associate a value in a row with the wrong header.

Explicit key/value formats like this or YAML or JSON objects make that a lot less likely.
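
For anyone who wants to try this, here is a minimal sketch of how records could be rendered in that key/value layout. The function name `to_markdown_kv` and the list-of-dicts input are my own assumptions for illustration, not anything taken from the benchmark's code.

```python
def to_markdown_kv(records):
    """Render a list of dicts as Markdown key/value records.

    Hypothetical helper: each record gets a '## Record N' header
    followed by a fenced block with one 'key: value' pair per line.
    """
    fence = "`" * 3  # built at runtime so this example contains no literal fence
    blocks = []
    for index, record in enumerate(records, start=1):
        lines = [f"## Record {index}", "", fence]
        # Each value sits on the same line as its key, so it cannot
        # be paired with the wrong header.
        lines.extend(f"{key}: {value}" for key, value in record.items())
        lines.append(fence)
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


# The sample record quoted above.
records = [
    {
        "id": 1,
        "name": "Charlie A0",
        "age": 56,
        "city": "New York",
        "department": "Operations",
        "salary": 67896,
        "years_experience": 7,
        "project_count": 1,
    },
]
print(to_markdown_kv(records))
```

The point is that every value carries its own label, at the cost of repeating the keys for each record.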

cwmoore 2 days ago

I was surprised that XML (56%), with closing tags, wasn’t as good as YAML/KV (60%), though line breaks perform the same kind of grouping function.

Then I realized from the table that XML used about 50% more tokens (~75K vs ~50K) for similar accuracy, and for the first time felt a kind of sympathy for the LLM…
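
A rough way to see where the overhead comes from is to serialize the same record both ways and compare sizes. This is only a sketch: it counts characters rather than tokens (a real comparison would need the model's tokenizer), and the XML layout and the helper names `to_kv` and `to_xml` are my assumptions, not the benchmark's actual format.

```python
from xml.sax.saxutils import escape


def to_kv(record):
    """One 'key: value' pair per line (hypothetical helper)."""
    return "\n".join(f"{key}: {value}" for key, value in record.items())


def to_xml(record):
    """One element per field inside a <record> wrapper (assumed layout)."""
    fields = "\n".join(
        f"  <{key}>{escape(str(value))}</{key}>" for key, value in record.items()
    )
    return f"<record>\n{fields}\n</record>"


record = {
    "id": 1,
    "name": "Charlie A0",
    "age": 56,
    "city": "New York",
    "department": "Operations",
    "salary": 67896,
    "years_experience": 7,
    "project_count": 1,
}

kv_chars = len(to_kv(record))
xml_chars = len(to_xml(record))
# Character counts are only a proxy for token counts.
print(f"KV: {kv_chars} chars, XML: {xml_chars} chars, "
      f"ratio: {xml_chars / kv_chars:.2f}")
```

Every field name appears twice in the XML version, once in each tag, which is where the extra size comes from.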

svachalek 2 days ago

Yeah, that was my intuition as well. I think the KV-Markdown format gains an additional advantage over JSON and YAML because its header syntax helps break up the records.