It kinda blows my mind that after XML we've managed to make a whole bunch of stuff that's significantly worse for any serious usage.

JSON: No comments, no datatypes, no good system for validation.

YAML: Arcane nonsense like sexagesimal number literals, footguns with anchors and aliases, the Norway problem, non-string keys, accidental conversion of strings to numbers, CODE INJECTION!
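For the skeptical, here's roughly what a YAML 1.1 parser (PyYAML's default behaviour, for instance) makes of an innocent-looking document; exact results vary by parser and YAML version:

```yaml
# What you wrote vs. what a YAML 1.1 parser hands back:
country: NO          # -> the boolean false, not the string "NO" (the Norway problem)
port_forward: 22:22  # -> the sexagesimal integer 1342, not the string "22:22"
version: 1.10        # -> the float 1.1, silently dropping the trailing zero
on: enabled          # -> even the *key* "on" may become the boolean true
```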

I don't know why, but XML's verbosity seems to cause such a visceral aversion in a lot of people that they'd rather write a bunch of boring code to make sure a JSON parses to something sensible, or spend a day scratching their head about why a minor change in YAML caused everything to explode.

Actually, my own problem with XML dates back to when I wanted to build a complex config format in XML: modifying it programmatically while retaining comments turned out to be absolutely non-trivial. By comparison, for all the mess one can make with YAML, that particular task is trivial there.

JSON does have data types, although they are few and not very good. For example, there is no octet-string type (so you have to use hex or base64 instead), no non-string keys (so you have to use strings instead), no character sets other than Unicode, no proper integer type (you either use the existing numeric type or a string; I have seen both done), etc.
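A sketch of the usual stdlib workarounds in Python (the field names here are just illustrative):

```python
import base64
import json

payload = {
    # No octet-string type: binary data travels as base64 text.
    "digest": base64.b64encode(b"\x00\xffbinary").decode("ascii"),
    # No proper integer type: IDs beyond 2**53 are often sent as strings
    # so that double-based consumers don't mangle them.
    "big_id": str(2**64),
}

text = json.dumps(text if False else payload)  # serialize
decoded = json.loads(text)

# Both sides must agree on the out-of-band convention to get the values back:
assert base64.b64decode(decoded["digest"]) == b"\x00\xffbinary"
assert int(decoded["big_id"]) == 2**64
```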

YAML is worse in many ways, though.

XML has no data types but does have data structures.

I prefer to use DER (ASN.1's Distinguished Encoding Rules), which also has some problems, but they are much less bad in my opinion.

"Any serious usage" starts at "it just works".

JSON just works. Every language worth giving a damn about has a half-decent parser, and the syntax is simple enough that you can write valid JSON by hand. You won't hit the edge cases or the need for things like schemas until well down the line, by which point you're already rolling with JSON.

XML doesn't "just work". There are maybe four decent libraries total, all extremely heavy, with bindings in common languages, and the syntax itself is heavy and verbose. And by the time you could possibly get to "advanced features that make XML worth using", you've already bounced off the upfront cost of having to put up with XML.

Frontloading complexity ain't great for adoption - who would have thought.

> JSON just works.

Until it doesn't: underspecified numeric types and string types; parses poorly if there's a missing bracket; no built-in comments.
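The numeric underspecification is easy to demonstrate: the spec doesn't pin down what a "number" is, so different consumers disagree. A quick Python sketch:

```python
import json

# Python round-trips this integer exactly, because its ints are arbitrary-precision...
n = 9007199254740993  # 2**53 + 1
assert json.loads(json.dumps(n)) == n

# ...but a JavaScript consumer parsing the very same document into a Number
# gets 9007199254740992, because IEEE-754 doubles can't represent 2**53 + 1.
# Same bytes on the wire, different values in memory.
```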

For many applications it's fine. I personally think it's a worse basis for a DSL, though.

That's my point. By the time you hit "until it doesn't", you're already doing JSON, and were for a while.

Also, is "parses well if there's a missing bracket" even a desirable property? If you get files with mangled syntax, something has already gone horribly wrong. And, chances are, there is no way to parse them that would be correct.

By "parses well" in that case I mean "can identify where the error is, and maybe even infer the missing closing tag if desirable;" i.e. error reporting and recovery.

If you've ever debugged a JSON parse error where the location of the error was the very end of a large document, and you're not sure where the missing bracket was, you'll know what I mean. (S-exprs have similar problems, BTW; LISPers rely on their editors so as not to come to grief, and things still sometimes go pear-shaped.)
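For instance, with Python's stdlib parser (the broken document here is just a toy), the error is pinned to wherever the parse finally became inconsistent, which in a large file can be arbitrarily far from the bracket you actually forgot:

```python
import json

broken = '{"a": {"b": [1, 2, 3}'  # a ']' is missing -- but where?

err = None
try:
    json.loads(broken)
except json.JSONDecodeError as exc:
    err = exc

# The error points at the offending '}' near the very end of the document,
# not at the spot where the bracket was forgotten.
print(err.msg, "at line", err.lineno, "column", err.colno)
```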

> Actually, my own problem with XML dates back to when I wanted to build a complex config format in XML: modifying it programmatically while retaining comments turned out to be absolutely non-trivial. By comparison, for all the mess one can make with YAML, that particular task is trivial there.

Relatively few parsing libraries preserve the token-stream metadata (comments, whitespace) in the AST; most don't even expose the AST at all. For the former I can understand why: it's a cross-cutting concern and adds complexity to the parse. But it's almost always worth it.
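Python's stdlib XML parser is a handy illustration: it throws comments away unless you explicitly opt in, and even then you don't get byte-exact round-tripping of the original formatting:

```python
import xml.etree.ElementTree as ET

src = "<config><!-- do not delete --><item name='a' /></config>"

# The default parser silently drops the comment on the floor:
plain = ET.fromstring(src)
assert b"do not delete" not in ET.tostring(plain)

# Opting into comment nodes (Python 3.8+) at least keeps them in the tree:
parser = ET.XMLParser(
    target=ET.TreeBuilder(insert_comments=True, comment_factory=ET.Comment)
)
kept = ET.fromstring(src, parser=parser)
assert b"do not delete" in ET.tostring(kept)
```

Even with `insert_comments=True` this preserves comments as tree nodes, not the exact source layout, which is why "edit the file, keep the comments" stays non-trivial.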

> JSON: No comments, no datatypes, no good system for validation.

I don't agree at all. With tools like Zod, it is much more pleasant to write schemas and validate the file than with XML. If you want comments, you can use JSON5 or YAML, which can be validated the same way.

I think you have it backward. Libraries like zod exist _because_ JSON is so ubiquitous. Someone could just as easily implement a zod for XML. I’m not a huge proponent of XML (hard to write, hard to parse), but what you describe are not technical limitations of the format.

I think you're missing that the parent poster and I are implicitly assuming XML is validated the most common way, i.e. with XSD, and that I'm comparing XSD validation against Zod.

Ah that’s fair. So the discussion is about the quality of the validation libraries?