The problem is that engineers of data formats have ignored the concept of layers. With network protocols, you make one layer (Ethernet), you add another layer (IP), then another (TCP), then another (HTTP). Each one fits inside the last, but is independent, and you can deal with them separately or together. Each one has a specialty and is used for certain things. The benefits are 1) you don't need "a kitchen sink", 2) you can replace layers as needed for your use-case, 3) you can ship them together or individually.
I don't think anyone designs formats this way, and I doubt any popular formats are designed for this. I'm not that familiar with enterprise/big-data formats so maybe one of them is?
For example: CSV is great, but obviously limited, and not specified all that well. A replacement table data format could be binary (it's 2026, let's stop "escaping quotes", and make room for binary data). Each row can have header metadata to define which columns are contained, so you can skip empty columns. Each cell can be any data format you want (specifically so you can layer!). The header at the beginning of the data format could (optionally) include an index of all the rows, or it could come at the end of the file. And this whole table data format could be wrapped by another format. Due to this design, you can embed it in other formats, you can choose how to define cells (pick a cell-data-format of your choosing to fit your data/type/etc, replace it later without replacing the whole table), you can view it out-of-order, you can stream it, and you can use an index.
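To make the per-row header idea concrete, here is a minimal sketch of one row in such a format. Everything in it is an assumption for illustration: the function names `encode_row` and `decode_row`, the choice of a column-presence bitmap as the row header, and the 4-byte little-endian length prefix on each cell. Cells are opaque bytes, so any cell format can be layered inside without the table layer knowing about it.

```python
import struct

def encode_row(cells, num_columns):
    """Encode one row. `cells` maps column index -> opaque bytes.

    Hypothetical layout: a presence bitmap (one bit per column),
    followed by each present cell as <4-byte LE length><payload>.
    Empty columns are simply absent from `cells` and cost one bit.
    """
    bitmap = 0
    for col in cells:
        if not 0 <= col < num_columns:
            raise ValueError(f"column {col} out of range")
        bitmap |= 1 << col
    bitmap_len = (num_columns + 7) // 8
    out = bytearray(bitmap.to_bytes(bitmap_len, "little"))
    for col in sorted(cells):
        payload = cells[col]
        out += struct.pack("<I", len(payload))
        out += payload
    return bytes(out)

def decode_row(data, num_columns):
    """Inverse of encode_row: returns {column index: bytes}."""
    bitmap_len = (num_columns + 7) // 8
    bitmap = int.from_bytes(data[:bitmap_len], "little")
    pos = bitmap_len
    cells = {}
    for col in range(num_columns):
        if bitmap & (1 << col):
            (length,) = struct.unpack_from("<I", data, pos)
            pos += 4
            cells[col] = data[pos:pos + length]
            pos += length
    return cells

# Columns 1, 2 and 4 are empty and take no space beyond their bitmap bit.
row = {0: b"alice", 3: b'\x00\x01"quoted",'}
encoded = encode_row(row, num_columns=5)
decoded = decode_row(encoded, num_columns=5)
```

Because cells are length-prefixed rather than delimited, quotes, commas, and raw binary need no escaping, and the cell encoding can be swapped without touching the row layer.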
> With network protocols, you make one layer (Ethernet), you add another layer (IP), then another (TCP), then another (HTTP). Each one fits inside the last, but is independent, and you can deal with them separately or together.
It looks neat when you illustrate it with stacked boxes or concentric circles, but real-world problems quickly show the ugly seams. For example, how do you handle encryption? There are arguments (and solutions!) for every layer, each with its own tradeoffs. But it can't be neatly slotted into the layered structure once and for all. Then you have things like session persistence, network mobility, you name it.
Data formats have other sets of tradeoffs pulling them in different directions, but I don't think that layered design would come near to solving any of them.
Eh, this escaping problem was basically solved ages ago.
If we really wanted to make a UTF-8 data interchange format that needs minimal escaping, we already have ␜ (FS File Separator U+001C), ␝ (GS Group Separator U+001D), ␞ (RS Record Separator U+001E), ␟ (US Unit Separator U+001F). The problem is that they suck to type out, so they suck for character-based interchange. But we could add them to that emoji keyboard widget on modern OSs that usually gets bound to <Meta> + <.>.
If we put those someplace people could easily type them, that would resolve the problem.
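As a rough sketch of what that looks like in practice, the snippet below uses US (U+001F) between fields and RS (U+001E) after each record. The names `dumps` and `loads` are made up for this example, and it assumes the data itself never contains the two separator characters (it raises if it does, rather than inventing an escaping scheme).

```python
US = "\x1f"  # Unit Separator: placed between fields
RS = "\x1e"  # Record Separator: placed after every record

def dumps(rows):
    """Serialize a list of rows (lists of str) using US and RS."""
    for row in rows:
        for field in row:
            if US in field or RS in field:
                raise ValueError("field contains a separator character")
    return "".join(US.join(row) + RS for row in rows)

def loads(text):
    """Parse text produced by dumps back into a list of rows."""
    records = text.split(RS)[:-1]  # drop the empty tail after the last RS
    return [record.split(US) for record in records]

# Quotes, commas and newlines pass through untouched.
rows = [["name", 'say "hi", ok'], ["multi\nline", ""]]
text = dumps(rows)
parsed = loads(text)
```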
But binary data? Eh, that really should be transmitted as binary data and not as data encoded in a character format. That means not only avoiding Base64, but also avoiding a character representation of a byte stream like "0x89504E470D0A1A0A...". Instead, you should send the byte stream as a separate file.
So we need a way to combine a bunch of files into a streaming, compressed format.
And the thing is, we already have that format. It's .tar.lz4!
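For the tar half of that, here is a small sketch using Python's standard `tarfile` module in its streaming modes (`"w|"` and `"r|"`), which never seek and so work over pipes or sockets. The standard library has no LZ4 codec, so the sketch leaves the stream uncompressed and assumes the LZ4 step happens outside it, for example by piping through the `lz4` command-line tool. The helper names `write_bundle` and `read_bundle` are hypothetical.

```python
import io
import tarfile

def write_bundle(stream, files):
    """Write {name: bytes} to `stream` as a non-seekable tar stream."""
    with tarfile.open(fileobj=stream, mode="w|") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))

def read_bundle(stream):
    """Read a tar stream sequentially and return {name: bytes}."""
    result = {}
    with tarfile.open(fileobj=stream, mode="r|") as tar:
        for member in tar:
            if member.isfile():
                result[member.name] = tar.extractfile(member).read()
    return result

# A text table plus a raw binary blob, shipped side by side.
files = {
    "table.txt": b"id\x1fimage\x1e1\x1flogo.png\x1e",
    "logo.png": bytes.fromhex("89504E470D0A1A0A"),
}
buf = io.BytesIO()
write_bundle(buf, files)
buf.seek(0)
restored = read_bundle(buf)
```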
Have a look at Asset Administration Shells (AAS) -- it is a data exchange format built on top of JSON and XML (and RDF, and OPC UA and Protobuf, etc.).
https://industrialdigitaltwin.org/
(Disclaimer: I work on AAS SDKs https://github.com/aas-core-works.)
Some early binary formats followed similar concepts. Look up Interchange File Format (IFF), AIFF, and RIFF, along with the many file formats that still use this chunk structure to this day.
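For anyone who hasn't seen that structure, here is a simplified sketch of RIFF-style chunks: a 4-byte ID, a 4-byte little-endian size, the payload, and a pad byte when the size is odd (IFF proper uses big-endian sizes). The function names and the `TABL`, `hdr ` and `rows` chunk IDs are made up for illustration, and real RIFF container chunks also carry a form-type field that this sketch leaves out.

```python
import struct

def pack_chunk(chunk_id, payload):
    """Build one chunk: <4-byte id><4-byte LE size><payload><pad to even>."""
    if len(chunk_id) != 4:
        raise ValueError("chunk id must be exactly 4 bytes")
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad

def iter_chunks(data):
    """Yield (chunk_id, payload) pairs from a run of concatenated chunks."""
    pos = 0
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (size,) = struct.unpack_from("<I", data, pos + 4)
        yield chunk_id, data[pos + 8:pos + 8 + size]
        pos += 8 + size + (size % 2)  # skip the pad byte after odd sizes

# Chunks nest: a container's payload is itself a run of chunks.
inner = pack_chunk(b"hdr ", b"abc") + pack_chunk(b"rows", b"\x01\x02")
outer = pack_chunk(b"TABL", inner)

container_id, body = next(iter_chunks(outer))
children = list(iter_chunks(body))
```

A reader that doesn't recognize a chunk ID can skip it using the size field alone, which is what lets these formats be extended and layered.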