I feel like I must be missing something, because I don’t understand why representing data in an s-expression is better than representing it as nested arrays (lists of lists) and hashtables/dictionaries. I also don’t see why representing data in a language’s data structure is inherently better than representing it in a language-agnostic format like JSON, and having libraries to parse and convert data from that format into a language data structure, or a library that defines a JSON type, or storing the JSON in a string and using a library that operates on strings that contain valid JSON. I’ve worked with data in this way (JSON on disk, converted to nested arrays/tables in code) and haven’t felt it to be painful.
I can see how one might have a taste preference for it, but I’m struggling to understand what the tangible benefits are.
> I don’t understand why representing data in an s-expression is better than representing it as nested arrays (lists of lists) and hashtables/dictionaries.
An s-expression is a list. An s-expression like (list (list 1 2) (list 3 4) (list 5 6)) is a list of lists. An s-expression like (hash "a" 1 "b" 2) is a hash table/dictionary.
> I also don’t see why representing data in a language’s data structure is inherently better than representing it in a language-agnostic format like JSON
You don't see why having a language data structure like Date is better than having a date-string stored in a JSON value that you need to provide parsing and other functions for? If you have a language data structure like Date, you can add days to a date, extract the month, convert it to a DateTime, etc. If you just have a JSON value, you either need to provide those functions or convert your JSON value to the Date language data structure. It seems like you see the value in using the language's data structure because you then say:
> having libraries to parse and convert data from that format into a language data structure
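To make the Date point concrete, here is a minimal Python sketch (Python is used only as a neutral illustration; the payload and its "due" field are made up). It shows that a date inside parsed JSON is still just a string until it is converted to the language's own date type:

```python
import json
from datetime import date, timedelta

# Hypothetical JSON payload; the field names are invented for this example.
raw = '{"task": "renew passport", "due": "2024-01-31"}'
doc = json.loads(raw)

# After parsing, the date is still only a string with no date operations.
print(type(doc["due"]).__name__)  # str

# Converting to the language's date type is what unlocks those operations.
due = date.fromisoformat(doc["due"])
next_day = due + timedelta(days=1)

print(due.month)  # 1
print(next_day)   # 2024-02-01
```

The conversion step is the part that either you or a library has to supply for every type JSON does not know about.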
Also, JSON is just as "language agnostic" as s-expressions. JSON happens to be a first class component of JavaScript, as s-expressions are a first class component of Lisp; if libraries exist to help you deal with JSON in other languages, so, too, can libraries exist to help you deal with s-expressions.
> I’m struggling to understand what the tangible benefits are.
I think you understand the tangible benefits of JSON? It is a human readable/writable data serialization format. It is integrated in JavaScript in such a way that you can easily serialize, parse, and extract data from it without reaching for a library. S-expressions within Lisp do that, but they don't limit you to strings, floats, arrays, and unsorted maps. You don't need to write conversion functions or use them from a library because reading and writing s-expressions are core parts of Lisp.
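As a rough illustration of the "limited set of types" point from the JSON side, the following Python sketch tries to serialize a date and a rational. The `$date` and `$ratio` tags are a hypothetical convention invented here, not part of any standard:

```python
import json
from datetime import date
from fractions import Fraction

record = {"when": date(2024, 1, 31), "ratio": Fraction(1, 3)}

# The standard encoder only knows strings, numbers, lists, dicts, booleans and None.
try:
    json.dumps(record)
    failed = False
except TypeError:
    failed = True
print(failed)  # True


def encode_extra(obj):
    """Hand-written conversion for types JSON has no notation for."""
    if isinstance(obj, date):
        return {"$date": obj.isoformat()}
    if isinstance(obj, Fraction):
        return {"$ratio": [obj.numerator, obj.denominator]}
    raise TypeError(f"cannot encode {type(obj).__name__}")


text = json.dumps(record, default=encode_extra)
print(text)
```

The reader on the other end needs a matching decoder that knows the same tags, which is the conversion work being described above.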
I appreciate your time and explanation. I'm really trying to understand the POV here, and I feel like we're veering away from my original confusion, which was around "Try defining data in C. Try extracting data from that data you've defined in C". I'm assuming your statement would be meant to apply to other common languages without s-expressions, but maybe I've misunderstood.
I don't get why Lisp's s-expressions are much better than using arrays/tables in another language, such that they are a justification for using the language. Are they only significantly superior over a language with only arrays, like C? What's something that is made significantly easier by an s-expression than by arrays/tables?
To make s-expressions language-agnostic, wouldn't you need libraries in the languages to convert between the s-expression as it exists in some specification, and the language's native data structures? This doesn't sound all that different from JSON at this point, or a much more complex specification that defines the representation of all kinds of types, like dates.
> I'm assuming your statement would be meant to apply to other common languages without s-expressions, but maybe I've misunderstood.
It was meant for C specifically. Take a JSON document. Define it in C. Here's what I see on a random cJSON GitHub project:
https://github.com/DaveGamble/cJSON/blob/master/README.md#ex...
Now do that in JavaScript:
Now make the equivalent jsonDoc in Lisp with s-expressions:
The C approach is the approach you'd similarly take in many languages where you create the HashMap, then create the Array, then populate them. Of course, you could "cheat" in many languages by first making a string and then calling the JSON library's `parse` on the string. But, this is different than JavaScript where you can directly create the JSON document. In Lisp, you are always writing s-expressions, both for data and code.
> What's something that is made significantly easier by an s-expression than by arrays/tables?
An s-expression is a form of syntax. Even though it is a "list", the s-expression (list 1 2 3) is an actual list. It's not like you're taking the idea of arrays and tables and replacing it with a list. It's like you're taking the idea:
And replacing it with the idea:
What about dates? What if we want rationals? What if we want to use a binary-search-tree-map instead of a hash? Lisp:
The Lisp examples are simplified, but that is the idea.
> To make s-expressions language-agnostic, wouldn't you need libraries in the languages to convert between the s-expression as it exists in some specification, and the language's native data structures?
Yes.
> This doesn't sound all that different from JSON at this point, or a much more complex specification that defines the representation of all kinds of types, like dates.
Correct. It would be just like JSON: whatever parts are standardized are the parts that would be handled.
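For a sense of what such a library involves, here is a toy s-expression reader in Python. It is only a sketch under simplifying assumptions: atoms are integers, floats, or bare symbols (kept as strings), and there is no support for quoted strings, comments, or notations like #(...). The function names are invented for this example.

```python
def tokenize(text):
    # Pad parentheses with spaces so split() separates them from atoms.
    return text.replace("(", " ( ").replace(")", " ) ").split()


def read_atom(token):
    # Try numeric types first; anything else is treated as a symbol.
    for convert in (int, float):
        try:
            return convert(token)
        except ValueError:
            pass
    return token


def read_from(tokens):
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(read_from(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # discard the closing ")"
        return items
    if token == ")":
        raise SyntaxError("unexpected ')'")
    return read_atom(token)


def read_sexpr(text):
    tokens = tokenize(text)
    result = read_from(tokens)
    if tokens:
        raise SyntaxError("trailing input after expression")
    return result


print(read_sexpr("(list (list 1 2) (list 3 4) (list 5 6))"))
# ['list', ['list', 1, 2], ['list', 3, 4], ['list', 5, 6]]
```

Everything beyond this (dates, rationals, hashes, trees) would need an agreed notation and a converter on each side, the same as with JSON.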
Hash tables or dictionaries can be S-expressions.
Some Lisp dialects do not have printed-representations for these; that is a bug, and needs no further discussion.
Common Lisp and Scheme have vectors: they are notated as #(...). That is an S-expression.
CLISP has a #<N>A(...) notation for multidimensional arrays, which it can print and read:
There is a little bit of a restriction in that backquote doesn't support multi-dimensional arrays:
But this does work for vectors (as required by ANSI CL, I think):
You might think that #(0 42 0) is just some list variation, but in fact it is a bona-fide vector object, with fast numeric indexing, and without the structural flexibility of lists. Vectors are typically implemented as flat arrays in memory. (Though they could use something else, particularly if large, like radix trees.)
Hash literals look like this in TXR Lisp:
You can see the order of the keys changed when it was echoed back. The first element in the #H syntax, (), gives properties. It's empty for the most general form of hash, which uses equal comparison, and doesn't have weak keys or values.
Binary search trees are likewise printable. The #T notation gives a tree (concealing the nodes). The #N notation is for individual tree nodes:
The #T notation is readable. It gives the values in order, but they don't have to be specified in order when you write a literal:
If we ask for the tree root, we see the actual nodes, and how they are linked together:
All these notations are something we can casually use as data in a file or network stream between TXR Lisp programs, or anything else that cares to read and write the notation.
The printed notation of any object in a Lisp is an S-expression. S-expression syntax usually strives for print-read consistency: when an object's printed notation is read by the machine, a similar object is recovered. (In some cases, the original object itself, as with interned symbols.)
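For readers who don't use Lisp, a loose Python analogy of print-read consistency is the round trip through `repr` and `ast.literal_eval`. This is only an analogy, and the sample value is made up:

```python
import ast

# Sample data built only from types that have a literal notation in Python.
value = {"points": [(1, 2), (3, 4)], "name": "demo", "scale": 0.5, "tags": None}

printed = repr(value)                  # the printed notation
recovered = ast.literal_eval(printed)  # reading it back

print(recovered == value)  # True: a similar object is recovered
print(recovered is value)  # False: it is a new object, not the original
```

The analogy breaks down quickly: `ast.literal_eval` rejects the printed form of a date such as `datetime.date(2024, 1, 31)`, whereas the point above is that a Lisp aims to have a readable printed notation for its objects generally.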