I love how we have come full circle. Does anybody remember the "semantic web" (the RDF-based knowledge graph)? It didn't take off because building and maintaining such a graph requires extensive knowledge-engineering work and tooling. Fast forward a couple of decades and we have LLMs, which are basically auto-complete on steroids built on general knowledge, with the downside that they don't "remember" any facts unless you spoon-feed them the right context. We're now back to: "let's encode context knowledge as a graph and plug it into LLMs". Fun times :)
The problem with the semantic web was deeper: people had to agree on the semantics that would be formalized as triples, and getting people to agree on an ongoing basis is not an easy task.
My question is, what’s the value of explicitly storing semantics as triples when the LLM can infer the semantics at runtime?
Not much tbh. I'm using markdown files as a memory bank[1] for my projects and it works well without needing to structure them into a schema/graph. But I guess one benefit of this particular memory-graph implementation is its temporal aspect: searchable facts can evolve over time, i.e. what is true now and how it got here.
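To illustrate what I mean by the temporal aspect (this is just my own sketch of the idea, not how the memory graph is actually implemented, and the field names are made up), each fact would carry validity timestamps so you can answer both "what is true now" and "how it got here":

```python
# Hypothetical temporal fact record; names are illustrative, not from any real tool.
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class TemporalFact:
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means "still true"

history = [
    TemporalFact("project", "uses_db", "SQLite", datetime(2023, 1, 1), datetime(2024, 3, 1)),
    TemporalFact("project", "uses_db", "Postgres", datetime(2024, 3, 1)),
]

current = [f for f in history if f.valid_to is None]    # what is true now
timeline = sorted(history, key=lambda f: f.valid_from)  # how it got here
```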
[1] https://docs.cline.bot/prompting/cline-memory-bank
That’s interesting! I’ll take a deeper look. Thanks for sharing
This is something we've brainstormed a lot about. While LLMs can infer semantics at runtime, we leaned toward explicit triples for these reasons:
Efficient, precise retrieval through graph traversal patterns that flat text simply can't match ("find all X related to Y through relationship Z")
Algorithmic contradiction detection by matching subject-predicate pairs across time, which LLMs struggle with across distant contexts (see the sketch after this list)
Our goal is also to make the assistant more proactive, and triples make pattern recognition easier and more effective
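To make the first two points concrete, here is a toy sketch (not our actual implementation) of how explicit triples turn both the traversal query and the contradiction check into mechanical lookups:

```python
# Toy (subject, predicate, object, timestamp) store; illustrative only.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    at: datetime

class TripleStore:
    def __init__(self):
        self.facts = []

    def add(self, subject, predicate, obj, at):
        """Insert a fact and return earlier facts it contradicts
        (same subject + predicate, different object)."""
        conflicts = [f for f in self.facts
                     if f.subject == subject and f.predicate == predicate and f.obj != obj]
        self.facts.append(Fact(subject, predicate, obj, at))
        return conflicts

    def related(self, subject, predicate):
        """'Find all X related to Y through relationship Z' as a direct lookup,
        no embedding search or prompt stuffing needed."""
        return [f.obj for f in self.facts
                if f.subject == subject and f.predicate == predicate]

store = TripleStore()
store.add("alice", "works_at", "AcmeCorp", datetime(2023, 1, 1))
clashes = store.add("alice", "works_at", "Initech", datetime(2024, 6, 1))
print(store.related("alice", "works_at"))  # ['AcmeCorp', 'Initech']
print(clashes)                             # the stale AcmeCorp fact, surfaced for review
```

Multi-hop traversal is then just repeated `related` lookups, which is the part that flat-text retrieval can't give you cheaply or precisely.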
What do you think about these?
Hey - well put!
I guess the "semantic web" folks were right about the destination, just a few years early :P