That's nice, I've had the issue where LLMs would return non-existent uids. But does this package actually help with that? Token savings are nice, but not really my main concern. If this can measurably reduce hallucinations, it would be really useful.
> Where UUIDs cost ~23 tokens and get hallucinated by LLMs, id-agent produces memorable word-based IDs at ~14 tokens with equivalent collision resistance.
It seems like the right solution is around the corner: placeholders for these kinds of strings (uuid, hash, etc)
Why should an LLM even have these types of IDs anywhere in the prediction pipeline?
My gut feeling is that the hallucinations are caused by the entropy. A UUID has unlikely character sequences. But the entropy is a core feature. Turning the UUID into words keeps the same entropy, you just have surprising words instead of surprising hex sequences.
I would be surprised if this actually helped with hallucinations. Happy to be proven wrong though, and this seems like an easy experiment to run: just take a tiny model (below 1B) and have it transcribe a couple thousand ids in both formats, then check where it made more mistakes
I had similar thoughts. The readme intro explicitly mentions hallucinations, that's why I thought I'd ask.
If you're dealing with uid in -> uid out, where you're hoping to get the same uid out, intuitively the entropy would be greatly reduced anyways. Then the question becomes, are words conducive to keeping input->output consistent, given the way LLMs work (e.g. attention mechanism)? I could see it go either way, that's why I'm supporting the idea of running your experiment.
But within the surprising words, the adjacent tokens are common. I can see an argument for having fewer transcription errors on badger-yellow-alternate than 0B9A26F3C74D.
Your test with small models makes tons of sense. Would be interesting to graph to two approaches against model size and recency.
Yes, we have the validation methods to verify the output. https://github.com/vostride/id-agent/#validateid
A random "-" separated words will fail the validation check.
Okay, but you can also validate uids. What I'm asking is whether the human readable uids cause fewer hallucinations, as that would be the real win imo.