Two pieces of feedback here:
1. You implicitly take away someone else's hypothetical benefit of leveraging UUID v7, which is disappointing for any consumer of your API.
2. By storing the UUIDs differently on your API service from internally, you're going to make your life just a tiny bit harder because now you have to go through this indirection of conversion, and I'm not sure if this is worth it.
1. Unless the API explicitly guarantees that property, relying on it is a bad idea. I wouldn't.
> 1. Unless the API explicitly guarantees that property, relying on it is a bad idea. I wouldn't.
* https://www.hyrumslaw.com
Sure, but that's not really the point, is it? If you get a UUID, you can store it as a UUID. If the UUID happens to come through as a v7, you get some better behavior in your database; if it does not, there is nothing you can do about it.
Depends on the database. Famously, DynamoDB used to suffer from hotspotting when dealing with monotonically increasing keys.
You're missing the point here. You can always go from ordered to randomness. You cannot go from randomness to ordered. So by intentionally removing the useful properties of UUIDv7, you're taking away some external API consumers' hypothetical possibility to leverage benefits. If I know (as an API consumer) that I have a database that for whatever reason prefers evenly distributed primary keys or something similar, I can always accomplish that by hashing. I just can never go the other way.
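A minimal sketch of the "ordered to random by hashing" direction described above. The function name `scatter_key` and the sample IDs are hypothetical, and SHA-256 is just one reasonable choice of hash, not something the thread specifies.

```python
import hashlib
import uuid


def scatter_key(external_id: uuid.UUID) -> uuid.UUID:
    """Derive an evenly distributed 128-bit key from any UUID by hashing it.

    Hypothetical helper: the result is a 128-bit value in a UUID-shaped
    container, not a spec-compliant UUIDv4 (version bits are not set).
    """
    digest = hashlib.sha256(external_id.bytes).digest()
    return uuid.UUID(bytes=digest[:16])


# Two made-up v7-style IDs that differ only in their last hex digit.
a = uuid.UUID("018f4e2a-7b3c-7def-8abc-0123456789ab")
b = uuid.UUID("018f4e2a-7b3c-7def-8abc-0123456789ac")

# The inputs sort next to each other; the derived keys are unrelated.
print(scatter_key(a))
print(scatter_key(b))
```

The mapping is deterministic, so the same external ID always lands on the same key, but there is no way to recover the original ordering from the hashed values.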
Never use someone else's synthetic key as your primary key. If you want ordered keys, even if the API is giving out sequential integers, you should still use your own sequential IDs.
I take your point, but I think your hypothetical is a wonderful example of Hyrum's Law. And for that reason, if I was going to go to the trouble of mapping my internal v7 UUIDs into something more random for public consumption, then I'd be sure to generate something that doesn't look like a UUID at all, so nobody gets any funny ideas about what they can do with it.
Just to clarify, do you mean that UUIDv4 in general is worse, or just this 7->4 obfuscation?
I'm not saying anything about better or worse. I'm saying that UUID v4 by definition has high entropy and UUID v7 does not. You can always go from low to high entropy, but not the other way around.
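To make the entropy point concrete: in the RFC 9562 layout, the first 48 bits of a UUIDv7 are a Unix timestamp in milliseconds, so anyone holding the ID can read it back. A small sketch follows; `v7_timestamp` is a hypothetical helper, and the sample ID is assembled by hand so the example does not depend on a UUIDv7 generator being available.

```python
import uuid
from datetime import datetime, timezone


def v7_timestamp(u: uuid.UUID) -> datetime:
    """Read the 48-bit Unix millisecond timestamp out of a UUIDv7."""
    if u.version != 7:
        raise ValueError("not a UUIDv7")
    millis = int.from_bytes(u.bytes[:6], "big")
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


# Hand-built sample: timestamp | version 7 | RFC variant bits | filler payload.
sample = uuid.UUID(
    int=(1_700_000_000_000 << 80) | (7 << 76) | (0b10 << 62) | 12345
)
print(sample, "->", v7_timestamp(sample))
```

A UUIDv4 has no such structure to recover, which is the asymmetry being described.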
You can always treat IDs as UUIDv4 while actually storing them as UUIDv7, combining the benefits of both. From your perspective, they're just UUIDv4.
One impact of the_mitsuhiko's second point is during debugging.
Usually, if you see an ID in your HTTP logs, you can simply search your database for that ID. The v4-to-v7 indirection creates a small inconvenience.
The mismatch could be resolved if this were available as a fully transparent database optimization.
A Postgres extension is currently in development to provide transparent database optimization with custom type uuid45 and optional helpers ;)
That would generally be nice to have. I would love to have base62 encoded IDs with prefixes but store it internally as UUID.
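One way the prefixed base62 idea could look at the application layer, as a rough sketch rather than a description of any existing extension. The names `encode_public_id` and `decode_public_id`, the `_` separator, and the alphabet order are all assumptions made for illustration.

```python
import uuid
from typing import Tuple

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
WIDTH = 22  # 62**22 > 2**128, so 22 characters always hold a full UUID


def encode_public_id(prefix: str, u: uuid.UUID) -> str:
    """Render a UUID as '<prefix>_<22 base62 chars>' (hypothetical format)."""
    n = u.int
    chars = []
    for _ in range(WIDTH):
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return prefix + "_" + "".join(reversed(chars))


def decode_public_id(public_id: str) -> Tuple[str, uuid.UUID]:
    """Split the prefix off and turn the base62 body back into a UUID."""
    prefix, _, body = public_id.rpartition("_")
    if len(body) != WIDTH:
        raise ValueError("unexpected ID length")
    n = 0
    for ch in body:
        n = n * 62 + ALPHABET.index(ch)  # ValueError on a foreign character
    return prefix, uuid.UUID(int=n)  # ValueError if the value exceeds 128 bits


stored = uuid.UUID("018bcfe5-6800-7000-8000-000000003039")
public = encode_public_id("usr", stored)
print(public, decode_public_id(public))
```

Because the body is fixed-width and the alphabet is in ASCII order, the encoded strings sort the same way as the underlying UUIDs, so this changes the presentation only and does not hide a v7 timestamp.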
Not just a small inconvenience: because there's no human-readable way to tell the difference between v4 and v7 IDs, you have to guess and check whether the ID your server process is logging is a pre-conversion or post-conversion ID.
The human readable way to tell the difference is you look at whether the third group starts with a 4 or a 7.
It is really easy to tell the difference btw. You will always see "4" or "7" in the middle.
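The same check can be done in code when grepping logs. This sketch uses Python's standard `uuid` module; the function names and both sample IDs are made up for illustration.

```python
import uuid
from typing import Optional


def uuid_version(text: str) -> Optional[int]:
    """Return the version number encoded in a UUID string."""
    return uuid.UUID(text).version


def third_group_digit(text: str) -> str:
    """The by-eye check: first character of the third dash-separated group."""
    return text.split("-")[2][0]


for candidate in (
    "f47ac10b-58cc-4372-a567-0e02b2c3d479",  # made-up v4-style ID
    "018bcfe5-6800-7000-8000-000000003039",  # made-up v7-style ID
):
    print(candidate, uuid_version(candidate), third_group_digit(candidate))
```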
This seems like the kind of tool you would only use where you have the following needs:
1. Not leaking timestamp data (security/regulations)
2. Having easily time-sortable primary keys (DB performance/etc.)
If you don't have both of these needs, the tool is an unnecessary indirection, as you've identified in (2).
However, where you do have both needs, some indirection is necessary. Whether this is the correct one is a different question.
Similarly, if you _must not_ leak timestamps for some real-world reason, (1) is an intrinsic requirement, consumers be damned.
If you must not leak timestamps, then you also cannot really have timestamp ordering internally, because you will start leaking it in other ways through collection-based endpoints.
Not necessarily. For instance, in situations where unprivileged users can only see single items but privileged users can see collections. But yeah, time-ordering leaks information to people who can see the collection.
This scheme potentially leaks timestamp, serialisation, and record-correlation data because the specification of UUIDv7 allows for partial timestamps and incrementing counters in the so-called random bits, which are passed through undisturbed.
So it is not generally fit for that purpose either.
Those seem like standard needs for any kind of CRUD app, so I would call this approach pretty useful. Currently I do something similar by keeping a private primary uuidv7 key with a btree index (a sortable index), and a separate public uuidv4 with a hash index (a lookup index), which is a workable but annoying arrangement. This solution achieves the same effect and is simpler.
Why can't you leak timestamp data? What timestamp data is sensitive to your system?
Also, why use UUIDs in that case?