> The main issue is that a reader might mistake Redis for a 2X-faster Postgres. Memory is 1000X faster than disk (SSD), and even with network overhead Redis can still be 100X as fast as Postgres for caching workloads.
Your comments suggest that you are definitely missing some key insights into the topic.
If you, like the whole world, consume Redis through a network connection, it should be obvious to you that network is in fact the bottleneck.
Furthermore, using an RDBMS like Postgres may indeed imply storing data on slower storage. However, you are ignoring the obvious fact that a service such as Postgres also has its own memory cache, and some query results can be, and often are, served from RAM. It's not as if every single query forces a disk read.
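For what it's worth, this is easy to check on a running instance: Postgres tracks buffer-cache hits vs. disk reads per database in the standard `pg_stat_database` view, so you can compute the cache hit ratio yourself (a rough sketch; warm caches routinely show ratios above 0.99):

```sql
-- Fraction of block reads served from Postgres's own buffer cache
SELECT datname,
       blks_hit,
       blks_read,
       round(blks_hit::numeric / nullif(blks_hit + blks_read, 0), 3) AS hit_ratio
FROM pg_stat_database
WHERE datname = current_database();
```

Note this only covers Postgres's shared buffers; the OS page cache sits underneath and catches more reads still.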
And at the end of the day, what exactly is the performance tradeoff? Does it pay off to spend more on an in-memory cache like Redis to buy that performance delta?
That's why real-world benchmarks like this one are important. They help people think through the problem and reassess their irrational beliefs. You may nitpick about setup, configuration, test patterns, and choice of libraries. What you cannot refute are the real-world numbers. You may argue they could be better if this or that were different, but the real-world numbers are still there.
> If you, like the whole world, consume Redis through a network connection
I think "you are definitely missing some key insights into the topic". The whole world is a lot bigger than your anecdotes.
> If you, like the whole world, consume Redis through a network connection, it should be obvious to you that network is in fact the bottleneck.
Not to be annoying - but... what?
I specifically _do not_ use Redis over a network. It's wildly fast. High-volume data ingest use case - lots and lots of parallel queue workers. The database is over the network; Redis is local (Unix socket). Yes, this means that each server running these workers has its own cache - that's fine. I'm using the cache for absolutely insane speed, and I'm not caching huge objects. I don't persist it to disk, and I don't care (well, it's not a big deal) if I lose the data - it'll rehydrate in that case.
Try it some time, it's fun.
> And at the end of the day, what exactly is the performance tradeoff? Does it pay off to spend more on an in-memory cache like Redis to buy that performance delta?
Yes, yes it is.
> That's why real world benchmarks like this one are important.
That's not what this is though. Just about nobody who has a clue is using default configurations for things like PG or Redis.
> They help people think through the problem and reassess their irrational beliefs.
Ok but... um... you just stated that "the whole world" consumes Redis through a network connection. (Which, IMO, is the wrong tool for the job - sure, it will work, but that's not where/how Redis shines.)
> What you cannot refute are the real world numbers.
Where? This article is not that.
That is an interesting use case; I hadn't thought about a setup like this with a local Redis cache before. Are the typical advantages of using a db over a filesystem the reason to use Redis instead of just reading from memory-mapped files?
> Are the typical advantages of using a db over a filesystem the reason to use Redis instead of just reading from memory-mapped files?
Eh - while surely not everyone has the benefit of doing so, I'm running Laravel, and using Redis is just _really_ simple and easy. To do something via memory-mapped files I'd have to implement quite a bit of stuff I don't want/need to (locking, serialization, ttl/expiration, etc.).
Redis just works. Disable persistence, choose the eviction policy that fits the use case, configure a Unix socket connection, and you're _flying_.
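Concretely, that setup is only a handful of lines of `redis.conf`. The directive names below are real Redis config options; the values (memory cap, socket path, permissions) are illustrative and would need tuning for your box:

```
# redis.conf — cache-only, local-socket setup (illustrative values)
save ""                       # disable RDB snapshots (no persistence)
appendonly no                 # disable the AOF log as well
maxmemory 2gb                 # cap memory; size to fit the host
maxmemory-policy allkeys-lru  # evict least-recently-used keys when full
unixsocket /var/run/redis/redis.sock
unixsocketperm 770
port 0                        # optionally disable TCP listening entirely
```

With `port 0` set, Redis is reachable only through the socket, which also conveniently keeps it off the network.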
My use case is generally data ingest of some sort. The processing workers (in my largest projects, 50-80 concurrent processes chewing through tasks from a queue, also backed by Redis) are likely to end up running the same queries against the database (MySQL) to fetch 'parent' records (i.e. the user associated with an object by username, a post by slug, etc.), and there's no way to know in advance whether there will be multiples (i.e. if we're processing 100k objects there might be 1 from UserA or there might be 5000 from UserA, where each one needs UserA's record). This particular project has ~40 million of these 'user' records and hundreds of millions of related objects, so I can't store/cache _all_ users locally - but I sure benefit from not querying for the same record 5000 times in a 10-second period.
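The dedup effect described above is easy to sketch. In this toy version a plain dict stands in for the local Redis instance, a counter stands in for MySQL, and the workload shape (100k objects drawn from 1,000 distinct users) is made up for illustration:

```python
import random

def make_db():
    """Stand-in for MySQL: records how many times it gets queried."""
    calls = {"count": 0}
    def fetch_user(username):
        calls["count"] += 1
        return {"username": username}  # pretend this is a full row
    return fetch_user, calls

def process_batch(objects, fetch_user):
    cache = {}  # stands in for a per-server local Redis cache
    for obj in objects:
        user = cache.get(obj["username"])
        if user is None:                       # cache miss: one DB query
            user = fetch_user(obj["username"])
            cache[obj["username"]] = user
        # ... process obj using its parent user record ...
    return len(cache)

# Skewed workload: 100k objects referencing only 1,000 distinct users.
random.seed(0)
objects = [{"username": f"user{random.randint(0, 999)}"} for _ in range(100_000)]
fetch_user, calls = make_db()
distinct = process_batch(objects, fetch_user)
print(f"{len(objects)} objects, {calls['count']} DB queries")
```

The DB sees one query per distinct user instead of one per object - a ~100x reduction in this made-up distribution.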
For the most part, when caching these records over the network, the performance benefits were negligible (depending on the table) compared to just querying MySQL for them - they're just `select where id/slug =` queries. But when you lose that little bit of network latency and can make _dozens_ of these calls to the cache in the time it would take to make a single networked call... it adds up real quick.
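Back-of-envelope math on why that adds up, using assumed latencies (~0.5 ms round trip for a networked call vs. ~0.02 ms over a local Unix socket - your numbers will differ):

```python
# Assumed per-call round-trip times; illustrative, not measured.
network_rtt_ms = 0.5    # networked MySQL/Redis call
socket_rtt_ms = 0.02    # local Redis over a Unix socket

calls_per_network_rtt = network_rtt_ms / socket_rtt_ms
print(f"~{calls_per_network_rtt:.0f} local calls per networked call")

# The "same record 5000 times" scenario from above:
lookups = 5000
networked_total_s = lookups * network_rtt_ms / 1000
local_total_s = lookups * socket_rtt_ms / 1000
print(f"networked: {networked_total_s:.2f}s, local socket: {local_total_s:.2f}s")
```

With these assumptions, 5000 repeated lookups cost ~2.5 s of pure latency over the network versus ~0.1 s over the socket - and that's per worker, before any parallelism.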
PHP does have shared-memory facilities, but again, that would mean handling/implementing a bunch of stuff I just don't want to be responsible for - especially when it's so easy and performant to lean on Redis over a Unix socket. If I needed to go faster than this, I'd switch to another language and likely do something direct-to-memory style.