> Both postgres and redis are used with the out of the box settings

Ugh. I know this gives the illusion of fairness, but it's not how any self-respecting software engineer should approach benchmarks. You have hardware. Perhaps you have virtualized hardware. You tune to the hardware. There simply isn't another way, if you want to be taken seriously.

Some will say that in a container-orchestrated environment, tuning goes out the window since "you never know" where the orchestrator will schedule the service, but this is bogus. If you've got time to write a basic deployment config for the service on the orchestrator, you've also got time to at least size the memory usage configs for PostgreSQL and/or Redis. It's just that simple.
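
To make that concrete: if you have time to write the resource requests in the Deployment, you have time to size the database's memory knobs against the same number in the same commit. A rough sketch, with purely illustrative sizes:

    # Deployment snippet (illustrative sizes)
    resources:
      requests:
        memory: "4Gi"
        cpu: "2"
      limits:
        memory: "4Gi"
    # ...then size shared_buffers (Postgres) or maxmemory (Redis) against that same 4Gi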

This is the kind of thing that is "hard and tedious" for only about five minutes of LLM query or web search time and then you don't need to revisit it again (unless you decide to change the orchestrator deployment config to give the service more/less resources). It doesn't invite controversy to right-size your persistence services, especially if you are going to publish the results.

I disagree. They found that Postgres, without tuning, was easily fast enough on low level hardware and would come with the benefit of not deploying another service. Additionally, tuning it isn’t really relevant.

If the defaults are fine for a use case, then unless I want to tune it for personal interest, it’s either a poor use of my fun time or a poor use of my clients’ funds.

The default shared_buffers is 128MiB, not even 1% of a typical machine today. A benchmark run with these settings effectively cripples your hardware by making sure 99% of your available memory is ignored by Postgres. It's an invalid benchmark, unless Redis is similarly crippled.
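
For scale, un-crippling it is a handful of lines in postgresql.conf; the values below are purely illustrative for a box with ~16GB of RAM and should be tuned to the actual workload:

    shared_buffers = 4GB            # default is 128MB
    effective_cache_size = 12GB     # planner hint, not an allocation
    work_mem = 64MB                 # per sort/hash operation, so keep it modest
    maintenance_work_mem = 1GB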

> If the defaults are fine for a use case, then unless I want to tune it for personal interest, it’s either a poor use of my fun time or a poor use of my clients’ funds.

It doesn't matter if you've crippled the benchmark if the performance of both options still exceeds your expectations. Not all of us are trying eek out every drop of performance

And, well, if you are then you can ignore the entire post because Redis offers better perf than postgres and you'd use that. It's that simple.

> Not all of us are trying eek out every drop of performance

You probably mean "eke out". Unless the performance is particularly scary :)

Good point: even crippled Postgres was "good enough", so it doesn't change the overall message. Nonetheless, we should strive to do realistic and valid benchmarks, no?

wow, default 128MiB sounds so stupid

"If we don't need performance, we don't need caches" feels like a great broader takeaway here.

A cache being fast enough doesn’t mean no caching is relevant - I’m not sure why you’d equate the two.

Sometimes, a cache is all about reducing expense: i.e., a free cache query vs. an expensive API query.

Sometimes people host software on a server they own or rent, the server is plenty fast, and it costs literally nothing to issue those queries at the scale on which they’re needed.

Yes, that is true, but the original poster said getting rid of caches was always a good idea, when in reality the answer (as usual with engineering) is “it depends.”

I see people downvoting this. For anyone who disagrees: we have YAGNI for a reason. If someone told me performance was fine and then added caches anyway, I would give them the big hairy eyeball, because we already know cache invalidation is a PITA, correctness issues are easy to create, and now you have the performance of two different systems to manage.

Amazon actually moved away from caches for some parts of its system because consistent behavior is a feature. What happens if your cache has problems and the interaction between it and the system behind it is slow? What if your cache has some bugs or edge-case behavior? If you don't need it, you are just doing a bunch of extra work to make sure things are in sync.

> "If we don't need performance, we don't need caches" feels like a great broader takeaway here.

I don't think this holds true. Caches are used for reasons other than performance. For example, caches are used in some scenarios for stampede protection to mitigate DoS attacks.

Also, the impact of caches on performance is sometimes negative. With distributed caching, each get and put requires a network request. Even when those calls don't leave a data center, they cost far more than just reading a variable from memory. I've already had the displeasure of stumbling upon a few scenarios where a cache was prescribed in a cargo-cult way without any data backing up the assertion, and when we took a look at traces it was evident that the bottleneck was actually the cache itself.

DoS is a performance problem: if your server were infinitely fast with infinite storage, it wouldn't be an issue.

> DoS is a performance problem

Not really. Running out of computational resources to fulfill requests is not a performance issue. Think of things such as exhausting a connection pool. More often than not, some components of a system can't scale horizontally.

It is actually a financial problem too. Servers stop working when the bill goes unpaid. Sad but true.

If my grandma had wheels it would be a car.

> They found that Postgres, without tuning, was easily fast enough on low level hardware

Is that production? When you bucket it as "low level" it sounds like a base case, but it really isn't.

In production you often don't have local storage, RAM is being used for all kinds of other things, your CPU is only available in small slices, and there are network effects and many other factors.

> If the defaults are fine for a use case

Which I hope isn't the developer's edition of it works on my machine.

> for only about five minutes of LLM query or web search

I think I have more trust in the PG defaults than in the output of an LLM or copy pasting some configuration I might not really understand ...

> copy pasting some configuration I might not really understand

Uh, yea... why would you? Do you do that for configurations you found that weren't from LLMs? I didn't think so.

I see takes like this all the time and I'm really just mind-boggled by it.

There are more than just the "prompt it and use what it gives me" use cases with the LLMs. You don't have to be that rigid. They're incredible learning and teaching tools. I'd argue that the single best use case for these things is as a research and learning tool for those who are curious.

Quite often I will query Claude about things I don't know and it will tell me things. Then I will dig deeper into those things myself. Then I will query further. Then I will ask it details where I'm curious. I won't blindly follow or trust it, just as I wouldn't a professor or anyone or anything else, for that matter. Just like I would when querying a human or the internet in general for information, I'll verify.

You don't have to trust its code, or its configurations. But you can sure learn a lot from them, particularly when you know how to ask the right questions. Which, hold onto your chairs, only takes some experience and language skills.

My comment is mainly in opposition to the "five minutes" part from parent.

If you only have 5 minutes, then you can't, as you say:

> Then I will dig deeper into those things myself ...

So my point is I don't care if it's coming from an LLM or a random blog, you won't have time to know if it's really working (ideally you would want to benchmark the change).

If you can't invest the time, it's better to stay with the defaults, which in most projects the maintainers have spent quite a bit of time making sensible.

Original commenter here. I don't disagree with your larger point. However, it turns out that the default settings for PostgreSQL have been super conservative for years; as a stable piece of infrastructure they seem to prefer defaulting to a constrained environment rather than making assumptions about resources. To their credit, PostgreSQL does ship with sample configs for "medium" and "large" deployments which are well-documented with comments and can be simply copied over the original default config.

I happen to have a good bit of experience with PostgreSQL, so that colored the "5 minutes" part of it. Still, most of the time, you "have" more than 5 minutes to create the orchestrator's deployment config for the service (which never exists by default on any k8s-based orchestrator). I'm simply saying to not be negligent of the service's own config, even though a default exists.

Yea, I guess in that case I'd say it's likely a bad move in every direction if you're constrained to 5 min to deploy something you don't understand.

It's crazy how wildly inaccurate "top-of-the-list" LLMs are for straightforward yet slightly nuanced inquiries.

I've asked ChatGPT to summarize Go build constraints, especially in the context of CPU microarchitectures (e.g. mapping "amd64.v2" to GOARCH=amd64 GOAMD64=v2). It repeatedly smashed its head on GORISCV64, claiming all sorts of nonsense such as v1, v2; then G, IMAFD, Zicsr; only arriving at rva20u64 et al under hand-holding. Similar nonsense for GOARM64 and GOWASM. It was all right there in e.g. the docs for [cmd/go].
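
For anyone curious, the mapping it kept fumbling is just a pair of environment variables (per the cmd/go docs; the exact value sets vary by Go version):

    # "amd64.v2" corresponds to:
    GOARCH=amd64 GOAMD64=v2 go build ./...

    # RISC-V profiles are named rva20u64, rva22u64, ... (not "v1"/"v2"):
    GOARCH=riscv64 GORISCV64=rva20u64 go build ./...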

This is the future of computer engineering. Brace yourselves.

If you are going to ask ChatGPT for some specific tidbit, it's better to force it to search the web.

Remember, an LLM is a JPG of all the text of the internet.

Wait, what?

Isn't that the whole point, to ask it specific tidbits of information? Are we to ask it large, generic pontifications and claim success when we get large, generic pontifications back?

The narrative around these things changes weekly.

ChatGPT is exceptionally good at using search now, but that's new this year, as of o3 and then GPT-5. I didn't trust GPT-4o and earlier to use the search tool well enough to be useful.

You can see if it's used search in the interface, which helps evaluate how likely it is to get the right answer.

The problem is, I ask it a basic question, it confidently feeds me bullshit, I correct it twice, and only then does it do an actual search.

I use GPT-5 thinking and say "use search" if I think there's any chance it will decide not to.

This is what I have in my custom instructions:

    Stay brief. Do not use emoji.
    Check primary sources, avoid speculation.
    Do not suggest next steps.

Do I have to repeat this every time I suspect the answer will be incorrect?

I use it as a tool that understands natural language and the context of the environments I work in well enough to get by, while guiding it to use search or just facts I know if I want more one-shot accuracy. Just like I would if I were communicating with a newbie who has their own preconceived notions.

I mean, like most tools they work when they work and don't when they fail. Sometimes I can use an llm to find a specific datum and sometimes I use google and sometimes I use bing.

You might think of it as a cache, worth checking first for speed reasons.

The big downside is not that they sometimes fail, it's that they give zero indication when they do.

How was the LLM accessing the docs? I’m not sure what the best pattern is for this.

You can put the relevant docs in your prompt, add them to a workspace/project, deploy a docs-focused MCP server, or even fine-tune a model for a specific tool or ecosystem.

> I’m not sure what the best pattern is for this.

> You can put the relevant docs in your prompt

I've done a lot of experimenting with these various options for how to get the LLM to reference docs. IMO it's almost always best to include in prompt where appropriate.

For a UI lib I use that's rather new (specifically, there's a new version the LLMs aren't aware of yet), I had the LLM write me a quick Python script that crawls the lib's docs site and feeds each page's content back into the LLM with a prompt describing what to do: generate a .md document with the specifics about that thing, whether it's a component or whatever (i.e. properties, variants, etc.) in an extremely brief manner, and also build an 'index.md' with a short paragraph about what the library is and a list of each component/page document generated. So in about 60 seconds it spits out a directory full of .md files. I then tell my project-specific LLM (i.e. Claude Code or Opencode within the project) to review those files and update the project's CLAUDE.md to say that any time we're building UI elements we should refer to the library's index.md to understand what components are available, and when we use one of them we _must_ review the corresponding document first.

Works very very very well. Much better than an MCP server specifically built for that same lib. (Huge waste of tokens, LLM doesn't always use it, etc) Well enough that I just copy/paste this directory of docs into my active projects using that library - if I wasn't lazy I'd package it up but too busy building stuff.
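
In case it helps, the crawl-and-summarize script is roughly this shape (not the actual script; the docs URL is a placeholder and the summarize step is a stub for whatever LLM call you'd use):

    import pathlib
    from urllib.parse import urljoin, urlparse

    import requests
    from bs4 import BeautifulSoup

    DOCS_ROOT = "https://example.com/docs/"   # placeholder for the lib's docs site
    OUT_DIR = pathlib.Path("docs-for-llm")
    OUT_DIR.mkdir(exist_ok=True)

    def summarize(page_text: str) -> str:
        # Stub: replace with a call to your LLM of choice, with a prompt like
        # "produce a terse .md reference: properties, variants, usage".
        return page_text[:500]

    seen, queue, index_lines = set(), [DOCS_ROOT], []
    while queue:
        url = queue.pop()
        if url in seen or not url.startswith(DOCS_ROOT):
            continue
        seen.add(url)
        soup = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")
        for a in soup.find_all("a", href=True):          # queue same-site links
            queue.append(urljoin(url, a["href"]).split("#")[0])
        name = urlparse(url).path.strip("/").replace("/", "-") or "index"
        (OUT_DIR / f"{name}.md").write_text(summarize(soup.get_text(" ", strip=True)))
        index_lines.append(f"- {name}.md")
    (OUT_DIR / "index.md").write_text("\n".join(index_lines))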

Sounds like a good way of tackling this - thanks for sharing!

So, in short, do a lot of unrelated, complex, expert-level work to get basic answers - all while I could just read the docs?

Where's that productivity increase everyone's been talking about?

Did you try pasting in the docs for cmd/go and asking again?

I mean - this is the entire problem right here.

Don't ask an LLM how to do something with a tool when it's been trained on a whole bunch of different versions of that tool, with different flags, options, and parameters, plus Stack Overflow questions asked and answered by people who had no idea what they were doing and that are likely out of date or just wrong, without providing the docs for the version you're working with. _Especially_ if it's the newest version, and regardless of whether the model's cutoff date was after that version was released - you have no way to know whether it was actually _included_. (And especially for something related to a programming language with ~2% market share.)

The contexts are so big now - feed it the docs. Just copy paste the whole damn thing into it when you prompt it.

Then either have the LLM explain the config, or go Google it. LLM output is a starting point, not your final config.

Yes - as the original commenter, this was certainly my meaning (though I didn't spell it out).

So run the LLM in an agent loop: give it a benchmarking tool, let it edit the configuration, and tell it to tweak the settings, measure, and see how much of a performance improvement it can get.

That's what you'd do by hand if you were optimizing, so save some time and point Claude Code or Codex CLI or GitHub Copilot at it and see what happens.
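
Concretely, the loop you'd hand the agent is more or less what you'd type yourself; a sketch, with the database name, data directory, and the chosen setting as placeholders:

    pgbench -i -s 50 mydb                                  # initialize test tables
    pgbench -c 16 -j 4 -T 60 mydb                          # baseline TPS
    psql -d mydb -c "ALTER SYSTEM SET shared_buffers = '4GB';"
    pg_ctl restart -D /var/lib/postgresql/data             # or: systemctl restart postgresql
    pgbench -c 16 -j 4 -T 60 mydb                          # compare TPS, repeat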

“We will take all the strokes off Jerry's game when we kill him.” - the LLM, probably.

Just like Mr Meeseeks, it’s only a matter of time before it realizes that deleting all the data will make the DB lightning fast.

Exactly true, which is why you need to run your agent against a safe environment. That's a skill that's worth developing.

How much would that cost?

Probably about 10 cents, if you're even paying for tokens. Plenty of these tools have generous free tiers or allowances included in your subscription.

I run a pricing calculator here - for 50,000 input tokens, 5,000 output tokens (which I estimate would be about right for a PostgreSQL optimization loop) GPT-5 would cost 11.25 cents: https://www.llm-prices.com/#it=50000&ot=5000&ic=1.25&oc=10
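
The arithmetic behind that estimate:

    50,000 input tokens  × $1.25 per 1M  = $0.0625
     5,000 output tokens × $10.00 per 1M = $0.0500
                                   total = $0.1125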

I use Codex CLI with my $20/month ChatGPT account and so far I've not hit the limit with it despite running things like this multiple times a day.

If optimizing a PostgreSQL server costs 11.25 cents and everybody can do it because of AI, how much are you going to bill your customer? 20 cents?

If that is true, in a few months there will be no DBA jobs.

Funny that at the same time SQL is one of the most requested languages in job postings.

Knowing what "optimizing a PostgreSQL server's configuration" even means continues to be high value technical knowledge.

Knowing how to "run an agentic loop to optimize the config file" is meaningless techno-jabber to 99.99% of the world's population.

I am entirely unconcerned for my future career prospects.

So your big advantage is that nobody has launched agentic tools for the end user yet?

Anyone can learn to unblock a sink by watching YouTube videos these days, and yet most people still hire a professional to do it for them.

I don't think end users want to "optimize their PostgreSQL servers" even if they DID know that's a thing they can do. They want to hire experts who know how to make "that tech stuff" work.

I agree that people like to hire professionals. That is why I hire db experts to work on our infra, not prompt engineers.

Saying that anybody can learn to unblock a sink by watching YouTube is your typical HN mentality of stating opinions as facts.

"Saying that anybody can learn to unblock a sink by watching youtube is your tipical HN mentality of stating opinons as facts."

I don't understand what you mean. Are you saying that it's not true that anyone could learn to unblock a sink by watching YouTube videos?

Yes, I do think not all people could fix it with YouTube. My grandma couldn't, for example. I had a neighbor come for help with something like that too.

It's not that hard to understand, mate. Maybe put my comment in the LLM so you can get it.

What is your point again?

My analogy holds up. Anyone could type "optimize my PostgreSQL database by editing the configuration file" into an LLM, but most people won't - same as most people won't watch YouTube to figure out how to unblock a sink.

If you don't like the sink analogy what analogy would you use instead for this? I'm confident there's a "people could learn X from YouTube but chose to pay someone else instead" that's more effective than the sink one.

You're exactly right (original commenter here). I began my career in professional software engineering in 1998. I've despaired that trained monkeys could probably wreck this part of the economy for over 25 years. But we're still here. :D

Personally I'd like to hire a DB expert who also knows how to drive an agentic coding system to help them accelerate their work. AI tools, used correctly, act as an amplifier of existing knowledge and experience.

As far as I know, nobody has really come up with proof that LLMs act as an amplifier of existing knowledge.

It does make people FEEL more productive.

What would a "proof" of that even look like?

There are thousands (probably millions) of us walking around with anecdotal personal evidence at this point.

Some years ago everybody here gave their anecdotal evidence about how Bitcoin and Blockchain were the future and how they used them every day. You were a fool if you did not jump on the bandwagon.

If the personal opinions on this site were true, half of the code in the world would be functional, Lisp would be one of the most used languages, and Microsoft would not have bought Dropbox.

I really think the HN hive mind's opinions mean nothing. Too much money here to be real.

I'm going to believe my own experience on this one.

I am going to wait until ChatGPT 8 solves quantum physics, like the ex-CEO of this site has stated.

Good thing there exists no middle ground between solving quantum physics and optimizing a SQL statement.

> Some years ago everybody here gave their anecdotal evidence about how Bitcoin and Blockchain were the future and how they used them every day.

I hardly remember anyone on HN, a tech audience, saying they used blockchain every day. Why don't you go find some of that evidence?

you can become a db expert with the right prompts

You can learn how to pour a drink in 1 minute; that is why most bartenders earn minimum wage.

You can't become a db expert with a prompt.

I hope you make a lot of money with your lies and good luck.

You can become a DB expert by reading books, forums and practicing hard.

These days you can replace those books and forums with a top tier LLM, but you still need to put in the practice yourself. Even with AI assistance that's still a lot of work.

You could not replace good books with the Internet, and you can't replace good books with any LLM.

You can replace books with your own time and research.

Again making statements that are just not true. Typical HN behavior.

I don't appreciate how you accuse me of "making statements that are just not true" without providing a solid argument (as opposed to your own opinion) as to why what I'm saying isn't true.

You stated that an LLM can replace a book.

As far as I know in the field of logic the one making a statement, in this case you, is the one who has to prove it.

But in this case you make a statement and then ask ME to prove it wrong? Makes zero fucking sense.

As much as you don't appreciate it, that is how debate and logic work.

The idea that an LLM can replace the role of a book doesn't seem like it should even be controversial to me.

You buy a non-fiction book to learn something, or to act as a reference.

An LLM provides an alternative mechanism for learning that thing, or looking up those reference points.

What am I missing here?

Do you think a search engine could replace a book?

So you are not only stating that an LLM can replace a book. You are directly saying that it is an axiom.

It is so self-evidently true that you don't even need to reason about it.

That LLMs can replace a book is a fundamental truth of the universe, like Euclid's postulates or like 1=1.

Well then there is no way to continue the conversation, because by definition axioms can't be false.

I'm happy to be convinced otherwise, but you'll have to make a concrete argument rather than just critiquing the way I'm discussing this.

They charge per token...

Not if you're on a subscription they don't.

So the hidden usage caps don't equate to token usage?

They charge per token, everyone charges per token.

On one hand I agree with you, but on the other hand defaults matter because I regularly see systems with the default config and no attempt to tune.

Benchmarking the defaults and benchmarking a tuned setup will measure very different things, but both of them matter.

IME very, very few people tune the underlying host. Orgs like Uber, Google or whatever do, but outside of that few people really know what they're doing or care that much. It's easier to "increase EC2 size" or whatever.

Defaults have all sorts of assumptions built into them. So if you compare different programs with their respective defaults, you are actually comparing the assumptions that the developers of those programs have in mind.

For example, if you keep adding data to a Redis server under default config, it will eat up all of your RAM and suddenly stop working. Postgres won't do the same, because its default buffer size is quite small by modern standards. It will happily accept INSERTs until you run out of disk, albeit more slowly as your index size grows.

The two programs behave differently because Redis was conceived as an in-memory database with optional persistence, whereas Postgres puts persistence first. When you use either of them with their default config, you are trusting that the developers' assumptions will match your expectations. If not, you're in for a nasty surprise.
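
(If you want Redis to degrade more gracefully than that, it's two directives in redis.conf; the size here is illustrative:)

    maxmemory 2gb                  # default is 0, i.e. unbounded
    maxmemory-policy allkeys-lru   # default is noeviction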

Yes, all of this is fine, but none of it addresses my point:

Enough people use the default settings that benchmarking the default settings is very relevant.

It often isn't a good thing to rely on the defaults, but it's nevertheless the case that many do.

(Yes, it is also relevant to benchmark tuned versions, as I also pointed out, my argument was against the claim that it is somehow unfair not to tune)

> Ugh.

> if you want to be taken seriously

For someone so enthusiastic about giving feedback, you don't seem to have invested a lot of effort into figuring out how to give it effectively. Your tone and demeanor diminish the value of your comment.

Thanks for the feedback. I'll definitely think about it.

Fully agree.

Postgres is a power tool usable for many many use cases - if you want performance it must be tuned.

If you judge Postgres without tuning it - that's not Postgres being slow, that's the developer being naive.

> If you judge Postgres without tuning it - that's not Postgres being slow, that's the developer being naive.

Didn't OP end by picking Postgres anyway?

It's the right answer even for a naive developer, perhaps even more so for a naive one.

At the end of the post it even says

>> Having an interface for your cache so you can easily switch out the underlying store is definitely something I’ll keep doing

He concluded postgresql to be fast enough, so what's the problem?

IOW, he judged it fast enough.

Yep. I worked at a famous big company that had a 15-year-old service that was dog-slow; systemd restarts would take multiple hours.

Everyone was talking about C++ optimizations, mutexes everywhere, etc. - which was in fact a problem.

However.. I seemed to be the first person to actually try to debug what the database was doing, and it was going to disk all the time with a very small cache.. weird..

I looked at the MySQL settings on a 1TB RAM machine and they were... out-of-the-box settings.

With small adjustments I improved the performance of this core system an order of magnitude.

At one startup, all I did was increase the innodb buffer pool size. They were using default settings.
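
(For reference, that change is a single line in my.cnf; the default buffer pool is 128M, and the usual advice on a dedicated box is to give it most of the RAM, e.g.:)

    [mysqld]
    innodb_buffer_pool_size = 8G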

Disagree; the majority of software is running on defaults, so it makes sense to compare them this way.

Perhaps, but in this case this shows at least that even non-tuned Postgres can be used as a fast cache for many real-world use-cases

> This is the kind of thing that is "hard and tedious" for only about five minutes of LLM query or web search time

not even! if you don't need to go super deep with tablespace configs or advanced replication right away, pgtune will get you to a pretty good spot in the time it takes to fill out a form.

https://pgtune.leopard.in.ua/

https://github.com/le0pard/pgtune

Glad you mentioned these resources. I was soft-pedaling it to avoid the "Look, a gatekeeper!" brigade.

But why doesn't Postgres tune itself based on the system it's running on, at least the basics based on available RAM & cores?

I've not tried it myself but I believe that's what pgtune does: https://github.com/gregs1104/pgtune

Isn't tuning before hitting constraints a premature optimisation? The approach of not spending time on tuning settings before you have to seems sane.

And TFA shows you that in this world Postgres is close enough to Redis.