One approach to avoid being hit by "stack rot" is to build everything on top of plain Linux VMs.

Those rarely change.

Unfortunately the software deployed on top of them will.

So you either:

1) postpone all your updates for years, until a bad CVE hits or some application goes end of life and you need to update, at which point you’re screwed because updating has become a massive exercise

2) do regular updates and patches to the entire stack, including Linux, in which case you’re in the same position you were in before: running on the stack rot treadmill

So you might’ve moved the rot to a different place, but I don’t know if you’ve reduced any of it. I’ve owned stuff deployed on vanilla VMs, and I actually found it harder to maintain because everything was a one-off.

My rationale for staying up to date aggressively is that it minimizes integration work. Basically, integration work multiplies; it doesn't just accumulate. So the further you fall behind, the more that can break when you finally do upgrade, and you needlessly create more work testing and fixing all of that. Upgrading a system that was fully up to date until a few days or weeks ago is generally easy; there's only so much that changes. Doing the same to something that was last touched five years ago can be a bit painful. APIs that no longer exist. Components that are no longer supported. Code that no longer compiles. Etc.
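A hedged back-of-envelope sketch of that "multiplies, not accumulates" point: if a failure after a big-bang upgrade can come from any single change or from any pair of interacting changes, the number of suspects you have to rule out grows roughly quadratically with how many changes you batch up. The numbers below are made up; only the shape of the growth is the point.

```python
from math import comb

# Assumed model: a post-upgrade failure is caused either by one change or by
# an interaction between a pair of changes. Upgrading incrementally keeps each
# debugging session down to a single suspect; deferring n changes gives you
# n single-change suspects plus C(n, 2) pairwise interactions to consider.
for n in (1, 5, 20, 60):  # e.g. 60 ~ five years of monthly releases
    print(f"{n:>2} batched changes -> {n + comb(n, 2):>4} suspects to rule out")
```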

I see a lot of teams being overly conservative about keeping their stuff up to date, running software that's years out of date, with lots of known and already-fixed bugs of all varieties, performance issues that have long since been addressed, etc. All in the name of stability.

I treat anything that isn't up to date as technical debt. If an update breaks stuff, I need to know, so I can either deal with it or document a workaround or a (usually temporary) version rollback. That does happen, but not a lot. And I prefer knowing about these things because I've actually tried the update, over being ignorant of the breakage because I haven't updated anything in years. That just adds to the hidden pile of technical debt you don't even know you have. Ignorance is not an excuse for not dealing with your technical debt. Or worse, compounding it by building on top of it and creating more technical debt in the process.

Dealing with small changes over time is a lot less work than dealing with a large delta all at once. It's something I've been doing for years. If I work on any of my projects, the first thing I do is update dependencies. Make sure stuff still works (tests). Make sure deprecated APIs are dealt with.
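As a concrete sketch of that "update dependencies first" habit, assuming a pip-based Python project (other ecosystems have equivalents, e.g. npm outdated), something like this surfaces the pending delta before it grows:

```python
import json
import subprocess
import sys

# List every installed package with a newer release, so the upgrade delta
# stays visible and small. Uses pip's --outdated and --format=json flags.
result = subprocess.run(
    [sys.executable, "-m", "pip", "list", "--outdated", "--format=json"],
    capture_output=True, text=True, check=True,
)
for pkg in json.loads(result.stdout):
    print(f"{pkg['name']}: {pkg['version']} -> {pkg['latest_version']}")
```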

If you’re willing to put in the maintenance work, you’ll probably be in good shape whether you’re on plain VMs or on a snazzy cloud provider’s managed service.

If the business understands that you need time to work on these things :’)

RedShift1 complains that GCP is "deprecating stuff". I wouldn't put doing regular updates in the same problem category as having to deal with part of your stack disappearing.

To me, "I wish they would stop deprecating stuff" sounds like any part of the stack has something like a 1% or even 10% chance of being shut off in any given year.

I would expect that by carefully choosing your stack from open source software in the Debian repos, you can bring the probability of any given part being gone with no successor to less than 0.1% per year. As an example: could you imagine Python becoming unavailable in 2026? Or SQLite? Docker?
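To make the difference between those rates concrete, here's a hedged back-of-envelope calculation. The per-part annual probabilities mirror the figures above; the stack size and time horizon are assumptions picked purely for illustration:

```python
# Assumed model: each of n stack components independently has an annual
# probability p of being shut off or deprecated with no successor. The chance
# of dodging every forced migration for y years is then (1 - p) ** (n * y).
n, y = 15, 10  # assumptions: 15 components, 10-year horizon
for p in (0.10, 0.01, 0.001):
    survive = (1 - p) ** (n * y)
    print(f"p = {p:.3f}/yr -> P(no forced migration over {y} yrs) = {survive:.1%}")
```

Even at 0.1% per part per year, a modest stack over a decade still carries a noticeable chance of at least one forced migration, which is why driving that per-part rate down matters so much.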

Fair - I’m not very experienced with GCP, but I’ve seen AWS keep the deprecation treadmill moving as well.

In general, if I’m going to be maintaining stuff, I guess I’d rather be maintaining cloud than like… old Solaris or something.

But then you have to maintain it yourself, which will usually be more work overall than just migrating from time to time.

Your cost just shifts elsewhere, then. Rolling your own stack from Linux up is a big endeavor too.