"start self-hosting more of your personal services."
I would make the case that you should also self-host more as a small software/SaaS business; it is not quite the boogeyman that a lot of cloud vendors want you to think it is.
Here is why. Most software projects/businesses don't require the scale and complexity for which you truly need the cloud vendors and their expertise. For example, you don't need Vercel (or even Netlify) to deploy Next.js or whatever static website you have. You can set up Nginx or Caddy (my favorite) on a simple VPS running Ubuntu or the like, and boom. For the majority of projects, that will do.
90%+ of projects can be self-hosted with the following:
- A well-hardened VPS with good security controls. There are plenty of good articles online on how to do the most important things (disable root login, make SSH key-based only, etc.).
- Set up a reverse proxy like Caddy (my favorite) or Nginx. Boom: static files and static websites can now be served. No need for a CDN unless you are talking about millions of requests per day. (A minimal Caddyfile sketch follows below.)
- Set up your backend/API with something simple like supervisor or even native systemd. (A sketch of the service unit and backup cron also follows below.)
- The same reverse proxy can also forward requests to the backend and other services as needed. Not that hard.
- Self-host a MySQL/Postgres database and set up the right security controls.
- Most importantly: set up backups for everything using a script/cron and test them periodically.
- If you really want to feel safe against DoS/DDoS, add Cloudflare in front of everything.
So you end up with:
Cloudflare/DNS => Reverse Proxy (Caddy/Nginx) => Your App.
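In Caddyfile terms, that middle layer can be as small as this (the domain, paths, and port are placeholders, not a prescription):

    example.com {
        # serve the static site straight from disk
        root * /var/www/example
        file_server

        # hand API traffic to the app listening locally
        reverse_proxy /api/* localhost:8080
    }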
- You want to deploy? A git pull should do it for most projects (PHP etc.). If you have to rebuild a binary, that is one more step, but still doable.
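And a rough sketch of the hardening, service, and backup pieces mentioned in the list above; the user, paths, and backup host are placeholders to adapt, not a recommendation:

    # /etc/ssh/sshd_config -- key-only logins, no root
    PermitRootLogin no
    PasswordAuthentication no

    # /etc/systemd/system/myapp.service -- keep the backend running
    [Unit]
    Description=My app backend
    After=network.target

    [Service]
    User=deploy
    WorkingDirectory=/srv/myapp
    ExecStart=/srv/myapp/bin/server
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target

    # crontab entry -- nightly Postgres dump shipped off-box (and test your restores!)
    0 3 * * * pg_dump -Fc mydb > /backups/mydb_$(date +\%F).dump && rsync -a /backups/ backuphost:/srv/backups/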
You don't need Docker or containers. They can help, but they are not needed for small or even mid-sized projects.
Yes, you can claim that a lot of these things are hard, and I would say they are not that hard. The majority of projects don't need web scale or whatever.
The main thing that gives me anxiety about this is the security surface area that comes with "managing" a whole OS: kernel, userland, all of it. Did I get the firewall configured correctly, am I staying on top of the latest CVEs, etc.
For that reason alone I'd be tempted to do GHA workflow -> build container image and push to private registry -> trivial k8s config that deploys that container with the proper ports exposed.
Run that on someone else's managed k8s setup (or Talos if I'm self-hosting) and it's basically exactly as easy as having done it on my own VM, but this way I'm only responsible for my application and its interface.
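The "trivial k8s config" really can be small; a generic sketch (image, names, and port are placeholders), plus a Service or Ingress to actually route traffic to it:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: myapp
      template:
        metadata:
          labels:
            app: myapp
        spec:
          containers:
          - name: myapp
            # the image your GHA workflow pushed to the private registry
            image: registry.example.com/myapp:latest
            ports:
            - containerPort: 8080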
I left my VPS open to password logins for over 3 years, no security updates, no firewalls, no kernel updates, no apt upgrades; only fail2ban and I survived: https://oxal.org/blog/my-vps-security-mess/
Don't be me, but even if you royally mess up things won't be as bad as you think.
I've had password login enabled for decades on my home server, not even fail2ban. But I do have an "AllowUsers" list with three non-cryptic user names. (None of them are my domain name, but nice try.)
Last month I had 250k failed password attempts. If I had a "weak" password of 6 random letters (I don't), and all 250k had guessed a valid username (only 23 managed that), that would give... uh, roughly one expected success per century (26^6 is about 3.1e8 combinations, at 250k guesses a month).
That sounds risky, actually. So don't expose a "root" user with a 6-letter password. Add two more letters and it is on the order of 70,000 years. Or use a strong password and forget about those random attempts.
I wonder about:
- silently compromised systems, active but unknown
- VPS provider doing security behind your back
I'd be worried about this too. There must be AI bots that "try the doors" on known exploits all over the internet and, once inside, do nothing but take a look around and give themselves access for the future. Maybe they become a botnet someday, but maybe the agent never saw the server doing anything of value worth waking up its master for: running a crypto wallet, a shard of a database with a "payments" table, an instance of a password manager like Vault, or who knows what else might get flagged as interesting.
Security is way more nuanced than "hey look I left my door open and nothing happened!". You are suggesting, perhaps inadvertently, a very dangerous thing.
> Run that on someone else's managed k8s setup ... this way I'm only responsible for my application and its interface.
It's the eternal trade-off of security vs. convenience. The downside of this approach is that if there is a vulnerability, you will need to wait on someone else to get the fix out. Probably fine nearly always, but you are giving up some flexibility.
Another way to get a reasonable handle on the "managing a whole OS ..." complexity is to use some tools that make it easier for you, even if it's still "manually" done.
Personally, I like FreeBSD + ZFS-on-root, which gives you "boot environments"[1] and lets you do OS upgrades worry-free, since you can always roll back to the old working BE.
But also I'm just an old fart who runs stuff on bare metal in my basement and hasn't gotten into k8s, so YMMV (:
[1] eg: https://vermaden.wordpress.com/2021/02/23/upgrade-freebsd-wi... (though I do note that BEs can be accomplished without ZFS, just not quite as featureful. See: https://forums.freebsd.org/threads/ufs-boot-environments.796...)
I used DigitalOcean for hosting a WordPress blog.
It got attacked pretty regularly.
I would never host an open server from my own home network for sure.
This is the main value-add I see in cloud deployments: OS patching, security, the trivial stuff I don't want to have to deal with on the regular but that is super important.
WordPress is just low-hanging fruit for attackers. Ideally the default behavior would be to expose /wp-admin on a completely separate network, behind a VPN, but no one does that, so you have to run fail2ban or similar to stop the flood of /wp-admin/admin.php requests in your logs, and deal with WordPress CVEs and updates.
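For the fail2ban route, a minimal jail might look something like this (the filter regex and the nginx log path are assumptions; adjust for your server and log format):

    # /etc/fail2ban/filter.d/wp-auth.conf
    [Definition]
    failregex = ^<HOST> .* "(GET|POST) /wp-(login\.php|admin)

    # /etc/fail2ban/jail.d/wp-auth.local
    [wp-auth]
    enabled  = true
    port     = http,https
    filter   = wp-auth
    logpath  = /var/log/nginx/access.log
    maxretry = 5
    bantime  = 3600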
More ideal: don't run WordPress. A static site doesn't execute code on your server and can't be used as an attack vector. It is also perfectly cacheable via your CDN of choice (Cloudflare, whatever).
A static site does run on a web server.
Yes, but the web server is just reading files from disk and not invoking an application server. So if you keep your web server up to date, you are at much lower risk than if you also had to keep your application + programming environment secure.
That really depends on the web server, and on the web app you'd otherwise be writing. If it's a shitty static web server, then a JVM- or BEAM-based web app might actually be safer.
A static site is served by a web server, but the software to generate it runs elsewhere.
Yes. And a web server has an attack surface, no?
The thing with WordPress is that it increases the attack surface via shitty plugins. If I have a WP site, I change wp-config.php with this line:
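    define( 'DISALLOW_FILE_MODS', true ); // blocks plugin/theme installs and edits from the dashboard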
This one config will save you a lot of headaches. It disables any theme/plugin changes from the admin dashboard and ensures that no one can write to the codebase directly unless they have access to the actual server.
You can mitigate a lot of security issues by not exposing your self-hosted stack to the Internet directly. Instead, you can use a VPN to your home network.
An alternative is a front-end proxy on a box with a managed OS, like OpenWRT.
> gives me anxiety about this is the security surface
I hate how DevOps has adopted and deploys fine-grained RBAC permissions in the cloud. Every little damn thing is a ticket for a permissions request. Many times it's not even clear which permission sets are needed. It takes many iterations to wade through the various arbitrary permission gates that the clouds have invented.
These orgs are pretending like they're operating a bank, in staging.
This gives me anxiety.
This is why I built https://canine.sh -- to make installing all that stuff a single step. I was the co-founder of a small SaaS that was blowing >$500k/year on our cloud stack.
Within the first few weeks, you'll realize you also need Sentry; otherwise errors in production just become digging through logs. That's a +$40/month cloud service.
Then you'll want something like Datadog, because someone somewhere is reporting that a page is taking 10 seconds to load but you can't replicate it. +$300/month cloud service.
Then, if you ever want to aggregate data into a dashboard to present to customers -- Looker / Tableau / Omni: +$20k/year.
Data warehouse + replication? +$150k / year
This goes on and on and on. The holy grail is to be able to run ALL of these external services in your own infrastructure on a common platform with some level of maintainability.
Cloud Sentry -> Self Hosted Sentry
Datadog -> Self Hosted Prometheus / Grafana
Looker -> Self Hosted Metabase
Snowflake -> Self Hosted Clickhouse
ETL -> Self Hosted Airbyte
Most companies realize this eventually, and that's why they move to Kubernetes. I think it's also why indie hackers often can't quite understand why the "complexity" of Kubernetes is necessary, and why just having everything run on a single VPS isn't enough.
This assumes you're building a SaaS with customers, though. When I started my career it was common for companies to build their own apps for themselves, not for every company to be split between SaaS builders and SaaS users.
I enjoy the flip side... working for a company that does provide SaaS, I sometimes find myself reminding people that we don't necessarily need a full multi-datacenter redundant deploy with logging and metrics and alerting and all this other modern stuff for a convenience service that is used strictly internally, infrequently (but enough to be worth having), and has effectively zero immediate consequences if it goes down for a bit.
You can take that too far, of course, and if you've got the procedures all set up you often might as well use them, but at the same time you can blow thousands and thousands of dollars really quickly to save yourself a minor inconvenience or two over the course of five years.
I'm also in this space - https://disco.cloud/ - similarly to you, we offer an open source alternative to Heroku.
As you well know, there are a lot of players and options (which is great!), including DHH's Kamal, Flightcontrol, SST, and others. Some are k8s-based (Porter, Northflank, yours); others are not.
Two discussion points: one, I think it's completely fair for an indie hacker, or a small startup (Heroku's and our main customers - presumably yours too), to go with some ~Docker-based, git-push-compatible deployment solution and be completely content. We used to run servers with nginx and apache on them without k8s. Not that much has changed.
Two, I also think that some of the needs you describe could be considered outside of the scope of "infra": a database + replication, etc. from Crunchy Bridge, AWS RDS, Neon, etc. - of course.
But Tableau? And I'm not sure I get what you mean by $150k/year - how much replication are we talking about? :-)
Yeah so happy to share how that happened to us.
If you want to host a Redshift instance, get Google Analytics logs + Twilio logs + Stripe payments + your application database into a data warehouse, and then graph all of that in a centralized place (Tableau / Looker / etc.), a common way to do it is:
- Fivetran for data streaming
- Redshift for data warehousing
- Looker for dashboarding
You're looking at $150k / year easily.
Yeah, you are very right.
Once you start seeing some success, you realize that while your happy path may work for 70% of real cases, it isn't really converting optimally for most of them. Sentry helps a lot, you see session replay, you get excited.
You realize you can A/B test... but you need a tool for that...
Problem: things like OpenReplay will just crash and not restart themselves. With multi-container setups, some random part going down will just stop your session collection without you noticing. Try to debug that? Good luck, it'll take at least half a day. And often you restore functionality only to have another random error take it down a couple of months later, or you realize the default configuration only keeps 500 MB of logs/recordings (what?), etc., etc.
You realize you are saving $40/month for a very big hassle, and worse, it may not work when you need it. You go back to Sentry, etc.
Does Canine change that?
Canine just makes deploying Sentry / Grafana / Airbyte plus 15k other open-source packages a one-click install, which then just gives you a URL you can use. Because it runs on top of Kubernetes, a well-built package should have health checks, which will detect an error and auto-restart the instance.
Obviously if [name your tool] is built so that it can be bricked [1], even after a restart, then you'll have to figure it out. Hopefully most services are more robust than that. But otherwise, Kubernetes takes care of the uptime for you.
[1] This happened with a Travis CI instance we were running back in the day: it set a Redis lock, then crashed, and refused to restart as long as the lock was set. No amount of restarts fixed that; it required manual intervention.
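For anyone unfamiliar, the auto-restart piece is a standard Kubernetes liveness probe in the pod spec; a generic sketch (the endpoint and port are placeholders):

    livenessProbe:
      httpGet:
        path: /healthz   # health endpoint the app is assumed to expose
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 15
      failureThreshold: 3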
> Majority of projects don't need the web scale or whatever.
Truth. All the major cloud platform marketing is YAGNI but for infrastructure instead of libraries/code.
As someone who has worked in ops since starting as a sysadmin in the early 2000s, it's been entertaining, to say the least, to watch everyone rediscover hosting your own stuff as if it's some new innovation that was never possible before. It's like that old "MongoDB is web scale" video (https://www.youtube.com/watch?v=b2F-DItXtZs).
Watching devs discover Docker was similarly entertaining back then, when those of us in ops had been using LXC, BSD jails, etc. to containerize code pre-DevOps.
Anyway, all that to say: go buy your gray-beard sysadmins a coffee and let them help you. We would all be thrilled to bring stuff back on-prem or go back to self-hosting and running infra again, and we probably have a few tricks to teach you.
And there is an extra perk: Unlike cloud services, system skills and knowledge are portable. Once you learn how systemd or ufw or ssh works, you can apply it to any other system.
I’d even go as far as to say that the time/cost required to say learn the quirks of Docker and containers and layering builds is higher than what is needed to learn how to administer a website on a Debian server.
Well said. For me, "how to administer a website on a Debian server" is a must if you work in Web Dev because hosting a web app should not require you to depend on anyone else.
>I’d even go as far as to say that the time/cost required to say learn the quirks of Docker and containers and layering builds is higher than what is needed to learn how to administer a website on a Debian server.
But that is irrelevant, as Docker brings things to the table that a simple Debian server cannot by design. One could argue that LXD is sufficient for these, but that is even more hassle than Docker.
For my home or personal server stuff... I'm pretty much using Proxmox as a VM host, with Ubuntu Server as the main server, mostly running Docker, and Caddy installed on the host. Most apps are stacked in /apps/appname/docker-compose.yaml with data directories mounted underneath. This just tends to simplify most of my backup/restore/migrate work.
I just don't have the need to do a lot of work on the bare server beyond basic ufw and getting Caddy and Docker running... Caddy reverse-proxies all the apps running in containers. It really simplifies my setup.
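Roughly what one of those per-app stacks looks like, with the app name, image, and ports as placeholders:

    # /apps/myapp/docker-compose.yaml
    services:
      myapp:
        image: ghcr.io/example/myapp:latest
        restart: unless-stopped
        ports:
          - "127.0.0.1:8081:8080"   # bound to localhost only; Caddy on the host proxies to it
        volumes:
          - ./data:/data            # data sits next to the compose file, easy to back up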
That's essentially what I do (with the little extra step of having a dedicated server at Hetzner, peered with my homelab over WireGuard, to use as an internet-facing proxy + offsite backup server).
Ah, also, Docker is managed with komo.do, but that is otherwise a simple GUI over docker-compose.
That's cool... I really should take the next step and bridge my home setup with my OVH server. It's a mixed bag, mostly in that the upgrade to "business"-class internet at home costs more per month than what I pay to rent the full server and IP block at OVH... But I've got a relatively big NAS at home I wouldn't mind leveraging more/better.
Aside: I really want to like Nextcloud, but as much as I like aspects of it, there is plenty I don't like as well.
I think the original vision, where containers "abstract" the platform to such an extent that you can basically deploy your dev environment, has been somewhat diminished. The complexity of the ecosystem has grown to such an extent that we need tools to manage the tools that help us manage our services.
And that's not even considering the "tied to a single corporation" problem. If us-east-1 wakes up tomorrow and decides to become gyliat-prime-1, we're all screwed, because no more npm, no more Docker, no more Cloudflare (because someone convinced everyone to deploy captchas, etc.).
Mostly agreed, and thanks for sharing your POV. One slight disagreement:
"No need for CDN etc unless you are talking about millions of requests per day."
A CDN isn't just for scale (offloading requests from the origin); it's also for user-perceived latency. Speed is arguably the most important feature. Yes, beware premature optimization... but IMHO delivering static assets from the edge, as close as possible to the user, is borderline table stakes and has been for at least a decade.
You're right, including the warning about premature optimization, but if the premise of the thread is starting from a VPS, user-perceived latency shouldn't be as wild as self-hosting in a basement or something, because odds are your VPS is on a beefy host with big links and good peering anyway. If anything, I'd use the CDN as one more layer between me and the world, but the premise also presupposed a well-hardened server. Personally, the DB and web host living together gave me the itch, but all things secure and equal, it's a small risk.
Oh you can go a lot simpler than that.
For 20 years I ran a web dev company that hosted bespoke websites for theatre companies and restaurants. We ran FreeBSD, PostgreSQL, and Nginx or the H2O server, with Sendmail.
Never an issue and had fun doing it.
Are you talking about the VPS just serving as a reverse proxy, with the server running on-prem or at home? Or are you having a reverse proxy on some VPS with a static IP send connections to other VPSs at a cloud provider? I've self-hosted toy apps at home this way, with a static-IP VPS as a reverse proxy in the middle, and it is indeed easy. With Tailscale you don't even need a static IP at home. A gigabit connection and a $100 box can easily handle plenty of load for a small app or a static site. To me, the reason I would never put something like that into production, even for a very small client, is that even in a fairly wealthy city in America, the downtime caused by electrical outages and local internet outages would be unacceptable.
Exactly, though I would rather say that you don't need a CDN unless you have tens of thousands of requests per second and your user base is global; a single powerful machine can easily handle thousands or even tens of thousands of requests per second.
The issue is network saturation. Most VPSs have limited bandwidth (1 Gbps), even if their CPUs could serve tens of thousands of requests per second.
Even 1 Gbps is plenty to handle 1,000 connections unless you're serving up video.
That's 1 Mbps per user. If your web page can't render (ignoring image loading) within a couple seconds even on a connection that slow, you're doing something wrong. Maybe stop using 20 different trackers and shoving several megabytes of JavaScript to the user.
The thread is about self-hosted CDN capabilities. Serving large images and video is what CDNs are for. I’m just talking technical limitations here, chill a little bit with the “your web page”.
You can always host your stuff on a few machines and then create a few DNS A records to load balance it on the DNS level :)
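For example, in zone-file terms (placeholder documentation-range IPs), round-robin A records look like:

    www.example.com.  300  IN  A  192.0.2.10
    www.example.com.  300  IN  A  192.0.2.11
    www.example.com.  300  IN  A  192.0.2.12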
This sometimes works, sometimes not. Because of how DNS resolution works, you're totally at the mercy of how your DNS resolver and/or application behave.
Agreed. I was being generous to the CDN lovers :). People don't know how powerful static file servers like Nginx and Caddy are. You don't need no CDN.
For me, a CDN is more valuable for avoiding huge data-transfer bills from the origin host than for keeping the endpoint from getting overwhelmed. Obviously those are related, and both could happen without a CDN, but the big bills scare me more at the end of the day.
A small business should only self-host if they are a hosting company. Everyone else should pay their local small-business self-hosting company to host for them.
This is not a job for the big guys. You want someone local who will take care of you. They also come out when a computer fails and make sure updates get applied. By "come" I mean physically sending a human to you. This will cost some money, but you should be running your business, not trying to learn computers.
I meant a small software/SAAS business. I would agree with you about a non software business. Edited my comment.
Everything you said 110%
I wish I understood why some engineers feel the need to over-engineer the crap out of everything.
Is it because of wishful thinking? They think their blog is gonna eventually become so popular that they're handling thousands of requests per second, and they want to scale up NOW?
I just think about what web servers looked like at the turn of the millennium. We didn't have all these levels of abstraction. No containers. Not even VMs. And we did it on hardware so weak it would be considered utterly worthless by today's standards.
And yet...it worked. We did it.
Now we've got hardware that is literally over 1000 times faster. Faster clocks, more cache, higher IPC, and multiple cores. And I feel like half of the performance gains are being thrown away by adding needless abstractions and overhead.
FFS...how many websites were doing just fine with a simple LAMP stack?
I think this is a moderately good idea, if you are certain that you want to remain involved with the business operationally, forever.
It's still not ever a great idea (unless, maybe, this is what you do for a living for your customers), simply because it binds your time, which will absolutely be your scarcest asset if your business does anything.
I am speaking from acute personal experience.
> No need for CDN etc unless you are talking about millions of requests per day.
Both caddy and nginx can handle 100s of millions of static requests per day on any off-the-shelf computer without breaking a sweat. You will run into network capacity issues long before you are bottlenecked by the web server software.