If you'd asked me a few years ago whether anything could be an existential threat to GitHub's dominance in the tech community, I'd have quickly said no.
If they don't get their ops house in order, this will go down as an all-time own goal in our industry.
GitHub has lost at least one 9 of uptime, if not two, since last year's "existential" migration to Azure.
I'm pretty sure they don't GAF about GH uptime as long as they can keep training models on it (0.5 /s), but Azure downtime is revenue friction, so that might be a real problem.
Something this week about "oops we need a quality czar": https://news.ycombinator.com/item?id=46903802
> (0.5 /s),
Does this mean you are only half-sarcastic/half-joking? Or did I interpret that wrong?
Yes that's it.
I'm sympathetic to ops issues, and particularly sympathetic to ops issues that are caused by brain-dead corporate mandates, but you don't get to be an infrastructure company and have this uptime record.
It's extra galling that they advertise all the new buzzword-laden AI pipeline features while the regular website and Actions fail constantly. Academically, I know it's not the same people building those as fixing bugs and running infra, but leadership is just clearly failing to properly steer the ship here.
They didn't migrate yet.
Fucking REALLY?!
Migrations of Actions and Copilot to Azure completed in 2024.
Pages and Packages completed in 2025.
Migration of the core platform and databases began in October 2025 and is in progress, with traffic split between the legacy GitHub data center and Azure.
That's probably partly why things have gotten increasingly flaky: until they finish, there'll be constant background cognitive load and extra surface area for bugs from the fact that everything (especially the data) is half-migrated.
You'd think so, and we don't know about today's incident yet, but recent GitHub incidents have been attributed specifically to Azure, and Azure itself has had a lot of downtime recently, with outages lasting many hours.
True, though the even simpler explanation is that what they've migrated to is itself just unreliable.
This has those Hotmail migration vibes from the early 2000s.
And yet, somehow my wife still has a hotmail.com address 25 years later.
Is there any reason why GitHub needs 99.99% uptime? You can continue working with your local repo.
Many teams work exclusively in GitHub (ticketing, boards, workflows, dev builds). People also have entire production build systems on GitHub. There's a lot more than git repo hosting.
It's especially painful for anyone who uses GitHub Actions for CI/CD - maybe the release you just cut never actually got deployed to prod because their internal trigger didn't fire... you need to watch it like a hawk.
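A rough sketch of the hawk-watching with the gh CLI (the workflow file name here is a placeholder):

    # After cutting a release, confirm the deploy workflow actually started,
    # instead of trusting the tag-push trigger to have fired.
    gh run list --workflow deploy.yml --limit 1
    gh run watch   # pick the run, then follow it until it finishes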
I waited 2.5 hours for a webhook from the registry_packages endpoint today.
I'm grateful it arrived, but two and a half hours feels less than ideal.
I'm a firm believer that almost nothing except public services needs that kind of uptime... We've introduced ridiculous amounts of complexity into our infra to achieve it, and we've contributed to the increasing cost of both services and development itself (the barrier to entry for current juniors is insane compared to what I had to deal with in my early 20s).
What do you mean by public services?
All kinds of companies lose millions of dollars of revenue per day, if not per hour, if their sites aren't stable: Apple, Amazon, Google, Shopify, Uber, etc.
Those companies have decided the extra complexity is worth the reliability.
Even if you're operating a tech company that doesn't need to have that kind of uptime, your developers probably need those services to be productive, and you don't want them just sitting there either.
By public services I mean only important things like healthcare, law enforcement, and the fire department. Definitely not stores and food delivery. You can wait an hour or even a couple of hours for those.
> Those companies have decided the extra complexity is worth the reliability.
Companies always want more money, and yes, it makes sense economically. I'm not disagreeing with that. I'm just saying that nobody needs this. I grew up in a world where this wasn't a thing, and no, life wasn't worse at all.
Eh, if I'm paying someone to host my git web UI, and they are as shitty about it as GitHub has been recently, I'd rather pay someone else to host it or go back to hosting it myself. It's not absolutely required, but it's a differentiating feature I'm happy to pay for.
As an example, a Go build could fail anywhere if a dependency module from GitHub is not available.
Any module that is properly tagged and contains an OSS license gets stored in Google's module cache indefinitely. As long as it was go-get-ed once before, you can pull it again without going to GitHub (or any other VCS host).
Does go build not support mirrors so you can define a fallback repository? If not, why?
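It does. GOPROXY takes a list of proxies tried in order, with "direct" meaning fall back to the origin VCS host. A minimal sketch (the internal mirror URL is a made-up placeholder):

    # The default since Go 1.13: try Google's public mirror, then the VCS host.
    export GOPROXY=https://proxy.golang.org,direct
    # A comma only falls through on 404/410 responses; a "|" separator falls
    # through on any error, which is what you want for outage resilience:
    export GOPROXY='https://athens.internal:3000|https://proxy.golang.org|direct'
    go build ./...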
Lots of teams embraced actions to run their CI/CD, and GitHub reviews as part of their merge process. And copilot. Basically their SOC2 (or whatever) says they have to use GitHub.
I’m guessing they’re regretting it.
> Basically their SOC2 (or whatever) says they have to use GitHub
Our SOC2 doesn't specify GitHub by name, but it does require we maintain a record of each PR having been reviewed.
I guess in extremis we could email each other patch diffs, and CC the guy responsible for the audit process with the approval...
Every product vendor, especially those within shouting distance of security, has a wet dream: to have their product explicitly named in corporate policies.
I have cleaned up more than enough of them.
The Linux kernel uses an email based workflow. You can digitally sign email and add it to an immutable store that can be reviewed.
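Roughly, the mechanics look like this (addresses are placeholders):

    # Turn the last commit into a mailable patch, adding a Signed-off-by trailer.
    git format-patch -1 --signoff
    # Mail it to the maintainer and the list; reviews come back as replies
    # carrying Reviewed-by/Acked-by trailers, which can be PGP-signed.
    git send-email --to=maintainer@example.com --cc=list@example.com 0001-*.patch

Archive the list traffic somewhere append-only and you have a reviewable record.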
Does SOC2 itself require that, or just yours? I'm not too familiar with SOC2, but I know ISO 27001 quite well, and there are no PR-specific "requirements" to speak of. But it is something that could be included in your secure development policy.
Yeah, it’s what you write in the policy.
And it's pretty common to write in the policy, because it's pretty much a gimme, and it lets you avoid writing a whole bunch of other, equivalent quality measures into the policy.
The money I pay them is the reason.
What if you need to deploy to production urgently...
I think this is being downvoted unfairly. I mean, sure, as a company accepting payment for services, being down for a few hours every few months is notably bad by modern standards.
But the inward-looking point is correct: git itself is a distributed technology, and development using it is distributed and almost always latency-tolerant. To the extent that GitHub's customers have processes that depend on services like bug tracking, reporting, and CI to keep their teams productive, that's a bug in the customer's processes. It doesn't have to be that way, and we as a community can recognize that even if the service provider kinda sucks.
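Concretely, nothing about day-to-day git work has to stop during an outage (the fallback remote URL is hypothetical):

    # Keep committing locally; nothing here touches GitHub.
    git commit -am "keep working"
    git log origin/main..HEAD        # what's queued up to push later
    # Or push to a secondary remote in the meantime:
    git remote add fallback git@gitlab.example.com:org/repo.git
    git push fallback main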
There are still some processes that require a waterfall method for development, though. One example: you have a designer, and also a front-end developer who is waiting for a feature to be complete before coming in and starting their development. I know on HN it's common for people to be full-stack developers, or for front-end developers to work from a mockup and write the code before a designer gets involved, but there are plenty of companies that don't work that way. Even if a company is working in an agile manner, there may still come a time when work stalls until some part of a system is finished by another team or team member, especially in a monorepo. Of course they could change the organization of their project, but the time sunk into doing that (like going with microservices) is probably going to waste quite a bit more than the occasional GitHub outage.
> There are still some processes that require a waterfall method for development, though
Not on the 2-4 hour latency scale of a GitHub outage though. I mean, sure, if you have a process that requires the engineering talent to work completely independently on day-plus timescales and/or do all their coordination offline, then you're going to have a ton of trouble staffing[1] that team.
But if your folks can't handle talking with the designers over chat or whatnot to backfill the loss of the issue tracker for an afternoon, then that's on you.
[1] It can obviously be done! But it's isomorphic to "put together a Linux-style development culture", very non-trivial.
Being snapshot-based, Git has some issues being distributed in practice: patch order matters, and the commit hash is used in so many contexts that, for meaningful use with more than two people, you basically need some centralized authoritative server to resolve the order of patches.
That's... literally what a merge collision is. The tooling for that predates git by decades. The solutions are all varying levels of non-trivial and involve tradeoffs, but none of them require 24/7 cloud service availability.
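For instance, two peers can reconcile patch order directly over any transport, no central service required (the peer remote is a placeholder):

    # Fetch a collaborator's branch and replay your patches on top of it.
    git remote add alice git@somehost:alice/repo.git
    git fetch alice
    git rebase alice/main      # conflicts surface here
    # fix the files, then:
    git add -A
    git rebase --continue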
Are you kidding? I need my code to pass CI, and get reviewed, so I can move on, otherwise the PRs just keep piling. You might as well say the lights could go out, you can do paperwork.
> otherwise the PRs just keep piling
Good news! You can't create new PRs right now anyway, so they won't pile.
When in doubt - schedule a meeting about how you're unable to do work to keep doing work!
Yeah, I'm literally looking at GitLab's "Migrate from GitHub" page on their docs site right now. If there's a way to import issues and projects I could be sold.
If you're considering moving away from GitHub due to problems with reliability/outages, then any migration to GitLab will not make you happy.
> If there's a way to import issues and projects I could be sold.
That is what that feature does. It imports issues and code and more (not sure about "projects"; I don't use that feature on GitHub).
Maybe it'd be reasonable to script it using the glab and gh CLIs? I've never tried anything like that, but I regularly use the glab CLI and it's pretty comprehensive.
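Something like this might be the rough shape of it (repo names are placeholders, and it carries titles only):

    # Pull issue titles out of GitHub with gh, recreate them with glab.
    gh issue list --repo myorg/myrepo --state all --limit 1000 \
      --json title --jq '.[].title' |
    while IFS= read -r title; do
      glab issue create --repo mygroup/myrepo --title "$title" \
        --description "Imported from GitHub."
    done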
No need – it imports pretty much anything you can reliably import from GitHub, including issues and PRs (with comments): https://docs.gitlab.com/user/project/import/github/#imported...
It's not so much an ops issue as an architecture and code quality issue. If you have ever dug into the GitHub Enterprise self-hosted product, you get an idea of the mess.
This is obviously empty speculation, but I wonder if the mindless rush to AI has anything to do with the increase in outages we've seen recently.
Or maybe the mindless rush to host it in azure?
Or both!
It does. I work at Amazon and I can see the increase in outages or major issues since AI has been pushed.
This is Microsoft. They forced a move to Azure, and then prioritized AI workloads higher. I'm sure the training read workloads on GH are nontrivial.
They literally have the golden goose, the training stream of all software development, dependencies, trending tool usage.
In an age of model providers trying to train their models and keep them current, the value of GitHub should easily be in the high tens of billions or more. The CEO of Microsoft should be directly involved at this point; their franchise is at risk on multiple fronts now. Windows 11 is extremely bad. GitHub is going to lose its foundational role in modern development shortly, and early indications are that they hitched their wagon to the wrong foundational model provider.
I viscerally dislike GitHub so much at this point. I don't know how they come back from this. Major opportunity for a competitor to come around with AI-native features like context versioning.