Services like Cloudflare and Twilio have so many POPs globally that one or more always have an outage going on. Then there's the question of whether it's a major outage or a minor outage. Even though major status page providers like Atlassian and Incident.io have public status APIs (Cloudflare uses Atlassian), it takes more than just parsing them to determine what is "down" and at what granularity.
I run an outage detection service, and some of these issues, like parsing hundreds of (sometimes undocumented) status APIs, make for an interesting engineering problem.
With these guys you get into a weird world of "is it them, us, or upstream of both of us?" all the time. I had been using Twilio's telco partner maintenance notifications as a way of figuring out whether someone like Orange was responsible when a bunch of French endpoints had network degradation independent of Twilio.
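To make the granularity point concrete, here is a rough sketch of rolling a Statuspage-style summary payload up into "ok / minor / major". It assumes the payload has a `components` list whose entries carry `name`, `status`, and a `group` flag, with the usual status strings (`operational`, `degraded_performance`, `partial_outage`, `major_outage`, `under_maintenance`). The `summarize` function, the 25% threshold, and the sample component names are all made up for illustration; a real detector would need per-provider rules on top of this.

```python
# Severity ranking for Statuspage-style component statuses (assumed values).
SEVERITY = {
    "operational": 0,
    "under_maintenance": 1,
    "degraded_performance": 2,
    "partial_outage": 3,
    "major_outage": 4,
}


def summarize(summary, major_fraction=0.25):
    """Roll component statuses up into 'ok', 'minor', or 'major'.

    major_fraction is a hypothetical threshold: the share of leaf
    components in partial/major outage at which the whole service
    is treated as having a major outage.
    """
    # Skip group headers so a region and its POPs are not double counted.
    leaves = [c for c in summary.get("components", []) if not c.get("group")]
    if not leaves:
        return {"level": "ok", "affected": [], "total": 0}

    # Unknown status strings are treated as degraded rather than ignored.
    def rank(component):
        return SEVERITY.get(component.get("status"), 2)

    affected = [c["name"] for c in leaves if rank(c) >= 2]
    down = sum(1 for c in leaves if rank(c) >= 3)

    if not affected:
        level = "ok"
    elif down / len(leaves) >= major_fraction:
        level = "major"
    else:
        level = "minor"
    return {"level": level, "affected": affected, "total": len(leaves)}


# Made-up sample payload: one POP out of five is in partial outage.
sample = {
    "components": [
        {"name": "Europe", "status": "partial_outage", "group": True},
        {"name": "CDN/Cache", "status": "operational", "group": False},
        {"name": "DNS", "status": "operational", "group": False},
        {"name": "Paris POP", "status": "partial_outage", "group": False},
        {"name": "Frankfurt POP", "status": "operational", "group": False},
        {"name": "London POP", "status": "operational", "group": False},
    ]
}

result = summarize(sample)
print(result["level"], result["affected"])
```

Even this toy version shows why parsing alone is not enough: whether one POP being down counts as "down" depends on a threshold that the status API does not give you.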