As a contractor on an oncall schedule, I have never worked at a company that treats oncall as truly serious business. I have only worked at 2 companies that need oncall, so I’m biased. On paper, both say it is serious and all the SLA stuff was set up, but in reality there is not enough support.

The problem is, oncall is a full-time business. It takes the full attention of the oncall engineer, whether there is an issue or not. Both companies simply treat oncall as a by-product: we just have to do it, so let’s stuff it into the sprint. The first company was slightly more serious, as we were asked to put a 2-3 point oncall task in JIRA. The second one doesn’t even do this.

Neither company really encourages engineers to read through complex code written by others, even if we do oncall for those products. Again, the first company did better: we were supposed to create a channel and pull people in, so it was OKish to not know anything about the code. The second company simply leaves oncall to do whatever they can. Neither company allocates enough time for engineers to read the source code thoroughly. And neither has good documentation for oncall.

I don’t know the culture of AWS. I’d very much like to work in an oncall environment that is serious and encourages learning.

When I was an SRE at Google, our oncall was extremely serious (if the service went down, Google was unable to show ads, record ad impressions, or do any billing for ads). It was done on a rotation and lasted 1 week (IIRC it was 9AM-9PM; we had another time zone for the alternate 12 hours). The on-call was empowered to do pretty much anything required to keep the service up and running, including cancelling scheduled downtimes, pausing deployment updates, stopping abusive jobs, stopping abusive developers, and invoking an SVP if there was a fight with another important group.

We sent a test page periodically to make sure the pager actually beeped. We got paid extra for being in the rotation. The leadership knew this was a critical step. Unfortunately, much of our tooling was terrible, which would cause false pages, or failed critical operations, all too frequently.

I later worked on SWE teams that didn't take dev oncall very seriously. At my current job, we have an oncall, but it's best-effort, business hours only.

>empowered to do pretty much anything required to keep the service up and running,

Is that really uncommon? I've been on call for many companies and many types of institutions and, as far as I can recall, I've never once been told I couldn't do something to bring a system up. It's kinda the job?

On call seriousness should be directly proportional to pay. Google pays. If smallcorp wants to pay me COL, I'll be looking at that 2AM ticket at 9AM when I get to work.

That’s pretty good. Our oncall is actually 24-hour for one week. On paper it looks very serious, but even the best of us don’t really know everything, so issues tend to drag on until the morning. We don’t get any compensation for it either. Someone who has a bad night still needs to log on the next day. There is an informal understanding to relax a bit if the night is too bad, though.

I did 24hr-for-a-week oncall for 10+ years, do not recommend.

A 12-12 rotation in SRE is a lot more reasonable for humans.

Unfortunately, 24hr-for-a-week seems to be the default everywhere nowadays; it's just not practical for serious businesses. It's just an indicator of how important uptime is to a company.

I agree. It sucks. And our schedule is actually 2 weeks in every five. One is secondary and the other is primary.

Handling my first non-prod alert bug as the oncall at Google was pretty eye opening :)

It was a good lesson in what a manicured lower environment can do for you.

Amazon generally treats on call as a full time job. Generally engineers who are on call are expected to only be on call. No feature work.

It's very team/org dependent and I would say that's generally not the case. In 6 years I have only had 1 team out of 3 where that was true. On the other two teams I was expected to juggle feature work with oncall work. Same for most teams I interacted with.

Interesting, I've been here nearly that long and on every team I've worked with, it's generally the way I described. Do engineers always do that? No. But it is the expectation.

That's actually pretty good.