> The TTL for NSEC records is presumably way lower than the TTL for the DS records.
Possibly. It was still an outage they had to wait out the TTL for, due to the design of DNSSEC.
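The mechanics are simple enough to sketch. Once a resolver has cached a bad signed answer, the zone operator cannot purge that remote cache; the bad data survives until its TTL runs out, no matter how fast the fix is published. A minimal illustration (the numbers are assumptions for the example, not Slack's actual TTLs, and `seconds_until_recovery` is a hypothetical helper):

```python
# Sketch: why a DNSSEC mistake lingers. Resolvers honor the TTL they
# cached, so a published fix only takes effect as remote caches expire.
# TTL values below are illustrative assumptions, not real-world data.

def seconds_until_recovery(ttl: int, seconds_since_cached: int) -> int:
    """Remaining time a resolver will keep serving the cached bad answer."""
    return max(0, ttl - seconds_since_cached)

# A resolver that cached a bad response one hour into a 24-hour TTL
# will keep failing validation for roughly another 23 hours:
print(seconds_until_recovery(86400, 3600))  # 82800 seconds
```

This is the whole point of the complaint: with negative or signed data cached downstream, "wait out the TTL" is the only remediation available.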
> It’s theoretically possible that it would not have worked for all cases, but that is, in my experience, very unlikely.
This is completely unsubstantiated speculation on your part.
> The bug seems to me to have been reasonably easy to mitigate, but their problem was that Slack did not know what they were doing.
It is indeed reasonably easy to Monday-morning quarterback someone else's outage and blame operators for the sharp edges around poorly designed protocols.
> 1. That number used to be zero, as tptacek liked to point out.
Cool. So, at this rate, in another 100 years or so we should be at 50% adoption.
> 2. The huge operators often have fundamentally different security priorities than regular companies and users.
Priorities like uptime?
> 3. People said the same about IPv6 and SSL, which were also very slow to adopt. But they are all climbing
1) People started rolling out IPv6 once v4 addresses got scarce. There is no comparable compelling event to drive DNSSEC adoption. 2) SSL is easy to roll out and provides compelling security benefits. It is also exceedingly unlikely in practice to blow up in your face and result in run-out-the-clock outages -- unlike DNSSEC.
> This is completely unsubstantiated speculation on your part.
Do you have any support for your assumption that the wildcard record was vital and practically impossible to replace with regular records?
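For what it's worth, the claim is at least mechanically plausible either way. A hypothetical zone fragment (not Slack's actual records) shows what "replacing a wildcard with regular records" would mean, and why it only works when the set of names is enumerable in advance:

```
; Hypothetical zone fragment -- illustrative names and addresses only.
; A wildcard answers for any otherwise-unmatched name under the zone:
*.example.com.      300  IN  A  192.0.2.10

; It can be replaced by explicit records only if every name that
; clients will query is known and listed up front:
app.example.com.    300  IN  A  192.0.2.10
files.example.com.  300  IN  A  192.0.2.10
```

If clients synthesize names dynamically (per-customer subdomains, for instance), there is no finite list of regular records that reproduces the wildcard's behavior, which is exactly the case where the "just replace it" mitigation fails.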
> It is indeed reasonably easy to Monday-morning quarterback someone else's outage and blame operators for the sharp edges around poorly designed protocols.
When Slack, a large company presenting itself as proficient in tech, makes a mistake so bad that it locks itself out of the internet for an entire day, a mistake even I know not to make, then I get to criticize them.