TFA itself has an incorrect DOCTYPE. It’s missing the whitespace between "DOCTYPE" and "html". Also, all spaces between HTML attributes where removed, although the HTML spec says: "If an attribute using the double-quoted attribute syntax is to be followed by another attribute, then there must be ASCII whitespace separating the two." (https://html.spec.whatwg.org/multipage/syntax.html#attribute...) I guess the browser gets it anyway. This was probably automatically done by an HTML minifier. Actually the minifier could have generated less bytes by using the unquoted attribute value syntax (`lang=en-us id=top` rather than `lang="en-us"id="top"`).
Edit: In the `minify-html` Rust crate you can specify "enable_possibly_noncompliant", which leads to such things. They are exploiting the fact that HTML parsers have to accept this per the (parsing) spec even though it's not valid HTML according to the (authoring) spec.
Maybe a dumb question but I have always wondered, why does the (authoring?) spec not consider e.g. "doctypehtml" as valid HTML if compliant parsers have to support it anyway? Why allow this situation where non-compliant HTML is guaranteed to work anyway on a compliant parser?
It's considered a parse error [0]: it basically says that a parser may reject the document entirely if it occurs, but if it accepts the document, then it must act as if a space is present. In practice, browsers want to ignore all parse errors and accept as many documents as possible.
[0] https://html.spec.whatwg.org/multipage/parsing.html#parse-er...
> a parser may reject the document entirely if it occurs
Ah, that's what I was missing. Thanks! The relevant part of the spec:
> user agents, while parsing an HTML document, may abort the parser at the first parse error that they encounter for which they do not wish to apply the rules described in this specification.
(https://html.spec.whatwg.org/multipage/parsing.html#parse-er...)
Because there are multiple doctypes you can use. The same reason "varx" is not valid and must be written "var x".
Same reason <ahref="/page.html"> is invalid.
For anyone else furiously going back and forth between TFA and this comment: they mean the actual website of TFA has these errors, not the content of TFA.
Sorry for confusing you!