This is quite an interesting article for its omissions.

I remember the great FastCGI vs. SCGI vs. HTTP wars: I was founding a Web2.0 startup right at the time these technologies were gaining adoption, and so was responsible for setting up the frontend stack. HTTP won because of simplicity: instead of needing to introduce another protocol into your stack, you can just use HTTP, which you already needed to handle at the gateway. Now all sorts of complex network topologies became trivial: you could introduce multiple levels of reverse proxies if you ran out of capacity; you could have servers that specialized in authentication or session management or SSL termination or DDoS filtering or all the other cross-cutting concerns without them needing to know their position in the request chain; and you could use the same application servers for development, with a direct HTTP connection, as you did in production, where they'd sit behind a reverse proxy that handled SSL and authentication and abuse detection.

It also helped that nginx was lots faster than most FastCGI/SCGI modules of the time, and more robust. I'd initially set up my startup's stack as HTTP -> Lighttpd -> FastCGI -> Django, but it was way slower than just using nginx.

The use of HTTP was basically the web equivalent of the End-to-End Principle [1] for TCP/IP. It's the idea that the network and its protocols should be agnostic to what's being transmitted, with application logic living in the end nodes rather than in the intermediaries that merely filter and forward packets. This has been a very powerful principle and shouldn't be discarded lightly.

The observation the article makes is that for security, it's often better to follow the Principle of Least Privilege [2] rather than blindly passing information along. Allowlist your communications to only what you expect, so that you aren't unwittingly contributing to a compromise elsewhere in the network.

And the article is highlighting - not explicitly, but it's there - the tension between these two principles. E2E gives you flexibility, but with flexibility comes the potential for someone to use that flexibility to cause harm. PoLP gives you security, but at the cost of inflexibility, where your system can only do what you designed it to do and cannot easily adapt to new requirements.

[1] https://en.wikipedia.org/wiki/End-to-end_principle

[2] https://en.wikipedia.org/wiki/Principle_of_least_privilege

> The use of HTTP was basically the web equivalent of the End-to-End Principle [1] for TCP/IP.

I don't think the analogy works, not in the context of connection caching and multiplexing. An intermediate gateway multiplexing multiple HTTP requests over another HTTP channel, where that channel is the terminal leg directly to the listening service (i.e. requests aren't demultiplexed before hitting the application socket), fundamentally violates the logic of end-to-end in multiple ways. The analogy only works, if at all, if you preserve 1:1 connection symmetry.

All the reverse proxy exploits can be traced directly back to violating end-to-end.

If the analogy were true, then SMTP delivery across multiple MXs would be end-to-end as well. It's not, and you see many of the same issues as with reverse proxies, including message-boundary desyncs.

I guess you're trying to analogize HTTP requests to messages, but that falls apart almost immediately once you get into the hairy details. The interaction of TCP semantics, HTTP semantics, and the concrete wire protocols throws a wrench into things, with predictable consequences.

The end-to-end principle doesn't permit playing fast and loose with semantics. It demands very hard, rigid boundaries regarding state management and transport layering. That's the whole point. "Mostly" end-to-end is not end-to-end, not even a little bit.

The HTTP semantics are useful for anyone developing a web app but the wire protocol of HTTP itself is awful. Multiplexing didn’t arrive until HTTP 2.0 for example. So using HTTP for communication between a reverse proxy and a backend is very wasteful. There are also security issues, such as when different parsers disagree on where one request ends and the next begins.
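
To make the boundary problem concrete, here's a sketch of the classic "CL.TE" ambiguity, written out as a Go string constant (illustrative only; the host and payload are made up). A hop that honours Content-Length reads one request; a hop that honours Transfer-Encoding sees the body end at the empty chunk and treats the trailing bytes as the start of the next request on the shared upstream connection:

    package main

    import "fmt"

    // One byte sequence, two parses, depending on which framing rule a hop honours.
    const ambiguous = "POST / HTTP/1.1\r\n" +
        "Host: example.com\r\n" +
        "Content-Length: 13\r\n" + // CL parser: body is the next 13 bytes, then done
        "Transfer-Encoding: chunked\r\n" + // TE parser: body ends at the 0-length chunk
        "\r\n" +
        "0\r\n" +
        "\r\n" +
        "SMUGGLED" // TE parser leaves these bytes waiting, prefixed onto the next request

    func main() {
        fmt.Printf("%q\n", ambiguous)
    }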

Google for example has long wrapped HTTP into their own Stubby protocol between their frontline web servers and applications; it’s much faster and more featureful than using the HTTP wire protocol. It’s something that a typical company doesn’t need, but at sufficient scale it becomes worthwhile to adopt a different wire protocol and develop all the tooling around it.

Won't argue with that, but it's a classic example of "Worse is better" [1]. It was simple and "good enough". Being ubiquitous is often more important than being efficient.

Most of the arguments for using HTTP reverse proxying over FastCGI or SCGI came down to ubiquity. It let you do things (like connect directly to your app servers with a web browser) that you couldn't do with FastCGI.

[1] https://dreamsongs.com/RiseOfWorseIsBetter.html

> Multiplexing didn’t arrive until HTTP 2.0 for example. So using HTTP for communication between a reverse proxy and a backend is very wasteful.

HTTP 2.0 multiplexing is tcp in tcp; it's asking for trouble. Just open more connections and let tcp be your multiplex. Depending on your connection rate, you can't really sustain the full ~64k ephemeral ports from one frontend ip to each service ip:port, but if your rate isn't too high, 20-30k connections is feasible. Most http based applications don't need or benefit from anywhere near that level of concurrency between frontend and backend. But if it's not enough, you can add more ips to the frontend or backend, or more ports to the backend.
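
For what it's worth, that's roughly the behaviour you get from Go's http.Transport out of the box if you keep HTTP/2 out of the picture and just let it pool plain HTTP/1.1 connections to the backend (the limits and the backend address below are made-up placeholders, not recommendations):

    package main

    import (
        "net/http"
        "time"
    )

    func main() {
        // Plain HTTP/1.1 upstream: each in-flight request rides its own TCP
        // connection, and idle connections are kept open for reuse. TCP is the
        // multiplex; there are no streams-within-a-stream to get stuck.
        transport := &http.Transport{
            ForceAttemptHTTP2:   false,
            MaxIdleConns:        20000, // made-up ceiling, tune to your port budget
            MaxIdleConnsPerHost: 20000,
            IdleConnTimeout:     90 * time.Second,
        }
        client := &http.Client{Transport: transport}

        resp, err := client.Get("http://10.0.0.2:8080/healthz") // hypothetical backend
        if err != nil {
            panic(err)
        }
        resp.Body.Close()
    }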

I'm pretty sympathetic to the argument for FastCGI or similar as the protocol between frontend and backend though; having client-set headers clearly separated from frontend-set headers is very nice, and having clear agreement on message boundaries is of obvious value. Unless you're just doing a straight tcp proxy, in which case the PROXY protocol is good enough to transfer the original IPs and then pass the data as-is.
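
That separation is easy to see in a minimal sketch with Go's net/http/fcgi (the socket path and header name here are made up): the front end's facts about the connection arrive as FastCGI params such as REMOTE_ADDR, in a different namespace from the client's own headers, so a client can't forge them just by sending a header.

    package main

    import (
        "fmt"
        "net"
        "net/http"
        "net/http/fcgi"
    )

    func main() {
        // Hypothetical unix socket the front end's fastcgi_pass would point at.
        l, err := net.Listen("unix", "/run/app.sock")
        if err != nil {
            panic(err)
        }
        h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            // Filled from the REMOTE_ADDR param set by the front end,
            // not from anything the client could have sent as a header.
            fmt.Fprintf(w, "peer per front end: %s\n", r.RemoteAddr)
            // Client-supplied headers live in their own bucket.
            fmt.Fprintf(w, "client header: %q\n", r.Header.Get("X-Whatever"))
        })
        // fcgi.Serve speaks FastCGI's record framing on the socket, so message
        // boundaries are never a matter of parser opinion.
        if err := fcgi.Serve(l, h); err != nil {
            panic(err)
        }
    }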

> HTTP 2.0 multiplexing is tcp in tcp

It’s not. It doesn’t literally run another TCP congestion control algorithm inside a TCP tunnel. However I do agree that the implementation of multiplexing in HTTP/2.0 isn’t the best; it could have been better (for one, a single lost packet still stalls every stream on the connection).

Don’t forget http pipelining!

I don't think that is how it happened. Yes, there was an SCGI/FastCGI schism, but it was mostly the Python ecosystem that used SCGI and the rest of the world was on FastCGI. Unless you were PHP, in which case you were on mod_php because it was an unmovable juggernaut.

Apache had a FastCGI module early on, but it received little love and was not that widely used. For many people, FastCGI was synonymous with nginx and lighttpd because these webservers came with support out of the box (nginx only gained loadable modules later, much as Apache had).

When PHP finally got PHP-FPM, that gigantic ecosystem slowly started moving, sometime in the late 00s, and then FastCGI really took over. Almost. Because at the same time, the cloud era started and brought the "just use HTTP bro" mindset. Amazon had used HTTP internally since the 90s, and I would guess that carried over to AWS?

So nowadays PHP, still a silent juggernaut, is mostly on FastCGI, while most others have moved to cloud-era standards and use HTTP. Go, for example, matured at this time and all the tutorials use straight HTTP proxying.

Yes, FastCGI is the much more robust standard, but you will encounter friction if you use it in a cloud-native application. For regular servers and VMs it is still common.

To summarise, nginx!

There was a small window where everyone was trying to move off Apache/mod_perl etc, coming up with all sorts of ways to talk to the backend faster… but then nginx walked into the chat, solved the C10k problem, and brought its easy but fancy upstream handling, and that was that.

Nginx made horizontal scaling a cinch, and because of that, rewriting your backend to handle HTTP and FastCGI etc. was more effort than it was worth.

The end-to-end principle within a datacenter makes little sense and, as shown in the article, ends up enabling insecure behaviour.

It makes a lot of sense. Most large organizations are collections of independent teams, many of whom don't communicate with each other other than sending quarterly OKRs and status updates back to their VP. The E2E principle is what allows them to each do their thing, agnostic to what the other servers handling the request are doing, and then let higher levels of the organization reconfigure and provision the system based on the needs of the moment.

Large organizations have a well-known pattern for how to handle this tension between the E2E principle and the PoLP. It's a firewall. As per the E2E principle, this is a node in the system, usually placed near the outside, which is responsible for inspecting and sanitizing every request that enters the system. The input is untrusted external requests that may contain arbitrary binary data. The output is the particular subset of HTTP that forms valid requests for the server, sanitized to a minimal grammar and now trusted, because you reject every packet that wasn't a well-formed request for your particular service. As an added bonus, now you can collect stats on who is sending these malformed requests, which lets you do things like DDoS protection or calling their ISP or contacting the FBI.

The article even admits this: the right solution to untrusted headers is to strip out everything you aren't explicitly expecting at the reverse proxy. If you didn't know True-Client-IP exists, don't pass it on. Allowlist and block everything by default, don't blocklist and allow everything by default.
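
That default-deny stance is short to write down. Here's a minimal sketch with Go's httputil.ReverseProxy (the upstream address and the allowlist entries are placeholders, not a recommended set): anything not named simply never reaches the backend, and the client-IP header is asserted by the proxy rather than passed through.

    package main

    import (
        "net"
        "net/http"
        "net/http/httputil"
        "net/url"
    )

    func main() {
        backend, _ := url.Parse("http://127.0.0.1:8080") // hypothetical upstream

        // Default deny: only these request headers are ever forwarded.
        allowed := map[string]bool{
            "Accept":        true,
            "Content-Type":  true,
            "Authorization": true,
            "User-Agent":    true,
        }

        proxy := &httputil.ReverseProxy{
            Director: func(r *http.Request) {
                r.URL.Scheme = backend.Scheme
                r.URL.Host = backend.Host

                kept := http.Header{}
                for name, vals := range r.Header {
                    if allowed[name] { // keys arrive canonicalized, e.g. "Content-Type"
                        kept[name] = vals
                    }
                }
                r.Header = kept

                // The proxy sets the client IP itself; any client-sent
                // X-Real-IP or True-Client-IP was dropped above.
                if ip, _, err := net.SplitHostPort(r.RemoteAddr); err == nil {
                    r.Header.Set("X-Real-IP", ip)
                }
            },
        }
        http.ListenAndServe(":8000", proxy) // plain HTTP for the sketch; TLS omitted
    }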

Putting security-critical logic in proxies is a violation of the End-to-End Principle, not an example of it. That doesn't mean it's a bad thing; as ragall notes, the End-to-End Principle doesn't make sense here.

You're correct that if the proxy removes all unknown headers, you're safe (with HTTP/2). But that sounds extremely inconvenient - before your application can use a new header, you have to talk to the team who runs the proxy. And popular reverse proxy software doesn't do that by default so it remains a huge footgun for the unwary. All completely avoided with FastCGI.

Can you recommend a reverse proxy that supports white-listing of headers? nginx doesn't seem to.

Had to Google since it's been almost 20 years since I used nginx directly:

https://serverfault.com/questions/1033131/filter-to-only-pas...

Set proxy_pass_request_headers off, then explicitly proxy_set_header each individual header you want to forward, using the nginx variable that represents it.
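
Roughly like this (untested sketch; the backend address and the forwarded headers are just examples of the pattern, not a recommended set):

    location / {
        proxy_pass http://127.0.0.1:8080;        # hypothetical backend
        proxy_pass_request_headers off;          # drop all client headers by default
        # ...then re-add only what you expect:
        proxy_set_header Host           $host;
        proxy_set_header Accept         $http_accept;
        proxy_set_header Content-Type   $http_content_type;
        proxy_set_header Content-Length $content_length;   # needed if you accept request bodies
        proxy_set_header X-Real-IP      $remote_addr;
    }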

Or just use CloudFlare Tunnel, which gives you a bunch of other DDoS and abuse protection and keeps your app server off the public Internet.

Thank you, I somehow missed that.

> Most large organizations are collections of independent teams, many of whom don't communicate with each other other than sending quarterly OKRs and status updates back to their VP.

You describe an organizational failure, where different teams are allowed to do whatever they like instead of having a proper platform team, which can enforce security and standards for the benefit of interoperability. It's not an argument in favour of transparent end-to-end behaviour in datacenters.

What I dislike about nginx is ... the documentation. I find it virtually useless because of that.

Sadly httpd went the way of "let's make the configuration difficult"; I abandoned it when they suddenly changed the configuration format. I could have adjusted, but I switched to lighttpd instead (and past that point I let ruby autogenerate any configuration format, so technically I could return to httpd, but I don't want to). I think people who develop webservers need to think twice before forcing users to adjust to a new format. If there is a "simple" decision to willy-nilly switch the configuration format, perhaps enable e.g. yaml configuration in ADDITION, so that we don't suddenly have to deal with new if-clause config statements.

I've been copying/modifying the same nginx config file for like 15 years

Little tweak here, little tweak there...

Nginx is extremely well represented in AI training material, so virtually every decent model - even locally hosted ones - can deliver you solid answers about its config settings.

Call me an old crusty Luddite if you will, after all you'd not be wrong, but…

I feel that if I can't work something out without asking a generative ML model, then I probably don't understand it well enough to properly assess the generated answer, and if I didn't understand the documentation well enough in the first place then “verify it against the documentation” is not a suitable answer, so I probably shouldn't be self-hosting that system on the open network.

It is quite irritating that the existence of generative models is apparently becoming an acceptable excuse for inadequate documentation. Rather than suggesting that I ask copilot when the Azure documentation is lacking, perhaps MS should ask copilot to generate some better documentation (and have their human domain experts review it for correctness) so we have good documentation to work from. It strikes me that them using a bunch of LLM crunching power up-front is likely to be more efficient than a great many of us spending smaller amounts of resources each (many of us asking the same questions) at the point of consumption.