> They're serving files to every IP address (no matter what machine is attached to it) that is capable of opening a socket and sending GET.

Legally in the US a “public” web server can have any set of usage restrictions it feels like even without a login screen. Private property doesn’t automatically give permission to do anything even if there happens to be a driveway from the public road into the middle of it.

The law cars about authorized access not the specific technical implementation of access. Which has caused serious legal trouble for many people when they make seemingly reasonable assumptions that say access to someURL/A12.jpg also gives them permission to someURL/A13.jpg etc.

...but the matter of "what the law cares about" is not really the point of contention here - what matters here is what happens in the real world.

In the real world, these requests are being made, and servers are generating responses. So the way to change that is to change the logic of the servers.

> In the real world, these requests are being made, and servers are generating responses.

Except that’s not the end of the story.

If you’re running a scraper and risking serious legal consequences when you piss off someone running a server enough, then it suddenly matters a great deal independent of what was going on up to that point. Having already made these requests you’ve just lost control of the situation.

That’s the real world we’re all living in, you can hope the guy running a server is going to play ball but that’s simply not under your control. Which is the real reason large established companies care about robots.txt etc.