As per the README, responses are not checked by the tool.
If you only received ~80 KB of responses for 10M requests, the server probably terminated the connection early, before processing all requests (nginx, for example, does this after 1k requests on a single TCP keepalive socket with the default configuration). Check the responses-XXX.txt files to see what happened. You then need to either adjust the server configuration or use multiple sockets, each carrying no more requests than the server's keepalive limit allows.
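For nginx specifically, the relevant directive is `keepalive_requests` (the per-connection request limit mentioned above). A minimal sketch of raising it for a benchmark run, with the value chosen arbitrarily high:

```nginx
http {
    # allow far more than the default number of requests per
    # keepalive connection so the benchmark socket stays open
    keepalive_requests 1000000;
    keepalive_timeout  75s;
}
```

Other servers have equivalent knobs; the point is only that the per-connection limit must exceed the number of requests you pipeline onto one socket.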
If you run this tool on the same machine as the server process, the requests file is likely held in the file-system cache (in RAM, shared by all threads), so every recv() call by the server under test is essentially a memory copy at the speed of the machine's memory bandwidth, which can easily be >>10 GB/s, i.e., millions of requests per second per connection. That is also far faster than typical servers can even parse HTTP/1.
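A quick back-of-the-envelope makes the scale concrete. The 10 GB/s figure is from the comment above; the ~100-byte request size is an assumption for a minimal GET:

```python
# How many pipelined requests per second can one connection deliver
# if bytes move at memory-bandwidth speed?
bandwidth_bytes_per_s = 10 * 10**9   # ">>10 GB/s" lower bound from the text
request_size_bytes = 100             # assumed size of a minimal GET request

requests_per_second = bandwidth_bytes_per_s // request_size_bytes
print(f"{requests_per_second:,} requests/s")  # prints 100,000,000 requests/s
```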
But highly optimized servers running straight HTTP/1 without TLS or backend logic on multiple threads should absolutely hit multiple millions of requests per second with this tool. Researching how fast an HTTP/1 server can get was the reason I made this in the first place.
Ah, I think I understand now: we are bombarding the server with HTTP/1.1 pipelining, i.e., not waiting for the server's response before sending the next request, effectively with infinite pipeline depth, since we never check any response and simply jam the server with as many requests as possible. That would explain the results. The issue is that we never check whether the server can actually process this extremely high number of requests; most of them are likely just lost and never processed by the server, so we are essentially measuring the tool's output capability, not the server's performance.
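The pipelined approach described above can be sketched in a few lines. This is a minimal illustration, not the tool's actual code; host, port, and request count are placeholders:

```python
# HTTP/1.1 pipelining sketch: write N requests back to back without
# waiting for any response, then read everything the server sends back
# and count status lines to see how many requests actually got answered.
import socket

def build_batch(n: int, host: str = "127.0.0.1") -> bytes:
    """N GET requests concatenated into one buffer, sent in one write."""
    req = f"GET / HTTP/1.1\r\nHost: {host}\r\n\r\n".encode()
    return req * n

def count_responses(data: bytes) -> int:
    """Crude count: every HTTP/1.1 response begins with a status line."""
    return data.count(b"HTTP/1.1 ")

def blast(host: str, port: int, n: int) -> int:
    with socket.create_connection((host, port)) as sock:
        sock.sendall(build_batch(n, host))  # zero round trips in between
        sock.shutdown(socket.SHUT_WR)       # signal we are done sending
        data = b""
        while chunk := sock.recv(65536):
            data += chunk
    return count_responses(data)
```

Comparing `n` against `blast(...)`'s return value shows exactly the gap discussed here: requests sent versus requests the server actually answered before closing the connection.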
I can see in the results .txt that only a small portion of the sent requests actually result in a response. Also, not every server supports HTTP/1.1 pipelining; those will flush once per request (the typical workload), while servers that do support pipelining will show much higher throughput.
Exactly. For GET requests, HTTP/1.1-conformant servers must support pipelining or close the connection.
So this is the best way to generate extreme load and stress-test the internal architecture of an HTTP/1 server. But yeah, the sendfile approach only works for this kind of testing, not in the generic case.