My friend and I are building Rover, a search engine for petabytes of raw logs in Azure: https://roverhq.io/. The core idea is to strip away the "indexing tax" that usually makes large-scale logging so expensive. Instead of moving massive amounts of data to heavy compute nodes, we’ve designed the engine to surgically fetch only the specific bytes needed to answer a query, directly from storage.

We’ve spent a lot of time squeezing every bit of efficiency out of the hardware. By shifting the heavy lifting from memory-hungry string parsing to hardware-accelerated bitwise math, we can scan 1 TB of data in about 1.2 s for roughly $0.01. It’s been a fun challenge to see how far we can push the physics of the network and CPU to make searching massive amounts of raw data feel instantaneous.
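To give a flavor of what "bitwise math instead of string parsing" can mean, here's an illustrative sketch (not Rover's actual code, which we haven't published) of the classic SWAR zero-byte trick: it scans 8 bytes per bitwise step rather than comparing characters one at a time, which is the same idea SIMD scanners push even wider:

```python
import struct

MASK_LO = 0x0101010101010101  # low bit of each byte in a 64-bit word
MASK_HI = 0x8080808080808080  # high bit of each byte


def haszero(word: int) -> int:
    # Classic SWAR test: the result has a byte's high bit set iff that
    # byte of `word` is zero (borrows only propagate past zero bytes,
    # so the lowest set bit always marks the first zero byte).
    return (word - MASK_LO) & ~word & MASK_HI


def find_byte(data: bytes, needle: int) -> int:
    """Return the index of the first occurrence of `needle`, or -1.

    Processes the input 64 bits at a time with pure integer ops.
    """
    pattern = needle * MASK_LO  # replicate needle into all 8 byte lanes
    n = len(data) - (len(data) % 8)
    for i in range(0, n, 8):
        (word,) = struct.unpack_from("<Q", data, i)
        hit = haszero(word ^ pattern)  # XOR zeroes the matching byte lanes
        if hit:
            # lowest set bit -> byte offset of the first match
            return i + ((hit & -hit).bit_length() - 1) // 8
    # linear scan over the tail that doesn't fill a full 64-bit word
    for i in range(n, len(data)):
        if data[i] == needle:
            return i
    return -1


print(find_byte(b"hello world", ord("w")))  # -> 6
```

Because the word is unpacked little-endian, borrows travel toward later bytes, so the lowest set bit of `hit` reliably points at the earliest match. In C or Rust the same loop compiles to a handful of ALU instructions per 8 bytes, and auto-vectorization or explicit SIMD widens it further.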
If you’re dealing with similar scaling headaches and want to chat about it, my email is dhruv [at] roverhq.io or you can find more at https://roverhq.io/.