Disclosure: I built an OSS single-binary, HTTP-native durable event log aimed at this edge “store-and-forward + replay” problem. Repo: github.com/A1darbek/ayder

If anyone is open to a tiny design-partner pilot (30–60 min): run docker compose → ingest some telemetry → simulate outage (kill -9 / disconnect) → restart → verify replay + zero loss. I’ll do white-glove onboarding and turn the learnings into a short case study (can be anonymous).