This is gold, thank you. The “easy to reason about” part is exactly what I’m going for.

A couple quick questions if you don’t mind:

Roughly what volume are you pushing per device (MB/day or events/sec), and what’s your typical offline window?

What’s your biggest failure mode today: disk-full/rotate policy, encryption key handling, replay storms on reconnect, or Lambda fanout/cost?

I’m thinking Ayder could replace the “rotate → ship” backend with a durable local log + priority queues + replay, but you’re right that the hardest part is the policy (what to drop first, how to bound disk, and how to preserve critical streams). If you’re open, I’d love to learn what heuristics you ended up with.
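
To make the policy question concrete, here is a minimal sketch of one possible drop policy. It is not Ayder's API and not your setup, and every name in it (`BoundedPriorityLog`, `max_bytes`, the priority numbering) is hypothetical. It assumes an in-memory stand-in for the on-disk log, a hard byte budget, and a rule that a record can only evict data of equal or lower criticality:

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class BoundedPriorityLog:
    """In-memory stand-in for a disk-bounded local log (hypothetical sketch).

    Priority 0 is the most critical stream; higher numbers are dropped first.
    """
    max_bytes: int
    queues: dict = field(default_factory=dict)  # priority -> deque of payloads
    used: int = 0      # bytes currently buffered
    dropped: int = 0   # records lost to eviction or rejection

    def append(self, priority: int, payload: bytes) -> bool:
        size = len(payload)
        if size > self.max_bytes:
            self.dropped += 1
            return False
        # Make room by evicting less critical (or equally critical, older) data.
        while self.used + size > self.max_bytes:
            if not self._evict_one(priority):
                self.dropped += 1  # nothing evictable: reject the new record
                return False
        self.queues.setdefault(priority, deque()).append(payload)
        self.used += size
        return True

    def _evict_one(self, incoming_priority: int) -> bool:
        # Walk from least critical to most critical, and stop as soon as
        # the remaining queues are more critical than the incoming record.
        for p in sorted(self.queues, reverse=True):
            if p < incoming_priority:
                return False
            q = self.queues[p]
            if q:
                self.used -= len(q.popleft())  # drop the oldest record
                self.dropped += 1
                return True
        return False

    def replay(self):
        # On reconnect, drain critical streams first, oldest first within each.
        for p in sorted(self.queues):
            q = self.queues[p]
            while q:
                payload = q.popleft()
                self.used -= len(payload)
                yield p, payload


log = BoundedPriorityLog(max_bytes=10)
log.append(2, b"aaaa")  # low-priority telemetry
log.append(0, b"bbbb")  # critical
log.append(0, b"cccc")  # critical, evicts the low-priority record
log.append(2, b"dddd")  # rejected: only critical data is left to evict
replayed = list(log.replay())
```

A real version would also need fsync/segment handling and some rate limiting on replay to avoid storms on reconnect, which is exactly the part I'd like to compare notes on.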