Hacker News

You didn't quote the interesting part:

> our implementation is it only prunes calls from > 3 user messages ago, if context is > 40K, and only if there's at least 20K tokens to be removed

Seems reasonable to me and explains why I can have long sessions (way longer than with zed agents) while still hitting cache. Opencode is just missing per-provider TTL.