Yes, I have already made deliberate cache decisions and plan to make more once it's working the way I imagine. I think the trimmed-down context will have a way bigger effect than the cache stuff, though.

As far as I understand, its caches are not a "next-turn" thing, but a TTL thing.

I made the "retrieve" tool, which pulls back previously removed content, append that content to the end of the conversation rather than put it back where it previously was. But it's a bit premature to know whether that's a real optimization.
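
A rough sketch of the reasoning behind appending, assuming the cache matches on an exact prefix of the conversation (an assumption, not something confirmed here). The message names and `shared_prefix_len` are made up for illustration and are not the actual tool code.

```python
def shared_prefix_len(old, new):
    """Count the leading messages that are identical in both conversations."""
    n = 0
    for a, b in zip(old, new):
        if a != b:
            break
        n += 1
    return n


# Hypothetical conversation after "m2" was trimmed out of the context.
trimmed = ["sys", "m1", "m3", "m4"]

# Option A: append the retrieved content to the end of the conversation.
appended = trimmed + ["m2 (retrieved)"]

# Option B: put the content back where it originally was.
reinserted = trimmed[:2] + ["m2"] + trimmed[2:]

print(shared_prefix_len(trimmed, appended))    # 4: the whole old prefix is unchanged
print(shared_prefix_len(trimmed, reinserted))  # 2: everything after "m1" has shifted
```

Under that assumption, appending leaves every earlier message reusable, while reinserting changes the prefix from the insertion point onward. Whether that matters in practice still depends on the TTL and on how often retrieval happens.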