Not OP, but I routinely load 150k tokens into context: a full sub-package to work on, plus select other files in the monorepo, e.g. front-end visualization and back-end data loader. Then I work for another 150k tokens or so, then start again.

At the end of a session, the cache hit rate is around 99.5% if Novita is not having issues.

For official DeepSeek API, 99.9% or something.

Custom harness that never compacts or otherwise doctors the history.
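For anyone wondering why an append-only history gets hit rates that high, here is a rough back-of-the-envelope sketch. It assumes an idealized prefix cache (every token sent in an earlier request of the same session counts as a hit on later ones), and the turn count and tokens-per-turn are made-up illustrative numbers, not measurements from my harness. `simulate_session` is a hypothetical helper, not part of any API.

```python
def simulate_session(initial_tokens, tokens_per_turn, turns):
    """Return (total_input_tokens, cached_tokens) for an append-only history.

    Assumption: idealized prefix caching, where the whole previous prompt
    is a cache hit because the history is never rewritten or compacted.
    """
    total_input = 0
    cached = 0
    prev_prompt = 0          # size of the previous request's prompt
    prompt = initial_tokens  # first request: the initial 150k load, all misses
    for _ in range(turns):
        total_input += prompt
        cached += prev_prompt        # the old prefix is served from cache
        prev_prompt = prompt
        prompt += tokens_per_turn    # history only ever grows at the tail
    return total_input, cached


# Illustrative numbers only: 150k loaded up front, then 300 turns that
# each append about 500 tokens, ending near 300k of context.
total, cached = simulate_session(150_000, 500, 300)
hit_rate = cached / total
print(f"{total:,} input tokens billed, hit rate {hit_rate:.2%}")
```

Under those assumptions the only misses are the initial load plus each turn's newly appended tail, so the rate lands around 99.5%. Anything that rewrites earlier history (compaction, reordering, trimming) changes the prefix, and with prefix-based caching everything after the change would miss again.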

Those numbers make sense to me... 120 million input tokens is like 120 sessions of hitting the full context limit, which seems like a lot to me, though.