realistically, you can get a project started within low enough tokens, to have a long enough conversation to generate something that looks like 1.0. eventually you will reach a point where every request becomes more expensive and caching doesn't help. you'll have to truncate/prune/hoist the context however you do that, summarize is what i do and i get creative. i have absolutely no idea how anyone using agents is producing anything maintainable over a long iterative period.

this is llm bitcoin moment, you will find them raise the price so high that running these agents like you are used to now will leave you with no pants on. you need to aim for minimum context, not stuff it with everything irrelevant.