Chunking is still relevant, because you want your tool calls to return results specific to the needs of the query.

If you want to know "how are tartans officially registered?", you don't want to feed the entire 554 KB Wikipedia article on Tartan to your model: that's about 138,500 tokens, roughly 35% of GPT-5's context window, at significant monetary and latency cost. You want to feed it just the "Regulation>Registration" subsection and get an answer on the order of 1000x cheaper and faster.
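To make that concrete, here is a minimal sketch of heading-based chunking in Python. Everything in it is illustrative: `ARTICLE` is a made-up stand-in with placeholder text (not the real Wikipedia page), `split_sections` and `estimate_tokens` are hypothetical helper names, the markdown-style `#` headings are an assumed input format, and the 400,000-token window is an assumption about GPT-5. The 138,500 figure is consistent with the common rule of thumb of about 4 bytes per token, which is what `estimate_tokens` uses.

```python
import re

# Made-up stand-in for a long article; the real page is far larger.
ARTICLE = """\
# History
Placeholder history text.
# Regulation
Placeholder overview.
## Registration
Placeholder text about how registration works.
## Trademarks
Placeholder trademark text.
"""

HEADING = re.compile(r"^(#+)\s+(.*)$")


def split_sections(text):
    """Map a heading path such as 'Regulation>Registration' to its body text."""
    sections = {}
    path = []  # headings from the top level down to the current one
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            # Keep only the ancestors of this heading, then add the heading itself.
            path = path[: level - 1] + [match.group(2).strip()]
            sections[">".join(path)] = []
        elif path:
            sections[">".join(path)].append(line)
    return {key: "\n".join(body).strip() for key, body in sections.items()}


def estimate_tokens(n_bytes):
    """Rough heuristic (an assumption): about 4 bytes of English text per token."""
    return n_bytes // 4


sections = split_sections(ARTICLE)
chunk = sections["Regulation>Registration"]  # the only text the tool call returns

full_article_tokens = estimate_tokens(554_000)  # the whole 554 KB article
context_window = 400_000  # assumed GPT-5 context window, in tokens
print(chunk)
print(full_article_tokens, f"{full_article_tokens / context_window:.1%}")
```

A real tool would use an actual tokenizer instead of the 4-bytes heuristic and would fetch sections from the source's own markup, but the shape is the same: index the document by heading path once, then return only the section the query needs.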

But you could. For that example, since it's not that complicated a question, you could just use a much cheaper model, Gemini Flash for example, and pass it the entire article. Models will only get cheaper and context windows will only get bigger.