I'm aware that one can't feed millions of messages into an LLM all at once. The only way to do this now is to use a RAG approach. But RAG has trade-offs: it only passes the model whatever the retrieval step picks out, so it can miss crucial information. I think the context window still matters a lot. The bigger the window, the more information you can feed in, and the quality of the answer should increase.
The point I'm trying to make is that increasing the context window will require more compute. Hence, we could still be just at the beginning of the compute/AI boom.
We might be even earlier. The 90s was a famous boom with a fast bust, but to me this feels closer to the dawn of the personal computer in the late 70s and early 80s: we can automate things now that were impossible to automate before. We might have a long time before seeing diminishing returns.
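
To put rough numbers on the context-window point, here is a back-of-the-envelope sketch. It assumes plain dense self-attention, where every token attends to every other token, so the attention cost grows with the square of the context length. The `d_model` and `n_layers` values are illustrative placeholders, not any particular model, and the function ignores the parts of the model that scale linearly with length. Real systems use various tricks to soften this, so treat it as an upper-bound intuition rather than a measurement.

```python
def attention_flops(context_len: int, d_model: int = 4096, n_layers: int = 32) -> int:
    """Rough FLOPs for the attention score and weighted-sum steps only.

    Assumption: dense attention. Computing QK^T costs about 2 * n^2 * d FLOPs,
    and the weighted sum over V costs about the same, so ~4 * n^2 * d per layer.
    d_model and n_layers are made-up illustrative defaults.
    """
    return 4 * context_len ** 2 * d_model * n_layers


baseline = attention_flops(8_000)
for n in (8_000, 32_000, 128_000, 1_000_000):
    ratio = attention_flops(n) / baseline
    print(f"{n:>9,} tokens: {ratio:,.0f}x the attention compute of 8k")
```

Under these assumptions, going from 8k to 128k tokens is 16x the context but 256x the attention compute, which is the sense in which bigger windows keep demanding more hardware.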