This seems like it would be useful for subagents as well. You could still allow an agent down the line to inspect the thinking traces/steps of a subagent by creating a mapping of the content, keeping it compressed but accessible if requested.
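A rough sketch of what such a mapping could look like, in Python. Everything here is hypothetical: `TraceStore`, `compress`, and the `trace_ref` field are names made up for illustration, and using the last step as the summary is only a placeholder for a real summarizer.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class TraceStore:
    """Hypothetical store mapping a short handle to a subagent's full trace."""
    _traces: dict[str, list[str]] = field(default_factory=dict)

    def put(self, steps: list[str]) -> str:
        # Derive the handle from the content so the same trace gets the same key.
        handle = hashlib.sha256("\n".join(steps).encode()).hexdigest()[:12]
        self._traces[handle] = list(steps)
        return handle

    def get(self, handle: str) -> list[str]:
        # Return a copy so callers cannot mutate the stored trace.
        return list(self._traces[handle])


def compress(steps: list[str], store: TraceStore) -> dict[str, str]:
    """Build the compact record a downstream agent would see by default."""
    # Assumption: the last step stands in for a real summary.
    return {"summary": steps[-1], "trace_ref": store.put(steps)}
```

A downstream agent would normally read only `summary`, and call `store.get(record["trace_ref"])` when it actually needs the full steps.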
Our system uses sub-agents as a core part of its architecture.
That terminology can be confusing, because in other cases (and sometimes in our own architecture, like when executing thousands of operations via MAP) a sub-agent may be a smaller model given less complex individual tasks.
But the core mechanism we use for simulating unlimited context is to allow the main model to spin up instances of itself (sub-agents) with the previously summarized portion of the context expanded into its full, uncompressed state.
Expanding summaries into full text in sub-agents rather than the main thread is a critical part of our architecture, because it prevents the main context window from filling up.
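To make that concrete, here is a minimal Python sketch of the idea, not our actual implementation. `Segment`, `main_context`, and `ask_sub_agent` are illustrative names, the `model` argument is a stand-in for whatever call runs the model, and expanding a single segment per sub-agent is an assumption made to keep the example small.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Segment:
    summary: str    # what the main thread keeps in its context
    full_text: str  # uncompressed original, stored outside the context


def main_context(segments: list[Segment]) -> str:
    # The main thread only ever sees the summaries.
    return "\n".join(s.summary for s in segments)


def ask_sub_agent(
    segments: list[Segment],
    index: int,
    question: str,
    model: Callable[[str], str],
) -> str:
    # Expand only the requested segment; every other segment stays summarized.
    parts = [
        s.full_text if i == index else s.summary
        for i, s in enumerate(segments)
    ]
    prompt = "\n".join(parts) + "\n\nQuestion: " + question
    # Only the sub-agent's answer flows back to the main thread,
    # so the expanded text never enters the main context window.
    return model(prompt)
```

The point of the design is in the last line: the expanded text lives only in the sub-agent's prompt, and the main thread pays for the answer, not for the expansion.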