I wish every instruction and response had an enable/disable checkbox so that I could disable parts of the conversation and have them excluded from the context.
Let's say I submit a piece of code, or let it create one, and we're working on improving it. At some point I consider the code significantly better than what I had initially, so all those initial interactions containing the old code could be removed from the context.
I like how Google AI Studio allows one to delete sections so they are no longer part of the context. Not possible in Claude, ChatGPT or Gemini; there, I think, one can only delete the last response.
Maybe even AI could suggest which parts to disable.
> I like how Google AI Studio allows one to delete sections so they are no longer part of the context. Not possible in Claude, ChatGPT or Gemini; there, I think, one can only delete the last response.
I have the same peeve. My assumption is that the ability to freely edit context is seen as unintuitive for most users - LLM products want to keep up the illusion of a classic chat UI, where that kind of editing doesn't make sense. I do wish ChatGPT & co had a pro or advanced mode that was more similar to Google AI Studio.
/compact does most of that, for me at least
/compact we will now work on x, discard y, keep z
The trouble with compact is that no one really knows how it works or what it does. Hence, for me at least, there is just no way I would ever allow my context to get there. You should seriously reconsider ever using compact (I mean this literally) - the quality of CC at that point is an order of magnitude worse, and you are doing yourself a significant disservice.
When you hit ^O after compact runs (or anytime), it tells you exactly what compact did, so it isn’t that mysterious
If you actually hit the compact (you should never get there no matter what, but for the sake of argument), more often than not you'll see CC going off the rails immediately after compacting is done. It doesn't even know what it did, let alone you :)
You mean you never stay in a CC session long enough to even see the auto compaction warning?
100% - read this - https://blog.nilenso.com/blog/2025/09/15/ai-unit-of-work/
I exit and restart CC all the time to get a “Fresh perspective on the universe as it now is”.
Isn't /clear enough to do that? I know some permissions survive from previous sessions, but it has served me well.
The one time I tried it, I felt like /clear may have dropped all my .claude files as well, but I didn't look at it closely.
I went with an extra CURRENT.md for whatever extra info might be useful for what I am working on, and I frequently /clear after each very small task. /compact is rarely used unless there is a reason to maintain a summary of what it is working on.
Each new prompt involves asking Claude to read CURRENT.md for additional context.
I'm not sure if I should move this to CLAUDE.md, but the stuff in CURRENT.md is very short-term information that becomes useless after a while.
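For illustration, a rough sketch of what my CURRENT.md tends to look like (a made-up example; the headings and file names are just my own convention, nothing Claude Code requires):

```
# CURRENT - short-lived context, trimmed after each small task

## Task
Refactor the session-cache invalidation in api/cache.py; keep the public API unchanged.

## Relevant files
- api/cache.py (the code being changed)
- tests/test_cache.py (must stay green)

## Notes / gotchas
- The TTL is in seconds, not milliseconds.
- Don't touch the legacy pickle path yet.
```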
---
There was one time where Claude entirely messed up the directory when moving things around, and it got stuck in a weird "panic" loop in chat for quite a while (involving "oh no" / "oh dear" in chat). Nothing git can't fix, but I suspect it was due to the directory info in CLAUDE.md getting stale. Ever since then I've moved things that might get stale to a separate file, and I frequently keep it updated/trimmed as needed.
I do as well, though with Codex, but OP is asking for more fine-grained control over what's in the context and what can be thrown away.
You can simulate this, of course, by doing the reverse: maintaining explicit memory, via markdown files or whatever, of what you want to keep in context. I could see wanting both, since a lot of the time it would be easier to just say "forget that last exploration we did" while still having it remember everything from before that. Think of it like an exploratory twig on a branch that you don't want to keep.
Ultimately I just adapt by making my tasks smaller, using git branches and committing often, writing plans to markdown, etc.
I kind of do this, semi-manually, when using the web chat UIs (which happens less and less). I basically never let the conversations go above two messages in total (one message from me + one reply), since the quality of responses goes down so damn quickly, and if anything is wrong, I restart the conversation and fix the initial prompt so it gets it right. And rather than manually writing my prompts in the web UIs, I manage prompts with http://github.com/victorb/prompta, which makes it trivial to edit the prompts as I find out the best way of getting the response I want, together with some simple shell integrations to automatically include logs, source code, docs and whatnot.
I work similarly. I keep message rounds short (1-3) and clear often. If I have to steer the conversation too much, I start over.
I built a terminal tui to manage my contexts/prompts: https://github.com/pluqqy/pluqqy-terminal
I think the main issue with removing certain previous responses from the context would be that you no longer hit the cache for a large part of your chat history, which makes responses much more expensive and slower.
It's faster and cheaper (in most cases) to leave the history as is and hit the cache.
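To put rough numbers on it - a minimal sketch with made-up figures; real prices and cache discounts vary by provider, and prompt caches typically only match an unchanged prefix of the conversation, so deleting a message near the top invalidates everything after it:

```python
# Made-up numbers to illustrate the prefix-cache argument; real prices and
# cache discounts differ per provider and model.
PRICE_PER_MTOK = 3.00        # assumed $ per 1M uncached input tokens
CACHED_PRICE_FACTOR = 0.1    # assumed: cached tokens cost 10% of the full rate

def input_cost(history_tokens: int, cached_prefix_tokens: int) -> float:
    """Cost of resending `history_tokens` of context when only the first
    `cached_prefix_tokens` of it still match the provider's prompt cache."""
    uncached = history_tokens - cached_prefix_tokens
    effective = cached_prefix_tokens * CACHED_PRICE_FACTOR + uncached
    return effective * PRICE_PER_MTOK / 1_000_000

history = 80_000  # tokens of chat history so far

# History left untouched: the whole prefix is still cached.
print(f"{input_cost(history, cached_prefix_tokens=history):.3f}")  # ~0.024

# A message near the top deleted: everything after the edit point no longer
# matches the cached prefix and is billed at the full rate.
print(f"{input_cost(history, cached_prefix_tokens=5_000):.3f}")    # ~0.227
```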
The SolveIt tool [0] has a simple but brilliant feature I now want in all LLM tools: a fully editable transcript. In particular, you can edit the previous LLM responses. This lets you fix the lingering effect of a bad response without having to back up and redo the whole interaction.
[0] https://news.ycombinator.com/item?id=45455719
FWIW, in Claude Desktop you can edit a previous user message and Claude will fork the conversation from that point. I know it's not quite what you are asking for, but it's something.
There are 3rd-party chat interfaces out there with much better context controls, if it matters enough to you that you're willing to resort to direct API usage.
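At the API level this is trivial, since the "conversation" is just a list of messages you assemble for each request: you can drop or rewrite earlier turns, including the model's own replies, before resending. A minimal sketch with the OpenAI Python SDK (the model name and message contents are illustrative):

```python
from openai import OpenAI

client = OpenAI()

history = [
    {"role": "user", "content": "Review this function for bugs: ..."},
    {"role": "assistant", "content": "(an unhelpful reply you want to rewrite)"},
    {"role": "user", "content": "That missed the off-by-one error."},
]

# Nothing forces you to keep the transcript as the model produced it:
# rewrite the bad assistant turn (or delete stale turns entirely)
# before sending the next request.
history[1]["content"] = "The loop bound should be len(items) - 1, not len(items)."
history.append({"role": "user", "content": "Good. Now add a unit test for it."})

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative; any chat model works the same way
    messages=history,
)
print(response.choices[0].message.content)
```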
Related: it feels like AI Studio is the only mainstream LLM frontend that treats you like an adult. Choose your own safety boundaries, modify the context and system prompt as you please, clear rate limits and pricing, etc. It's something you come to appreciate a lot, even if we're in the part of the cycle where Google's models aren't particularly SOTA right now.
Not sure if msty counts as mainstream but it has so many quality of life enhancements it’s bonkers.
How are they not SOTA? They're all very similar, with ChatGPT being the worst (for my use case anyway) - doing things like adding lambdas and random C++ function calls into my Vulkan shaders.
Gemini 2.5 Pro is the most capable for my use case in PyTorch as well. The large context and much better instruction following for code edits make a big difference.
Gemini 2.5 Pro is generally non-competitive with GPT-5-medium or Sonnet 4.5.
But never fear, Gemini 3.0 is rumored to be coming out Tuesday.
The random tweets I've seen from people said Oct 9th, which is Thursday. I suppose we will know when we know.
based on what? LLM benchmarks are all bullshit, so this is based on... your gut?
Gemini outputs what I want with about the same regularity as the other bots.
I'm so tired of the religious thinking around these models. show me a measurement.
> LLM benchmarks are all bullshit
> show me a measurement
Your comment encapsulates why we have religious thinking around models.
Please tell me this comment is a joke.
They introduced removing messages from the end of the stack, but not from the beginning.
I’ll be releasing something shortly that does this, plus more.