There are a few problems with that.

1. The closer the context gets to full, the worse the model performs.

2. The more context it has, the less weight it gives any individual item.

That is, Claude might learn that you hate long functions and add a line to memory about preferring short functions. When that is the only instruction in the context, it is likely to follow it very closely. But when it's one piece of a much longer context, it is much more likely to be ignored.

3. Tokens cost money, even if you are currently being subsidized.

4. You have no idea how new models and new system prompts will perform with your current memory.md file.

5. Unlike learning something yourself, anything you teach Claude is likely to end up controlled by your employer. They might not let you take it with you when you leave.

> 3. Tokens cost money, even if you are currently being subsidized.

Keep in mind that those 50k memory tokens would likely be cached after the first run, and thus be significantly cheaper.
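A rough sketch of that caching math, using assumed prices (the $3/MTok uncached rate and 10% cache-read discount are illustrative placeholders; real per-model rates differ):

```python
# Back-of-envelope cost of re-sending a 50k-token memory file on every run,
# with and without prompt caching. All prices below are assumptions.

def input_cost(tokens: int, price_per_mtok: float) -> float:
    """Cost in dollars for a given number of input tokens."""
    return tokens / 1_000_000 * price_per_mtok

MEMORY_TOKENS = 50_000
UNCACHED_RATE = 3.00   # assumed $/MTok for uncached input
CACHED_RATE = 0.30     # assumed cache-read rate, 10% of the base price

per_run_uncached = input_cost(MEMORY_TOKENS, UNCACHED_RATE)  # $0.15/run
per_run_cached = input_cost(MEMORY_TOKENS, CACHED_RATE)      # $0.015/run

# Over 200 runs a day, the memory file alone would cost:
daily_uncached = 200 * per_run_uncached  # $30.00
daily_cached = 200 * per_run_cached      # $3.00
print(f"uncached: ${daily_uncached:.2f}/day, cached: ${daily_cached:.2f}/day")
```

So caching takes the sting out of point 3, but the memory file is still a recurring per-run cost, not a one-time one.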