You can easily persist agent memories in a markdown file though.

And the memento guy had tattoos of key information. That didn’t make it so he didn’t have memory loss.

Pretty good metaphor.

Limited space to work with, highly context dependent and likely to get confused as you cover more surface area.

Which it will start ignoring after two or three messages in the session.

and you'll blow the context over time and send to the LLM sanitorium. It doesn't fit like the human brain can.

If a junior fucks production that will have extroadinary weight because it appreciates the severity, the social shame and they will have nightmares about it. If you write some negative prompt to "not destroy production" then you also need to define some sort of non-existing watertight memory weighting system and specify it in great detail. Otherwise the LLM will treat that command only as important as the last negative prompt you typed in or ignore it when it conflicts with a more recent command.

> and you'll blow the context over time and send to the LLM sanitorium. It doesn't fit like the human brain can.

The LLM did have this capability at training time, but weights are frozen at inference time. This is a big weakness in current transformer architectures.

Yup, and the agent will happily ignore any and all markdown files, and will say "oops, it was in the memory, will not do it again", and will do it again.

Humans actually learn. And if they don't, they are fired.

To me it sounds like a tooling problem. OP seems to be trying to use probabilistic text systems as if they enforce rules, but rule enforcement should really live outside the model. My sense is that there was a failure to verify the agent's intent.

The tooling that invokes the model should really define some kind of guardrails. I feel like there's an analogy to be had here with the difference between an untyped program and a typed program. The typed program has external guardrails that get checked by an external system (the compiler's type checker).

What tooling? It's a probabilistic text generator that runs in a black box on the provider's server. What tooling will have which guardrails to make sure that these scattered markdown files are properly injected and used in the text generation?

[deleted]

That's not learning.