Recently there was a submission (https://news.ycombinator.com/item?id=45840088) breaking down how agents are basically just a loop of querying an LLM, sometimes receiving a specially-formatted (using JSON in the example) "request to use a tool", and having the main program detect, interpret and execute those requests.
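For concreteness, a minimal sketch of that loop in Python; query_llm is a placeholder for whatever chat-completion API is actually in use, and the JSON shape of the tool request is invented:

    import json

    def query_llm(messages):
        # placeholder: call your chat-completion API of choice here
        raise NotImplementedError

    def agent_loop(task, tools):
        # conversation starts with the user's task
        messages = [{"role": "user", "content": task}]
        while True:
            reply = query_llm(messages)
            messages.append({"role": "assistant", "content": reply})
            try:
                # a specially-formatted reply is a request to use a tool
                request = json.loads(reply)
            except json.JSONDecodeError:
                return reply  # plain text means the model is done
            # detect, interpret, and execute the tool request
            result = tools[request["tool"]](**request["args"])
            # feed the tool's output back in for the next iteration
            messages.append({"role": "user", "content": f"Tool result: {result}"})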
What do "skills" look like, generically, in this framework?
Before the first loop iteration, the harness sends a message to the LLM along the lines of:

    <Skills>
    name: postgres
    description: How to work with our Postgres databases
    file: skills/postgres.md
    </Skills>

The harness then may periodically resend this notification so that the LLM doesn't "forget" that skills are available. Because the notification is only name + description + file, it's cheap in terms of tokens. The harness's ability to tell the LLM "IMPORTANT: this is a skill, so pay attention and use it when appropriate" and then periodically remind it of this is what differentiates a proper Anthropic-style skill from just sticking "If you need to do postgres stuff, read skills/postgres.md" in AGENTS.md. Just how valuable is this? Not sure. I suspect that a sufficiently smart LLM won't need the special skill infrastructure.
(Note that the skill name is not technically required; it's just a vanity/convenience thing.)
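To make the harness side concrete, here's a rough sketch in Python. The file layout (one .md per skill with the description on its first line), the <Skills> wrapper text, and the resend interval are all my assumptions, not documented behavior:

    import os

    def skills_notification(skills_dir="skills"):
        # cheap: only name + description + file, never the full skill prompt
        entries = []
        for fname in sorted(os.listdir(skills_dir)):
            if not fname.endswith(".md"):
                continue
            with open(os.path.join(skills_dir, fname)) as f:
                description = f.readline().strip()  # assume line 1 is the description
            entries.append(f"name: {fname[:-3]}\n"
                           f"description: {description}\n"
                           f"file: {skills_dir}/{fname}")
        return ("IMPORTANT: these are skills; pay attention and use them "
                "when appropriate.\n<Skills>\n" + "\n\n".join(entries) + "\n</Skills>")

    def maybe_remind(messages, turn, every=10):
        # periodically resend so the LLM doesn't "forget" skills exist
        if turn % every == 0:
            messages.append({"role": "system", "content": skills_notification()})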
> The harness's ability to tell the LLM "IMPORTANT: this is a skill, so pay attention and use it when appropriate" and then periodically remind it of this is what differentiates
... And do we know how it does that? To my understanding there is still no out-of-band signaling.
A lot of tools these days periodically insert an extra <system> message into the conversation that the user never sees. It fights context rot and keeps important things fresh.
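Roughly, in Python (the "hidden" flag and the reminder wording are invented for illustration):

    def inject_reminder(messages, every=8):
        # every N messages, append a hidden <system> note
        if messages and len(messages) % every == 0:
            messages.append({"role": "system", "hidden": True,
                             "content": "<system>Reminder: the skills listed "
                                        "earlier are still available.</system>"})

    def visible_transcript(messages):
        # the user-facing view filters the hidden messages out
        return [m for m in messages if not m.get("hidden")]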
The agent can selectively load one or more of the "skills": it pulls in a skill's full prompt only once it decides the skill should be loaded, and the skill can have accompanying scripts that the prompt describes to the LLM.
So it's basically a standardized way to bring prompts/scripts into the LLM's context, with direct support from the tooling.
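Concretely, the "selective load" part can be as small as a single tool; the name load_skill and the one-.md-file-per-skill layout here are made up:

    def load_skill(name, skills_dir="skills"):
        # tool the LLM calls to pull a skill's full prompt into context;
        # only at this point do the detailed instructions (and the
        # descriptions of any accompanying scripts) cost tokens
        with open(f"{skills_dir}/{name}.md") as f:
            return f.read()

It would plug straight into the tools dict of a loop like the one sketched earlier in the thread.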