Did you consider MCP sampling to avoid requiring your own LLM access? (for the clients that support it of course, but I think it's important and will become standard anyway)

Not totally sure I understand, but if you're talking about the snapshot command that requires an API key: we initially had it spinning up a tmux session to analyze the snapshot instead of calling the API. We switched it to the API for two reasons:

1. The API call was a couple of seconds faster than spinning up the coding agent.

2. With a separate agent you can't guarantee its behavior. We wanted to enforce that exactly one LLM call was made to read the snapshot and analyze the selector, which you can guarantee with an API call but not with a local coding agent.

Sorry, yeah, it was a bit vague. I was thinking about creating a Libretto MCP server, since MCP is the standard way to share AI tooling nowadays, and that would make it usable in more contexts.

In that case, the protocol has a feature called "sampling" that allows the MCP server (Libretto) to send completion requests back to the MCP client (the main agent/harness the user interacts with). That means Libretto would not need its own LLM API keys to work: it would piggyback on the LLMs already configured in the main harness. Sampling also supports expressing model preferences, so you can lean toward the style of model you want (smart vs. fast, etc.).
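To make that concrete, here's a sketch of the JSON-RPC request shape an MCP server would send for sampling, based on the `sampling/createMessage` method in the MCP spec. The prompt text and priority values are illustrative, not anything from Libretto:

```typescript
// Shape of an MCP "sampling/createMessage" request (simplified from the spec).
type SamplingRequest = {
  jsonrpc: "2.0";
  id: number;
  method: "sampling/createMessage";
  params: {
    messages: {
      role: "user" | "assistant";
      content: { type: "text"; text: string };
    }[];
    // Server expresses preferences; the client makes the final model choice.
    modelPreferences?: {
      hints?: { name?: string }[];
      speedPriority?: number;        // 0-1: how much to favor a fast model
      intelligencePriority?: number; // 0-1: how much to favor a smart model
    };
    systemPrompt?: string;
    maxTokens: number;
  };
};

// Hypothetical request a server like Libretto might send: one completion,
// routed through whatever LLM the user's harness already has configured.
const request: SamplingRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "sampling/createMessage",
  params: {
    messages: [
      {
        role: "user",
        content: { type: "text", text: "Analyze this snapshot and evaluate the selector." },
      },
    ],
    modelPreferences: { intelligencePriority: 0.9, speedPriority: 0.2 },
    maxTokens: 512,
  },
};

console.log(request.method);
```

Note this also preserves the "exactly one LLM call" guarantee from point 2 above: the server controls how many sampling requests it issues, unlike handing the task to a full coding agent.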