OpenAI API also supports defer_loading https://developers.openai.com/api/docs/guides/tools-tool-sea...

And it's not actually necessary for it to exist at the API level. It's a pattern. Making it API-side is just an optimization.

To do it client-side: 1. Define a single tool, tool_search 2. List the names of your deferred tools in context (or tool_search's description) 3. When tool_search is called, match the query against the tool names (or names + descriptions) 4. Append the matched tool def to the context in a new <system>-esque tag

Claude Code (as of the leak) does this client side. You can even see the custom matching function and A/B tests about whether to include the descriptions.

Whether or not that tool definition comes from MCP or a local definition is kind of beside the point.