Programmatic Tool Calling has been an obvious next step for a while. It's clear we are heading toward code as a language for LLMs, so defining that language is very important. But I'm not convinced by tool search. Good context engineering already leaves only the tools you will need in context, so adding a search step when you are going to use all of them anyway is just more overhead. What is needed is a more compact tool definition language, like, I don't know, every programming language ever in how they define functions. We also need objects (which hopefully Programmatic Tool Calling solves, or the next version will solve). In the end I want to drop objects into context with exposed methods, so the model knows the type and what is callable on that type.
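A minimal sketch of what I mean (all names here are hypothetical, just to show the shape): the runtime renders each live object into context as its type plus callable signatures, instead of a JSON-schema tool list.

```
import inspect

# Hypothetical: render a live object as "name: Type" plus its callable
# method signatures, so the model sees what it can invoke on it.
def render_object(name: str, obj: object) -> str:
    methods = [
        f"  .{m}{inspect.signature(getattr(obj, m))}"
        for m in dir(obj)
        if callable(getattr(obj, m)) and not m.startswith("_")
    ]
    return f"{name}: {type(obj).__name__}\n" + "\n".join(methods)

class Dataset:
    def head(self, n: int = 5) -> list: ...
    def filter(self, expr: str) -> "Dataset": ...

print(render_object("o1", Dataset()))
# Prints roughly:
# o1: Dataset
#   .filter(expr: str) -> 'Dataset'
#   .head(n: int = 5) -> list
```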
Why exactly do we need a new language? The agents I write get access to a subset of the Python SDK (i.e. the non-destructive parts), plus packages and custom functions. All this ceremony around tools and pseudo-RPC seems pointless given that LLMs are extremely capable of assembling code by themselves.
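The shape of that setup is pretty small, too. A rough sketch (not a real sandbox, and every name here is made up for illustration):

```
import math, statistics

# Whitelist of non-destructive callables the agent's code may use.
ALLOWED = {
    "math": math,
    "statistics": statistics,
    "read_text": lambda path: open(path).read(),  # read-only helper
}

def run_agent_code(src: str) -> dict:
    # Restricted builtins: no __import__, so model-written code can't
    # pull in anything outside the whitelist. (Illustrative only, not
    # a real security boundary.)
    env = {"__builtins__": {"print": print, "len": len, "range": range}}
    env.update(ALLOWED)
    exec(src, env)  # run the model-written code against the whitelist
    return env

run_agent_code("print(statistics.mean([1, 2, 3]))")  # -> 2
```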
Woah woah woah, you’re ignoring a whole revenue stream caused by deliberately complicating the ecosystem, and then selling tools and consulting to “make it simpler”!
Think of all the new yachts our mega-rich tech-bros could have by doing this!
my VS fork brings all the boys to the yard and they're like it's better than yours, damn right, it's better than yours
This is the most creative comment I've read on HN as of late.
<3
Thanks, most of the time when I do that people tell me to stop being silly and stop talking nonsense.
¯\_(ツ)_/¯
Exactly, instead of this mess, you could just give it something like .d.ts.
Easy to maintain, test etc. - like any other library/code.
You want structure? Just export * as Foo from '@foo/foo' and let it read .d.ts for '@foo/foo' if it needs to.
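(Rough Python analog of the same idea, since the concept isn't TypeScript-specific: a typed stub, like a .pyi file, is the entire "tool definition" the model needs. Names made up.)

```
# Hypothetical stub: signatures only, no JSON-schema ceremony.
class FooClient:
    def query(self, sql: str) -> list[dict]: ...
    def upload(self, path: str, data: bytes) -> str: ...
```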
But wait, it's also good at writing code. Give it write access to it then.
Now it can talk to SQL Server, gRPC, GraphQL, REST, JSON-RPC over WebSocket, or whatever, e.g. your USB device.
If it needs some tool, it can import or write it itself.
The next realisation may be that a more book-like, Jupyter/Pluto/Mathematica/Observable-style AI<->human interaction platform works best for the communication itself (too much raw text; it'd take you days to comprehend what it spits out in 5 minutes: better to have summary pictures, interactive charts, whatever).
With voice-to-text because poking at flat squares in all of this feels primitive.
For improved performance you can peer it with other sessions (within your team, or global/public); surely others have solved problems similar to yours, so you can grab ready-made solutions.
It already has the ability to create a tool that copies itself and can talk to the copy, so it's fair to call this system "skynet".
The latest MCP specification (2025-06-18+) introduced crucial enhancements like support for structured content and output schemas.
Smolagents makes use of this and handles tool output as objects (e.g. dict). Is this what you are thinking about?
Details in a blog post here: https://huggingface.co/blog/llchahn/ai-agents-output-schema
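A minimal sketch of what that looks like server-side (assuming the FastMCP helper from the official MCP Python SDK; the tool name and fields are made up): the return-type annotation is what lets the server advertise an output schema and return structured content.

```
from pydantic import BaseModel
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")

class Forecast(BaseModel):
    city: str
    temp_c: float

@mcp.tool()
def get_forecast(city: str) -> Forecast:
    """Structured result instead of a text blob; the SDK derives
    the output schema from the return annotation."""
    return Forecast(city=city, temp_c=21.5)
```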
We just need a simple language syntax like Python's, and models trained on it (which they already mostly are):
```
class MyClass(SomeOtherClass):
    def some_method(self, arg: str) -> str: ...  # e.g. a signature in place of a JSON-schema entry
```
That is way more compact than the JSON schema out there. Then you can have 'available objects' listed like: o1 (MyClass), o2 (SomeOtherClass) as the starting context. Combine this with programmatic tool calling and there you go: much, much more compact, binds well to actual code, and very flexible. This is the obvious direction things are going. I just wish Anthropic and OpenAI would realize it and define it / train models to it sooner rather than later.

Edit: I should also add that inline responses should be part of this too: the model should be able to emit ```<code here>``` and keep executing, with only blocking calls requiring it to stop generating until the block frees up. So, for instance, the model could run ```r = start_task(some task)```, generate other things, then ```print(r.value())``` (probably with various awaits and the like here, but you all get the point).
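A sketch of that non-blocking flow in plain Python (start_task is hypothetical, and Future.result() stands in for the r.value() above): generation only stalls at the call that actually needs the result.

```
from concurrent.futures import ThreadPoolExecutor
import time

pool = ThreadPoolExecutor()

def start_task(task: str):
    # Kick off work and return a handle immediately.
    return pool.submit(lambda: (time.sleep(1), f"done: {task}")[1])

r = start_task("some task")  # model keeps generating past this line...
# ...other generated text / code here...
print(r.result())            # ...and only blocks at this call
```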
Reminds me a bit of the problem that GraphQL solves for the frontend, which avoids a lot of round-trips between client and server and enables more processing to be done on the server before returning the result.
And introduces a new set of problems in doing so.
I completely agree. I wrote an implementation of this exact idea a couple of weeks ago: https://github.com/Orange-County-AI/MCP-DSL
I'm not sure that we need a new language so much as just primitives from AI gamedev, like behavior trees, along with the core agentic loop.
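Toy sketch of that idea (all names made up): a behavior-tree Sequence node is already enough to express a "plan, act, verify" agent loop without inventing a new language.

```
from typing import Callable, List

Status = str  # "success" | "failure"

# Sequence node: run children in order, fail fast on the first failure.
def sequence(children: List[Callable[[], Status]]) -> Callable[[], Status]:
    def tick() -> Status:
        for child in children:
            if child() == "failure":
                return "failure"
        return "success"
    return tick

agent_loop = sequence([
    lambda: "success",  # e.g. gather_context()
    lambda: "success",  # e.g. call_tool()
    lambda: "failure",  # e.g. verify_result()
])
print(agent_loop())  # -> failure
```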
Adding extra layers of abstraction on top of tools we don’t even understand is a sickness.