Our agentic builder has a single tool.
It is called graphql.
The agent writes a query and executes it. If the agent does not know how to do particular type of query then it can use graphql introspection. The agent only receives the minimal amount of data as per the graphql query saving valuable tokens.
It works better!
Not only we don't need to load 50+ tools (our entire SDK) but it also solves the N+1 problem when using traditional REST APIs. Also, you don't need to fall back to write code especially for query and mutations. But if you need to do that, the SDK is always available following graphql typed schema - which helps agents write better code!
While I was never a big fan of graphql before, considering the state of MCP, I strongly believe it is one of the best technologies for AI agents.
I wrote more about this here if you are interested: https://chatbotkit.com/reflections/why-graphql-beats-mcp-for...
Whoa there, you don't need to be so sadistic to your team. It's not GraphQL, but having a document describing how your API works, including types, that is important.
I expect you could achieve the same with a comprehensive OpenAPI specification. If you want something a bit stricter I guess SOAP would work too, LLMs love XML after all.
One of my agents is kinda like this too. The only operation is SPARQL query, and the only accessible state is the graph database.
Since most of the ontologies I'm using are public, I just have to namedrop them in prompt; no schemas and little structure introspection needed. At worst, it can just walk and dump triples to figure out structure; it's all RDF triples and URIs.
One nice property: using structured outputs, you can constrain outputs of certain queries to only generate valid RDF to avoid syntax errors. Probably can do similar stuff with GraphQL.
That is also the approach we took with Exograph (https://exograph.dev). Here is our reasoning (https://exograph.dev/blog/exograph-now-supports-mcp#comparin...). We found that LLMs do a very good job of crafting GraphQL queries for the given schema. While they do make mistakes, returning good descriptive error messages make is easy for them fix queries.
This is actually a really good use of graphql!
IMO the biggest pain points of graphql are authorization/rate limiting, caching, and mutations... But for selective context loading none of those matter actually. Pretty cool!
Isn't the challenge that introspecting graphql will lead to either a) a very long set of definitions consuming many tokens or b) many calls to drill into the introspection?
In my experience, this was the limitation we ran into with this approach. If you have a large API this will blow up your context.
I have had the best luck with hand-crafted tools that pre-digest your API so you don't have to waste tokens or deal with context rot bugs.
1000%
2 years ago I gave a talk on Vector DB's and LLM use.
https://www.youtube.com/watch?v=U_g06VqdKUc
TLDR but it shows how you could teach an LLM your GraphQL query language to let it selectively load context into what were very small context windows at the time.
After that the MCP specification came out. Which from my vantage point is a poor and half implemented version of what GraphQL already is.
your use-case is NOT Everyones use-case..(working in depth across one codebase or api but instead sampling dozens of abilities across the web or with other systems) thats the thing
how is that going to work with my use case, do a web search, do a local api call, do a graphql search, do an integration with slack, do a message etc..
Can anyone recommend an open source GraphQL-based MCP/tool gateway?
> It works better!
> I strongly believe it is one of the best technologies for AI agents
Do you have any quantitative evidence to support this?
Sincere question. I feel it would add some much needed credibility in a space where many folks are abusing the hype wave and low key shilling their products with vibes instead of rigor.
I have thought about this for all of thirty seconds, but it wouldn't shock me if this was the case. The intuition here is about types, and the ability to introspect them. Agents really love automated guardrails. It makes sense to me that this would work better than RESTish stuff, even with OpenAPI.
This is a task I think is suited for a sub agent that is small in size. It can can take the context beating to query for relevant tools and return only what is necessary to the main agent thread.
Better than rest is a low bar though. Ultimately agents should rarely be calling raw rest and graphql apis, which are meant for programmatic use.
Agents should be calling one level of abstraction higher.
Eg calling a function to “find me relevant events in this city according to this users preferences” instead of “list all events in this city”.
Same in terms of time spent. The hypothesis graphql is superior passes the basic sniff test. Assuming graphql does what it says on the tin, which my understanding is it does based on my work with Ent, then the claim it’s better for tool and api use by agents follows from common sense.
If you knew GraphQL, you may immediately see it - you ask for specific nested structure of the data, which can span many joins across different related collections. This is not the case with common REST API or CLI for example. And introspection is another good reason.
I've seen a similar setup with an llm loop integrated with clojure. In clojure, code is data, so the llm can query, execute, and modify the program directly
I do think that using graphql will solve a lot of problems for people but it's super surprising how many people absolutely hate it.
GraphQL is just a typed schema (good) with a server capable of serving any subset of the entire schema at a time (pain in the ass).
It doesn’t actually require that second part. Every time I’ve used it in a production system, we had an approved list of query shapes that were accepted. If the client wanted to use a new kind of query, it was performance tested and sometimes needed to be optimized before approval for use.
If you open it up for any possible query, then give that to uncontrolled clients, it’s a recipe for disaster.
Oh, we have that too! But we call it HTTP endpoints.
GQL is an HTTP endpoint. The question is, how are you schematizing, documenting, validating, code-generating, monitoring, etc. the request and response on your HTTP endpoints? (OpenAPI is another good choice.)
Really? Hmm... where in the HTTP spec does it allow for returning an arbitrary subset of any specific request, rather than the whole thing? And where does it ensure all the results are keyed by id so that you can actually build and update a sensible cache around all of it rather than the mess that totally free-form HTTP responses lead to? Oh weird HTTP doesn't have any of that stuff? Maybe we should make a new spec, something which does allow for these patterns and behaviors? And it might be confusing if we use the exact same name as HTTP, since the usage patterns are different and it enables new abilities. If only we could think of such a name...
An HTTP Range request asks the server to send parts of a resource back to a client. Range requests are useful for various clients, including media players that support random access, data tools that require only part of a large file, and download managers that let users pause and resume a download.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Ran...
HTTP Range doesn't have anything to do with allowing a client to select a subset of fields.
The Range header isn't for requesting a subset of a resource from the server?
Let's say your REST endpoint returned an object with keys foo, bar, baz and quuz. How would you use HTTP Range to only select foo and baz?
also handy for bypassing bandwidth restrictions: capped at 100kbps? launch 1000 workers to grab chunks then assemble the survivors
that's what axel downloader does!
Etag and cache control headers?
Without wishing to take part in a pile on - I am wondering why you're using graphql if you are kneecapping it and restricting it to set queries.
Because it solves all sorts of other problems, like having a well-defined way to specify the schema of queries and results, and lots of tools built around that.
I would be surprised to see many (or any) GQL endpoints in systems with significant complexity and scale that allow completely arbitrary requests.
Shopify's GraphQL API limits you in complexity (essentially max number of fields returned), but it's basically arbitrary shapes.
OpenAPI does the same thing for http requests, with tooling around it.
With typed languages you can auto-generate OpenAPI schemas from your code.
Yep, OpenAPI is also a good choice nowadays. That’s typically used with the assumption you’ve chosen a supported subset of queries. With GQL you have to add that on top.
Probably for one of the reasons graphql was created in the first place - accomplish a set of fairly complex operations using one rather than a multitude of API calls. The set can be "everything" or it can be "this well-defined subset".
You could be right, but that's really just "Our API makes multiple calls to itself in the background"
I could be wrong but I thought GraphQL's point of difference from a blind proxy was that it was flexible.
It is flexible, but you don’t have to let it be infinitely flexible. There’s no practical use case for that. (Well, until LLMs, perhaps!)
I guess that I'm reading your initial post a little more strictly than you're meaning
I think they mean something like (or what I think of as) “RPC calls, but with the flexibility to select a granular subset of the result based on one or more schemas”. This is how I’ve used graphql in the past at least.
> I am wondering why you're using graphql if you are kneecapping it and restricting it to set queries.
Because you never want to expose unbounded unlimited dynamic queries in production. You do want a very small subset that you can monitor, debug, and optimize.
No.
It's a way to transmit a program from client to server. It then executes that program on the server side.
That sounds even worse!
I wish people at least stopped using JavaScript and stopped writing requests to back-end by hand.
Reading this was such an immediate "aha" for me. Of course we should be using GraphQL for this. Damn. Where was this comment three months ago!