The JSON output makes this easy to wrap as a tool for frameworks like LangGraph, but I would be worried about latency. Since it is a CLI, each invocation starts a fresh process, so you are likely reloading the whole model every time. That startup cost is paid on every tool call, whereas a persistent service loads the model once and keeps it in memory.
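To make the concern concrete, here is a rough sketch of the two patterns. I don't know the tool's actual command line, so `FAKE_CLI` is a stand-in: a child Python process that sleeps 0.2 s to simulate a model load and then prints JSON. `call_cli`, `PersistentModel`, and `time_calls` are hypothetical names for illustration, and the 0.2 s figure is made up; real model load times are often much longer.

```python
import json
import subprocess
import sys
import time

# Stand-in for the real CLI (assumption): a child process that "loads a model"
# (simulated with a 0.2 s sleep) and prints a JSON result to stdout.
FAKE_CLI = [
    sys.executable, "-c",
    "import json, sys, time; time.sleep(0.2); "
    "print(json.dumps({'input': sys.argv[1], 'ok': True}))",
]


def call_cli(text: str) -> dict:
    """CLI pattern: spawn a new process per call and parse its JSON output."""
    proc = subprocess.run(
        FAKE_CLI + [text], capture_output=True, text=True, check=True
    )
    return json.loads(proc.stdout)


class PersistentModel:
    """Service pattern: pay the load cost once, then answer from memory."""

    def __init__(self) -> None:
        time.sleep(0.2)  # simulated one-time model load

    def __call__(self, text: str) -> dict:
        return {"input": text, "ok": True}


def time_calls(fn, n: int = 3) -> float:
    """Return the total wall-clock seconds spent on n sequential calls."""
    start = time.perf_counter()
    for i in range(n):
        fn(f"query {i}")
    return time.perf_counter() - start


if __name__ == "__main__":
    model = PersistentModel()  # load cost paid here, outside the timed loop
    print(f"CLI per call:    {time_calls(call_cli):.2f} s for 3 calls")
    print(f"Persistent model: {time_calls(model):.4f} s for 3 calls")
```

Either function can be registered as a tool in an agent framework; the difference is that the CLI version repeats the load on every call, while the persistent version pays it once at startup.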