This is way more interesting to me as well. I have projects that use small limited-purpose language models that run on local network servers and something like this project would be a lot simpler than manually configuring API clients for each model in each project.
Thanks for raising it! Since vLLM has an OpenAI-compatible API, this should work for now:
I'll add a more convenient way to configure it in the coming days.