Koji, an AI Backend Orchestration/Management/Proxy layer: https://github.com/danielcherubini/koji

I've been running models on my homelab for a while now, but none of the available options out there were what I wanted. I wanted something I could command from the CLI, an API, or the web, so an agent could go in and do work remotely via SSH, or I could do it myself via a web interface.

I wanted to know when models have been updated, and when backends (llama.cpp, ik_llama.cpp) have been updated, to see what those updates are and choose whether to apply them. I also wanted the ability to switch between versions of those backends, so if I hit a regression or a performance issue, I could roll back.

I've also published plugins for OpenCode and Pi so that model discovery is automatic as well.

I'm building this mostly for me, as usual.