Granite Switch is an open-source IBM Research project for composing several task-specific LoRA adapters into a single deployable Granite model checkpoint.
The goal is to get the accuracy benefits of several fine-tuned models without deploying and maintaining a separate model for each task. Granite Switch adds control tokens and a small switch layer that decides which adapter weights to apply, so different capabilities can be activated inside one model.
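To make the switching idea concrete, here is a minimal conceptual sketch in plain Python. It is not the project's implementation: the control token strings, the function names (`select_adapter`, `switched_linear`), and the choice to route on the last control token seen are all assumptions made for illustration. It only shows the general pattern of one shared base weight plus a low-rank LoRA update chosen by a control token.

```python
from typing import Dict, List, Optional, Tuple

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


# Hypothetical control tokens mapped to adapter names (illustrative only;
# the real project defines its own tokens).
CONTROL_TOKENS: Dict[str, str] = {
    "<|task_a|>": "task_a",
    "<|task_b|>": "task_b",
}


def select_adapter(tokens: List[str]) -> Optional[str]:
    """Toy 'switch': return the adapter named by the last control token seen,
    or None if the input contains no control token."""
    active = None
    for tok in tokens:
        if tok in CONTROL_TOKENS:
            active = CONTROL_TOKENS[tok]
    return active


def switched_linear(
    x: Vector,
    base_w: Matrix,
    adapters: Dict[str, Tuple[Matrix, Matrix]],
    active: Optional[str],
) -> Vector:
    """Compute W x, plus the LoRA update B (A x) of the active adapter if any."""
    y = matvec(base_w, x)
    if active is None:
        return y  # no adapter selected: plain base-model behavior
    a, b = adapters[active]  # A is (rank x in), B is (out x rank)
    delta = matvec(b, matvec(a, x))
    return [yi + di for yi, di in zip(y, delta)]


# One shared base weight and two rank-1 adapters stored side by side.
BASE_W: Matrix = [[1.0, 0.0], [0.0, 1.0]]
ADAPTERS: Dict[str, Tuple[Matrix, Matrix]] = {
    "task_a": ([[1.0, 0.0]], [[0.0], [2.0]]),
    "task_b": ([[0.0, 1.0]], [[1.0], [0.0]]),
}

if __name__ == "__main__":
    prompt = ["<|task_a|>", "summarize", "this"]
    print(switched_linear([3.0, 4.0], BASE_W, ADAPTERS, select_adapter(prompt)))
```

The point of the sketch is the deployment shape: all adapters live in one checkpoint next to the base weights, and the choice among them is made from the input itself rather than by loading a different model.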
The composed model is designed to work with Hugging Face and vLLM, and the project includes ready-to-use adapters and pre-composed Granite Switch models.
Repo: https://github.com/generative-computing/granite-switch