Silly question: how is it different from, say, HF's transformers and similar libraries and APIs?

With HF transformers, you still need to manage the GPUs yourself.