How would llm-d [1] compare to distributed-llama? Is the overhead or configuration too much to deal with for simple setups?