Problem is that it really eats all resources when using a llm locally. I tried it. But the whole system becomes unresponsive and slow. We need minimum of 1tb memory and dedicated processors to offload.