Any computer with a display has a GPU.
Sure, but integrated graphics usually lacks dedicated VRAM for LLM inference; it shares ordinary system RAM with the CPU.
Which means inference would run at approximately the same speed as the suggested CPU inference engine (just with the compute offloaded to the iGPU), since both are bound by the same system memory bandwidth.
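A rough back-of-envelope sketch of why (all numbers are illustrative assumptions, not benchmarks): generating each token streams the full set of weights through memory once, so the throughput ceiling is memory bandwidth divided by model size, whichever processor does the math.

    # Back-of-envelope ceiling for memory-bandwidth-bound token generation.
    # All figures are illustrative assumptions, not measurements.

    def max_tokens_per_sec(model_size_gb: float, bandwidth_gbs: float) -> float:
        """Each token streams every weight from RAM once, so throughput
        is capped at bandwidth / model size."""
        return bandwidth_gbs / model_size_gb

    model_gb = 4.0        # assumed: 7B model at 4-bit quantization ~= 4 GB
    bandwidth_gbs = 50.0  # assumed: dual-channel DDR4-3200, ~50 GB/s

    print(f"~{max_tokens_per_sec(model_gb, bandwidth_gbs):.0f} tokens/s ceiling")
    # Same ceiling whether the CPU or the iGPU runs the matmuls, since both
    # read the weights through the same memory controller.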