Hacker News

If you self-host, you can audit the open-source inference program you use, whether llama.cpp or another, to see exactly what it does. The same goes for whichever open-source harness you use to implement a coding assistant or other agentic workflow.

The model is just a bunch of data files; it does absolutely nothing by itself.
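To make the "just data files" point concrete, here is a minimal sketch that reads the fixed-size header of a model file. It assumes the file is in GGUF, the format llama.cpp loads, and that it uses the version 2 or later layout (4-byte magic, 32-bit version, 64-bit tensor count, 64-bit metadata count, all little-endian). The function name `read_gguf_header` is mine, not part of any library.

```python
import struct

GGUF_MAGIC = b"GGUF"   # first four bytes of a GGUF file
HEADER_SIZE = 24       # 4 (magic) + 4 (version) + 8 (tensors) + 8 (metadata)


def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header (assumes the v2+ layout)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 24 bytes of header")
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file (bad magic)")
    # "<" = little-endian, no padding; I = uint32, Q = uint64
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

You would call it on the first 24 bytes of a local file, e.g. `read_gguf_header(open(path, "rb").read(24))` with `path` pointing at whatever model you downloaded. All you get back is counts of tensors and metadata entries; there is no code in there to execute.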

If you run inference on your own hardware, you have absolute control over how the LLM is used, unlike when you use an external service provider.