How does a model's performance here compare with the OpenAI API? What's the comparable? Edit: I see it's the same models you'd run locally using Ollama or something else. So it's basically just constrained by the size of the model, the GPU, and the performance of the machine.
Yes, it's very similar to the Ollama app, and the model used is Llama-3.2-1B.
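
To put a rough number on the model-size constraint, here's a back-of-envelope sketch of how much memory the weights alone would need at different precisions. It assumes a nominal 1e9 parameters for a "1B" model (the real count differs somewhat) and ignores the KV cache, activations, and runtime overhead. The function name and the precision list are just illustrative, not something taken from the app.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumption: a nominal 1e9 parameters; the actual count of a "1B" model differs a bit.
NOMINAL_PARAMS = 1e9

# Bytes per parameter for a few common precisions (illustrative list).
PRECISIONS = [("fp16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]

for name, bytes_per_param in PRECISIONS:
    gib = weight_memory_gib(NOMINAL_PARAMS, bytes_per_param)
    print(f"{name}: ~{gib:.2f} GiB for weights")
```

Whether the weights fit in GPU memory or spill to system RAM is usually the first thing that decides how fast a local model feels, so this is only a starting point, not a benchmark.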