I do use both DSv4 variants, the "normal" one and flash, non-locally. They work well, though not exceptionally. And while they're cheap, the difference between $1 per month and $5 per month is not a big concern to me. IMO pricing is pretty competitive among open-weight models: https://huggingface.co/inference/models
It depends on the use case, but for me there are 2 cases where a local model is a must, not optional:
- Running offline without internet access: for example, I have this project that transcribes and summarizes audio in real time. I've already used it at some events where wifi was not available: https://github.com/ngxson/llama.cpp-realtime-audio-recap
- Handling private personal data, for example health records. This falls under the same "privacy" category that you mentioned, but I just want to bring up the fact that people value their privacy differently.