That's DS4 Flash, right? How does it feel in intelligence and speed compared to DS4 Flash hosted by DeepSeek themselves or another API provider? I've been using DS4 Flash over the API for a lot of personal projects and have been quite impressed. I've spent $1 building ~10 toy projects and gotten them all to work within the bounds of what I wanted, without having to do much besides guide the model away from dumb loops.

I'm using the DS4 Flash IQ2 2-bit quant, per Salvadore's recommendations for my hardware in the repo. I haven't tried the cloud-hosted variant. The only other paid API I have used is a $20 Anthropic sub, primarily with whatever the latest version of Sonnet is. For the most part, this local configuration feels on par with that.

With this configuration (set up over the last month), I have been working on Python data processing tools, an internal, data-intensive Svelte 5/SvelteKit BI app, and some smaller Rust projects. It's been doing really well there.