> They are cute but for serious coding they tend to waste your expensive time.

90% of corporate job tasks are trivial enough that Haiku can handle them.

Just this morning I implemented a reprint feature in our warehouse management system, which needed to print the carrier labels and delivery notes for a specific order again.

It essentially had to follow the same workflow as printing, but instead of generating and uploading the PDFs, it only had to fetch and print them.
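A rough sketch of what that kind of change looks like, in Python. Every name here (`DocumentStore`, `Printer`, `render_pdf`, the key layout) is hypothetical and only illustrates the shape of the task, not the actual system's code:

```python
from dataclasses import dataclass, field

# Hypothetical document types; the real system may have more or different ones.
DOC_TYPES = ("carrier_label", "delivery_note")


@dataclass
class DocumentStore:
    """Stand-in for wherever the generated PDFs get uploaded (assumption)."""
    files: dict[str, bytes] = field(default_factory=dict)

    def upload(self, key: str, pdf: bytes) -> None:
        self.files[key] = pdf

    def fetch(self, key: str) -> bytes:
        return self.files[key]


@dataclass
class Printer:
    """Stand-in printer that just records the jobs it receives."""
    jobs: list[bytes] = field(default_factory=list)

    def print_pdf(self, pdf: bytes) -> None:
        self.jobs.append(pdf)


def doc_key(order_id: str, doc_type: str) -> str:
    return f"{order_id}/{doc_type}.pdf"


def render_pdf(order_id: str, doc_type: str) -> bytes:
    # Placeholder for the expensive generation step.
    return f"{doc_type} for order {order_id}".encode()


def print_order(order_id: str, store: DocumentStore, printer: Printer) -> None:
    """Original workflow: generate, upload, then print each document."""
    for doc_type in DOC_TYPES:
        pdf = render_pdf(order_id, doc_type)
        store.upload(doc_key(order_id, doc_type), pdf)
        printer.print_pdf(pdf)


def reprint_order(order_id: str, store: DocumentStore, printer: Printer) -> None:
    """Reprint workflow: skip generation and upload, only fetch and print."""
    for doc_type in DOC_TYPES:
        printer.print_pdf(store.fetch(doc_key(order_id, doc_type)))
```

The point is that the reprint path reuses the same loop and only swaps the generate-and-upload step for a fetch, which is why it counts as a small task.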

Opus 4.8 on high took 24m1s and 87k tokens. Haiku took 6m30s and half the tokens.

So I'm not really sure what you mean by "wasting your expensive time" here. I think you don't really experiment with these tools and just assume higher effort, bigger model => time saved, but that's only true when tasks are big and complex enough that a smaller, less precise model would fail or land work of much lower quality.

Unfortunately there's no defending Haiku 4.5 at this point when cheaper and better options are available.

TLDR:

https://artificialanalysis.ai/models?models=gemini-3-5-flash...

and: https://i.imgur.com/nTu3VCZ.png

For starters, I did experiment a heck of a lot with models, since GitHub Copilot gave me access to OpenAI, Gemini and Anthropic models. So I have probably experimented more than the average LLMer. When GitHub Copilot had a generous quota, I quite often ran the same tasks with many models to compare them (and pick the best solution among them).

Now about my experience with Haiku, I think it was free for some time in GitHub Copilot, then it was 0.33x quota usage (when Sonnet was 1x and Opus was 3x, good times). I tried to use it for light coding for about a week.

In my tests I concluded that there was zero reason to use Haiku at 0.33x in my coding workload, because it constantly generated subpar solutions. Even when they worked, Sonnet at 1x and Opus at 3x quota usage produced a lot less tech debt on average, and my plan permitted continuous Sonnet/Opus usage for my workload. Otherwise I would have used Gemini Flash (the old one, not this 3.5 one), which was better than Haiku by a mile.

Then GPT 5.4 came out at 1x quota usage, and it was competitive with Opus at 3x quota usage. So I stopped using Opus in favor of GPT, and by that time there was even less reason to use Haiku on my $39/mo GitHub Copilot plan.

And now we have DeepSeek v4, which is at Sonnet+ level in my tests because it has an actual 1 million token context window and their crazy alien caching tech (https://huggingface.co/blog/deepseekv4).

I urge you to throw $5 at the OpenCode Go plan for 30 days and toy around with DeepSeek Flash on high setting (not max).

Or MiMo 2.5 Pro on the same OpenCode Go plan. 2 amazing models.

> DeepSeek Flash on high setting

In your experience, is max worse or you suggest it for less token use?

> MiMo 2.5 Pro on the same OpenCode Go

Xiaomi dropped MiMo 2.5 rates by 70%+ [0] & now it is cost competitive with DeepSeek v4 Pro. I haven't used MiMo, but since you have, do you find it to be better than DeepSeek v4? If so, for what tasks? How do you decide when to use which, if you have an intuition for it? Thanks.

[0] https://news.ycombinator.com/item?id=48282814

> In your experience, is max worse or you suggest it for less token use?

Yes. DS4 Flash max is incredibly chatty for minimal gain over DS4 high.

I asked the same question a month ago: https://news.ycombinator.com/item?id=47978820 and confirmed it in my tests.

> ...MiMo, but since you have, do you find it to be better than DeepSeek v4?

I haven't tested MiMo 2.5 enough to form a verdict, but from initial tests it is equivalent to DS4. MiMo 2.5 (non-Pro) has the advantage of vision capability, though, and after the discount you mentioned MiMo is now priced the same as DeepSeek v4 in the $10/mo OpenCode Go plan; see the yellow bars at https://opencode.ai/go

I'll start testing MiMo seriously next week.