It’s shocking how close this feels to Claude. Obviously it’s much slower, but I don’t know that it’s significantly dumber. Interestingly, the imatrix quantization seems to be better than whatever quant the ZDR inference backends on OpenRouter are using. It was self-aware enough yesterday to realize that its own server process was itself, without me telling it, which is not something I’ve ever observed a local model doing before.

In my (obviously anecdotal) testing, DeepseekV4 Pro was better than Sonnet at coding. It is much slower, but also many times cheaper, especially with the current promotion.

Do they have a coding plan, or do you only pay per API call?

It’s just per token, but burning through 100 million+ tokens is a ~$3 transaction at their current pricing.

Do you use the official API or another provider?

Just directly. Paid for it with PayPal. It’s quite simple to set up and use.
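For reference, "simple to set up" really does mean a few lines: a minimal sketch of a chat-completion call against DeepSeek's OpenAI-compatible HTTP endpoint, using only the Python standard library. The base URL and the `deepseek-chat` model name are taken from DeepSeek's public docs; treat them (and the `DEEPSEEK_API_KEY` env var name) as assumptions to adjust.

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible endpoint from DeepSeek's public docs.
API_URL = "https://api.deepseek.com/chat/completions"


def build_request(prompt: str, api_key: str,
                  model: str = "deepseek-chat") -> urllib.request.Request:
    """Build (but do not send) a chat-completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


if __name__ == "__main__":
    # Only fires a real request if a key is present in the environment.
    key = os.environ.get("DEEPSEEK_API_KEY")
    if key:
        with urllib.request.urlopen(build_request("Hello", key)) as resp:
            reply = json.loads(resp.read())
            print(reply["choices"][0]["message"]["content"])
```

The same request shape works through any OpenAI-compatible client library by pointing its base URL at the official endpoint instead of a middleman.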

I use the official API. OpenRouter somehow didn't use prompt caching, and one short session with Qwen cost me $5.

You pay per API call, but you will be challenged to burn through $20 per month. 24/7 usage for a single agent will probably cost you around $100 per month. It is very efficient, especially with modern harnesses.