Hacker News

Note that these are Python-only results; the model will not do as well with other languages.

I'm glad to see more domain-focused SLMs; we need more of them! A programming-focused MoE should work well across many languages.

rcarmo 10 hours ago

If it writes functional Python instead of cosplaying as a Java programmer and cramming code with classes and accessors, it's already better than Opus...

nsingh2 16 hours ago

Lots of confusion about what this model is actually focused on.

It is a cheap specialist for closed-world, verifiable reasoning tasks like math, self-contained coding problems, and similar.

"Closed-world" means the needed information is already in the context. It is not a tool-using agent that can discover missing context. "Verifiable" means answers are hard to generate but easy to check.

So no open-ended research, repo-wide agent work, factual Q&A, or SVG generation. More of a compact reasoning module for bounded problems.

nsingh2 14 hours ago

To follow up on this, I had it solve a nasty ODE problem that I saw in the recent Mathematica 15 release post:

    Solve the following first-order ODE for f(x):

    ((-1 - 2*x)*f(x)*tan(1 + x - exp(-61 - 2*x)*f(x)/x)
    + exp(61 + 2*x)*x*(1 - x*tan(1 + x - exp(-61 - 2*x)*f(x)/x))
    + x*tan(1 + x - exp(-61 - 2*x)*f(x)/x)*f'(x)) = 0

    Find the general solution f(x).

And surprisingly it found a valid solution! Extra impressive because it runs at 25 tok/s on my measly RTX 2070 Super.

    f(x) = x*exp(61 + 2*x)*(1 + x - arccos(C/x))

    C is an arbitrary constant.

Apparently Mathematica 14.3 couldn't solve this ODE.

kanbankaren 7 hours ago

Qwen 3.5 9B Q4_K_M solved this using 10K tokens in 5 minutes on an RX 7600.

The answer is exactly what you have posted. I am impressed by Qwen!

le-mark 10 hours ago

How do you know it’s a valid solution? Are you able to verify it yourself?

krageon 7 hours ago

This is a math problem with a math solution. You verify it with math: substitute the solution back into the equation and check that it holds.
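
For anyone who wants to check without a CAS, here is a minimal numerical spot check in plain Python (a sketch, not a proof). The helper names, the step size `h`, and the sample points are my own choices, not anything from the thread. It plugs the posted f(x) into the left-hand side of the ODE, approximates f'(x) with a central difference, and divides by x*exp(61 + 2*x) so the huge exponential factor doesn't swamp the result:

```python
import math

def f(x, C):
    # Candidate solution posted above: f(x) = x*exp(61 + 2x)*(1 + x - arccos(C/x))
    return x * math.exp(61 + 2 * x) * (1 + x - math.acos(C / x))

def f_prime(x, C, h=1e-5):
    # Central finite difference; h is an arbitrary small step (assumption)
    return (f(x + h, C) - f(x - h, C)) / (2 * h)

def scaled_residual(x, C):
    # Left-hand side of the ODE, divided by x*exp(61 + 2x) to keep it O(1)
    E = math.exp(61 + 2 * x)
    fx = f(x, C)
    u = 1 + x - fx / (E * x)  # argument of tan in the ODE
    t = math.tan(u)
    lhs = (-1 - 2 * x) * fx * t + E * x * (1 - x * t) + x * t * f_prime(x, C)
    return lhs / (E * x)

# Sample points chosen so that |C/x| < 1 (needed for arccos) and tan(u) is finite
for x, C in [(1.0, 0.5), (2.0, 0.5), (3.0, -1.0)]:
    print(x, C, scaled_residual(x, C))
```

The residual should come out near zero, limited only by the finite-difference error, for any x and C with |C/x| < 1 and C != 0. By hand: substituting the candidate makes the argument of tan collapse to u = arccos(C/x), and the ODE reduces to x*tan(u)*u'(x) = 1, which that u satisfies.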

kame3d 13 hours ago

Interesting!

I just tried the quantized Q4_K_M from [1] on my RTX 2070 Super. It ran at 110 tok/s with 1800 tok/s prefill and found the same solution to your prompt. It generated valid LaTeX for the answer, but its reasoning trace uses mostly compact ASCII math notation. It took 3 min 22 s to answer, spending 22k tokens, almost all on thinking.

[1] https://huggingface.co/prithivMLmods/VibeThinker-3B-GGUF

trick-or-treat 13 hours ago

How do we know the solution isn't in the weights though?

skeledrew 12 hours ago

If it can code well, then once you put it in a loop with an interpreter, it can do anything.
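
A rough sketch of what such a loop might look like, assuming a hypothetical `generate(task, feedback)` callable that wraps the model and returns Python source as a string. Everything here (names, round limit, timeout) is illustrative, not any particular tool's API. Each round runs the generated code in a fresh interpreter and feeds the error output back in:

```python
import subprocess
import sys

def run_python(code, timeout=10.0):
    """Run code in a fresh interpreter; return (succeeded, combined output)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    return proc.returncode == 0, proc.stdout + proc.stderr

def solve_with_feedback(generate, task, max_rounds=5):
    """generate(task, feedback) -> code string is a hypothetical model call."""
    feedback = ""
    for _ in range(max_rounds):
        code = generate(task, feedback)
        ok, output = run_python(code)
        if ok:
            return code, output
        feedback = output  # hand the traceback back to the model
    return None, feedback  # gave up after max_rounds attempts
```

Note that a zero exit code only means the code ran, not that it is correct; a real loop would also need a checker for the task, and should sandbox the interpreter since it executes model-written code.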