Hacker News

> What provider do you use?

1. My own harness + Local (which usually means Qwen3.6-35B-A3B), I use this fairly often for research gathering on topics, info gathering on code bases, etc.

2. My own harness + DeepSeek v4 Flash served by DeepSeek, I added $20 quite some time ago and somehow still have $18.77 in there after I don't know how many prompts. I use this pretty often, slightly less than my local setup, it's great and what I'm planning on running locally (eventually).

3. My own harness + OpenRouter with whichever model I want to try out. I use this very rarely.

4. Pi + OpenAI Codex $20 subscription. I don't use this almost at all anymore, but I keep the Codex subscription for testing things out to see how GPT-5.5 will handle a problem the other setups have issues with.

> Why do you trust it with serving full quality?

The only thing I've noticed seems unbearably useless sometimes versus what I noticed before was GPT-5.5 which has had some of the weirdest degradations I've seen. It's not to Anthropic levels but it definitely had some service issues a few times where I was wondering if they had accidentally (or purposefully) lobotomized it.

Everything else has mostly just been the same, except DeepSeek I noticed had some speed issues a few days ago.

> What harness do you use? Why do you trust it not to have malware (most harnessed are TS apps)?

I pretty much only use my own, agents are trivial to make and it's definitely not hard to make one that's better than Claude Code or Codex for whatever you're doing.

mark_l_watson 7 hours ago [ - ]

I want to say that I agree with you on the value of writing your own coding harness. I wrote something simple in Emacs Lisp and it makes me happy occasionally using it. I am trying to learn Rust and I am working on my own Rust core orchestration layer and I plan on both a Rust command line client and I already have a Python library wrapper for the Rust code that I have written so far. I write a lot of ‘little books’ and I am almost sure to write yet another one on my current hacking project.

Are my little hacks as effective as OpenCode or Claude Code? No way, but I am learning a lot and having fun.

timcobb 2 hours ago [ - ]

Do you write /maintain evals? This is something I want to get into more. Otherwise I feel really blind and feel compelled to just drop money on frontier.

59nadir 2 minutes ago [ - ]

Not really. I have a benchmark I made for fun where I let LLMs control a text editor called Kakoune, and then give them no other way to do things, to see how they deal with it, but that's not really a scenario I expect them to do well at.

So far most of them have done very poorly on that one, because they are all overtrained on just executing shell commands.

A former colleague of mine and I made a simple test for some baseline "Everything worth using should be able to do this pretty easily and swiftly" but that's some very minor code generation with a very straight forward, boilerplate-type pattern.