I coded a level 8 orchestration layer in CI for code review, two months before Claude launched theirs.
It's very powerful: agents can create dynamic microbenchmarks and evaluate which data structure to use for optimal performance, among other things.
I also have validation layers that trim hallucinations with handwritten linters.
I'd love to find people to network with. Right now this is a side project at work on top of writing test coverage for a factory. I don't have anyone to talk about this stuff with so it's sad when I see blog posts talking about "hype".
Do you feel like you are still learning about the programming language(s) and other technologies you are using? Or do you feel like you are already a master at them?
Do you ever take the time to validate what one of the agents produces by going to the docs? Or is all debugging/changing of the code done via LLMs/agents?
I'm more like level 2 right now and genuinely curious whether learning continues for you (besides agentic orchestration, etc.), and, if not, whether you think that matters.
I'm learning more than ever before. I'm not a master at anything but I am getting basic proficiency in virtually everything.
> Do you ever take the time to validate what one of the agents produces by going to the docs? Or is all debugging/changing of the code done via LLMs/agents?
I divide my work into vibecoding PoC and review. Only once I have something working do I review the code. And I do so through intense interrogation while referencing the docs.
> I'm more like level 2 right now and genuinely curious if you feel like learning continues for you (besides with agentic orchestration, etc.)
Level 8 only works in production for a defined process where you don't need oversight and the final output is easy to trust.
For example, I made a code review tool that chunks a PR and assigns rule/violation combos to agents. This got a 20% reduction in time-to-merge and catches 10x the issues of any other agent because it can pull context. And the output is easy to incorporate since I have a manager agent summarize everything.
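The fan-out step described above can be sketched roughly like this: split the PR diff into hunks, then pair every hunk with every review rule so each combo can go to its own agent. This is an illustrative sketch under my own assumptions (names, rule strings, and diff handling are hypothetical, not the commenter's actual tool):

```python
# Hypothetical sketch: chunk a unified diff into hunks and build one
# task per (hunk, rule) combo, each of which would be handed to a
# separate review agent.
from itertools import product

RULES = ["no-sql-injection", "error-handling", "naming"]  # example rule set

def split_into_hunks(diff: str) -> list[str]:
    """Split a unified diff into hunks on the '@@' markers."""
    hunks, current = [], []
    for line in diff.splitlines():
        if line.startswith("@@") and current:
            hunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        hunks.append("\n".join(current))
    return hunks

def build_tasks(diff: str, rules: list[str]) -> list[dict]:
    """One task per (hunk, rule) combo; a manager agent would later
    collect and summarize the per-task findings."""
    return [
        {"hunk": h, "rule": r}
        for h, r in product(split_into_hunks(diff), rules)
    ]
```

A two-hunk diff with three rules yields six independent tasks, which is what makes the manager-agent summary step necessary.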
Likewise, I'm working on an automatic performance tool right now that chunks code, assigns agents to make microbenchmarks, and tries to find optimization points. The end result should be easy to verify since the final suggestion would be "replace this data structure with another, here's a microbenchmark proving it".
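An illustrative example of the kind of microbenchmark such a tool might generate: compare membership tests on a list versus a set and emit the suggestion. This is a sketch of the idea only, not the commenter's actual output format:

```python
# Sketch: time `x in list` vs `x in set` and report which structure
# the benchmark favors. Function name and message are hypothetical.
import timeit

def suggest_structure(n: int = 10_000) -> str:
    data_list = list(range(n))
    data_set = set(data_list)
    probe = n - 1  # worst case for the linear list scan
    t_list = timeit.timeit(lambda: probe in data_list, number=200)
    t_set = timeit.timeit(lambda: probe in data_set, number=200)
    winner = "set" if t_set < t_list else "list"
    return f"replace list with set? benchmark says use a {winner}"
```

Because the suggestion ships with the benchmark that produced it, a reviewer can rerun it locally instead of trusting the agent's claim.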
Got it. This all makes sense to me. Very targeted tooling that is specific to your company's CI platform as opposed to a dark factory where you're creating a bunch of new code no one reads. And it sounds like these level 8 agents are given specific permission for everything they're allowed to do ahead of time. That seems sound from an engineering perspective.
Also would be interested in an example of "validation layers that trim hallucinations with handwritten linters" but understand if that's not something you can share. Either way, thanks for responding!
> Also would be interested in an example of "validation layers that trim hallucinations with handwritten linters"
For code review, AI doesn't want to output well-formed JSON and oftentimes doesn't leave inline suggestions cleanly. So there's a step where the AI must call a script that validates the JSON and checks whether applying each suggestion results in valid code, then fixes the review comments until they pass.
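A minimal sketch of such a validation step, assuming review comments arrive as a JSON array of objects with a "suggestion" field containing replacement code (the schema and function name are assumptions, not the commenter's actual script). Here `compile()` serves as a cheap "is this valid Python" check; a real pass would run stricter linters:

```python
# Sketch: reject malformed JSON, then reject any suggestion whose
# replacement code doesn't parse. The agent loops until this returns
# an empty problem list.
import json

def validate_review_output(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the output passes."""
    problems = []
    try:
        comments = json.loads(raw)
    except json.JSONDecodeError as e:
        return [f"malformed JSON: {e}"]
    for i, comment in enumerate(comments):
        suggestion = comment.get("suggestion", "")
        try:
            compile(suggestion, f"<comment {i}>", "exec")
        except SyntaxError as e:
            problems.append(f"comment {i}: suggestion is not valid code ({e.msg})")
    return problems
```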
I got my own level 8 factory working in the last few days and it’s been exhilarating. Mine is based on OpenAI’s Symphony[1], ported to TypeScript.
Would be happy to swap war stories.
<myhnusername>@gmail.com
How much money have you made with this approach
I think the opposite question is more pertinent: how much money have you spent?
Not a small amount :)
I spend $140/mo on Anthropic + OpenAI subs and I use all my tokens all the time.
I've started spending about $100/week on API credits, but I'd like to increase that.
Still waiting for these software factories to solve problems that aren't related to building software factories. I'm sure it'll happen sooner or later, but so far all the outputs of these "AI did this whole thing autonomously" projects are just tools to have AI build things autonomously. It's like a self-reinforcing pyramid.
AI agents haven't yet figured out how to do sales, marketing, or customer support in a way that people will pay money for.
Maybe that won't be necessary and instead the agent economy will be agents providing services for other agents.
none yet!
... is that the purpose of life? The sole reason for doing anything?
With so much hype it's a valid question: is this useful/practical, or just a fun rabbit hole/productivity porn? Money is the most obvious metric; feel free to ask the parent about other metrics that might be useful to others instead of asking rhetorical questions.
Unfortunately, it's hard to quantify "How much fun did you have?"