Hacker News

angarg12 14 hours ago [ - ]

> This exact thing is what software developers have been begging for since the beginning of the profession: Receiving a detailed outline of the problem and what the end result should look like.

> This is often the part that slows down software development. Trying to figure out what a vague, title only, feature request actually means.

But that is exactly what Software Engineering is! It's 2026 and the notion that you can get detailed enough requirements and specifications that you can one-shot a perfect solution needs to die.

In my experience AI has made us able to iterate on features or ideas much faster. Now most of the friction comes from alignment and coordination with other teams. My take is that to accelerate processes we should reduce coordination overhead and empower individuals and teams to make decisions and execute on them.

pron 14 hours ago [ - ]

> It's 2026 and the notion that you can get detailed enough requirements and specifications that you can one-shot a perfect solution needs to die.

It's 2026 and the idea that even with detailed-enough requirements you can one-shot even a workable (let alone perfect) solution also needs to die. Anthropic failed to build even something as simple as a workable C compiler, not only with a perfect spec (and reference implementations, both of which the model trained on) but even with thousands of tests painstakingly written over many person-years. Today's models are not yet capable enough to build non-trivial production software without close and careful human supervision, even with perfect specs and perfect tests. Without a perfect spec and a perfect human-written test suite the task is even harder. Maybe in 2027.

ianbutler 14 hours ago [ - ]

Sorry where are we seeing that it failed? It compiled multiple projects successfully albeit less optimized.

" It lacks the 16-bit x86 compiler that is necessary to boot Linux out of real mode. For this, it calls out to GCC (the x86_32 and x86_64 compilers are its own).

It does not have its own assembler and linker; these are the very last bits that Claude started automating and are still somewhat buggy. The demo video was produced with a GCC assembler and linker.

The compiler successfully builds many projects, but not all. It's not yet a drop-in replacement for a real compiler. The generated code is not very efficient. Even with all optimizations enabled, it outputs less efficient code than GCC with all optimizations disabled.

The Rust code quality is reasonable, but is nowhere near the quality of what an expert Rust programmer might produce. "

For faffing about with a multi agent system that seems like a pretty successful experiment to me.

Source: https://www.anthropic.com/engineering/building-c-compiler

Edit: Like I think people don't realize not even 7 months ago it wasn't writing this at all.

pron 13 hours ago [ - ]

> where are we seeing that it failed?

Anthropic said the experiment failed to produce a workable C compiler:

- I tried (hard!) to fix several of the above limitations but wasn’t fully successful. New features and bugfixes frequently broke existing functionality.

- The compiler successfully builds many projects, but not all. It's not yet a drop-in replacement for a real compiler.

(source: https://www.anthropic.com/engineering/building-c-compiler)

Software that cannot be evolved is dead software. That in some PR communications they misrepresented their own engineer's report is beside the point.

> It compiled multiple projects successfully albeit less optimized.

150,000x slower (https://github.com/harshavmb/compare-claude-compiler) is not "less optimised". It's unworkable.

> Like I think people don't realize not even 7 months ago it wasn't writing this at all.

There's no doubt that producing a C compiler that isn't workable and is effectively bricked as it cannot be evolved but still compiles some programs is great progress, but it's still a long way from autonomously building production software. Can today's LLMs do amazing things and offer tremendous help in software development? Absolutely. Can they write production software without careful and close human supervision? Not yet. That's not disparagement, just an observation of where we are today.

tardedmeme 3 hours ago [ - ]

This evaluation appears to be AI-written itself. It claims a 3x slowdown and a 4x slowdown combine to produce a 158000x slowdown "because there are billions of iterations" - yeah well both versions of the program had the same number of iterations.

Does anyone know how the 158000x slowdown happened? That's quite ridiculous.
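
The arithmetic behind that objection can be checked with a toy model. The sketch below assumes (this is an assumption, not something the linked evaluation states) that each slowdown multiplies the per-iteration cost and that both binaries run the same number of iterations; `total_time` and `slowdown` are hypothetical names used only for illustration.

```python
def total_time(iterations, cost_per_iteration):
    """Total runtime of a loop under a simple linear cost model."""
    return iterations * cost_per_iteration


def slowdown(iterations, baseline_cost, factors):
    """Ratio of slow runtime to baseline runtime for the same iteration count.

    `factors` are independent per-iteration slowdowns (e.g. 3x and 4x),
    assumed here to combine by multiplication.
    """
    slow_cost = baseline_cost
    for factor in factors:
        slow_cost *= factor
    return total_time(iterations, slow_cost) / total_time(iterations, baseline_cost)


# The iteration count cancels out of the ratio, so it cannot amplify the slowdown.
for n in (1_000, 1_000_000_000):
    print(n, slowdown(n, 1, [3, 4]))
```

Under this model the ratio stays at 12x whether the loop runs a thousand times or a billion times, so the iteration count alone cannot account for a 158000x gap; something else would have to explain it.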

ianbutler 12 hours ago [ - ]

> Can they write production software without careful and close human supervision? Not yet. That's not disparagement, just an observation of where we are today.

I never claimed they could! I just view this as a successful experiment. I don't think anthropic was making that claim with their experiment either.

It feels reflexive to the moment to argue against that claim, but I tend to operate with a bit more nuance than "all good" or "all bad".

areweai 5 hours ago [ - ]

I think people are concerned about the large discrepancy in concrete claims in your previous comment and subsequent empirical information. You may have seen a headline or skimmed an article and missed some details, not a big deal.

The overall impression given was inaccurate and the implicit claim of a fully working end-to-end generated compiler was inaccurate. The headlines were incomplete in a way that was intentionally misleading. It was an interesting experiment and somewhat impressive but the claims were overblown. It happens.

pron 12 hours ago [ - ]

The experiment failed to produce a workable C compiler despite (1) the job not being particularly hard and (2) the available specs and tests being of a far higher quality than those of almost any other software, not to mention the availability of other implementations that the model trained on.

You can call that a success (as it did something impressive even though it failed to produce a workable C compiler) but my point in bringing this up was to show that today's models are not yet able to produce production software without close supervision, even when uncharacteristically good specs and hand-written tests exist.

ianbutler 11 hours ago [ - ]

That's great and all, but that's not the point I was making and you're engaging rather uncharitably on it. So when you view it from the perspective of capability increase it's rather impressive. Note the slope of progress which this experiment was to show.

Edit: Maybe uncharitably is too strong, but we're talking past each other.

auggierose 9 hours ago [ - ]

pron made this statement:

> It's 2026 and the idea that even with detailed-enough requirements you can one-shot even a workable (let alone perfect) solution also needs to die.

and brought up the failed anthropic experiment as proof of that. Yes, you are talking past each other, but that is not pron's fault. It is your fault.

ianbutler 9 hours ago [ - ]

Eh fair enough!

KajMagnus 11 hours ago [ - ]

Saying the model failed to write a competitive C compiler makes more sense.

I don't think they tried to do that though.

> today's models are not yet able to produce production software without close supervision, even when uncharacteristically good specs and hand-written tests exist.

That's a good point anyway

pron 10 hours ago [ - ]

> Saying the model failed to write a competitive C compiler makes more sense.

Their compiler fails to compile (well, at least link) some C programs altogether, and in other cases it produces code that is 150,000x slower than a real C compiler with optimisations turned off (interestingly, the model trained on the real compiler's source code). That's not "not competitive" but "cannot be used in the real world". But even more importantly, the compiler cannot be fixed or evolved. It's bricked (at least as far as today's models' capabilities go). For any kind of software, not being able to improve or fix anything or add any new feature means it's effectively dead.

You could not use it in production even if no other C compiler existed.

jiggawatts 7 hours ago [ - ]

While I understand both points of view, I'm leaning towards yours, because:

- John Carmack embedded a C compiler and interpreter/runtime into Quake back in the mid 1990s as a scripting language! It was that efficient that it could be used in a real time 3D shooter. That's a solo effort as a minor component of a much larger piece of software.

- I've seen university CS courses hand out "implement a C compiler" as a homework / project exercise for students. It's not particularly difficult.

Sure, a modern C compiler like GCC has to handle inline assembly, various extensions, pragmas, intrinsics, etc... but like you said, all of those are thoroughly documented and have open source implementations to reference.

Similarly, the Rust compiler is implemented in Rust and could be used as an idiomatic reference for a generic compiler framework with input handling, parsing, intermediate representations, and so forth.
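
For readers who have not seen those stages laid out, here is a toy sketch of the same pipeline shape (input handling, parsing, an intermediate representation, execution) for integer arithmetic only. It is nowhere near C and is not modelled on any real compiler's internals; every name in it (`tokenize`, `parse`, `run`, the tuple-based instruction format) is a hypothetical choice made for illustration.

```python
import re

# Input handling: integers, the four operators, and parentheses.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(src):
    """Split source text into a list of token strings."""
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        match = TOKEN.match(src, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens):
    """Recursive-descent parser that emits stack-machine instructions (the IR)."""
    code = []
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        if peek() is None:
            raise SyntaxError("unexpected end of input")
        token = advance()
        if token == "(":
            expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        elif token.isdigit():
            code.append(("push", int(token)))
        else:
            raise SyntaxError(f"unexpected token {token!r}")

    def term():
        factor()
        while peek() in ("*", "/"):
            op = advance()
            factor()
            code.append(("mul",) if op == "*" else ("div",))

    def expr():
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            code.append(("add",) if op == "+" else ("sub",))

    expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return code


def run(code):
    """Execute the IR on a small stack machine (division is integer division)."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
            continue
        right = stack.pop()
        left = stack.pop()
        if instr[0] == "add":
            stack.append(left + right)
        elif instr[0] == "sub":
            stack.append(left - right)
        elif instr[0] == "mul":
            stack.append(left * right)
        else:
            stack.append(left // right)
    return stack.pop()


print(run(parse(tokenize("(1 + 2) * 3"))))
```

A real C compiler adds a preprocessor, a type system, many more IR passes and a code generator on top of this skeleton, which is where the actual difficulty lives.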

lmm 8 hours ago [ - ]

> Their compiler fails to compile (well, at least link) some C programs altogether, and in other cases it produces code that is 150,000x slower than a real C compiler with optimisations turned off

I would bet that those things are also true of at least one expensive commercial C compiler.

vajrabum 4 hours ago [ - ]

I'd love to hear of any currently available commercial C compiler which has that level of issues. I would bet you'll be hard pressed to find one. C compilation is a quite thoroughly solved problem. In any case please provide an example.

josephg 2 hours ago [ - ]

> Sorry where are we seeing that it failed?

Try it yourself.

I've been using claude to make a project over the last few weeks. It's written ~70k LOC to solve a complex problem. I've found that it can get surprisingly far in a 1-shot, but about 90% of the work I've had it do (measured in time and tokens) is cleaning up the junk it outputs in its first pass. I'm finding my claude sessions have a rhythm like this:

1. Plan and implement some new feature.

2. Perform a code review of what you just did. Fix obvious problems. Flag bugs, issues, poor factoring, messy abstractions, etc. Make a prioritised list of things to fix (then fix them).

3. (Later) fixes:

- Write tests for the code you wrote and fix the bugs you find.

- Run the code through memory leak checks, and fix bugs.

- Do a performance analysis using benchmarks and profiling tools, and make any high priority performance improvements.

- Read the whole program, looking for ways in which the code you've just written could fit in better with the rest of the program. Fix any issues.

- In directory X is the full documentation for the library you're using. Reread it then review the code you wrote. Are there better ways we could make use of the library?

And so on.

Claude's 1-shot output is often usable, but it's consistently chock full of problems. Bugs. Memory leaks. Bad factoring. Too many globals. Poor use of surrounding code. And so on. It's able to fix many of these problems itself if you prompt it right. (Though even then the code is often still pretty bad in many ways that seem obvious to me).

At the moment I think I'm spending tokens at about a 1:9 ratio of feature work to polish. Maybe its 1-shot output is good enough quality for you. To me it's unacceptable. Maybe a few models down the line. But it's not there yet.

mh- an hour ago [ - ]

The ratio is an interesting way of thinking about it. I wonder how this compares to other SWEs at various levels of experience, substituting person-hours for tokens.

nvme0n1p1 12 hours ago [ - ]

Why are you quoting from their marketing blog as if it's a reliable source?

https://github.com/anthropics/claudes-c-compiler/issues/1

> Apparently compiling hello world exactly as the README says to is an unfair expectation of the software.

dnautics 14 hours ago [ - ]

Yeah I think people are really underestimating what LLMs can do even without specs.

As an example, I did an exploratory attempt to add custom software over some genuinely awful Windows software for a scientific imaging station with a proprietary industrial camera. Five days later Claude and I had figured out how to USB-pcap sample images, and it has been operationalized and running smoothly for months now. 100% of the code was written by Claude and it's all clean (I reviewed it myself). Pretty much all I did was unstick it in a few places (e.g. "hey, based on the file sizes it looks like the images are being sent as a 16-bit format").

For day to day work, I'll often identify a bug, "hey, when I shift click on this graphical component, it's not doing the right thing". I go tell Claude to write a RED (failing) integration test, then make it pass.

Zero lines of code manually written. Only occasionally do I have to intervene and rearchitect. Usually this involves me writing about ten lines of scaffold code, explaining the architectural concept, and telling it to just go.
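
A minimal sketch of that red/green loop, assuming a made-up selection model rather than any real UI toolkit: `SelectionModel`, `click` and the shift-click behaviour are all hypothetical stand-ins for the graphical component described above. The first test is the one that would be written first and seen to fail (RED) while `click` still ignored `shift`; the `shift` branch is the change that turns it GREEN.

```python
import unittest


class SelectionModel:
    """Hypothetical list-selection model; not taken from any real toolkit."""

    def __init__(self):
        self.anchor = None      # index of the last plain click
        self.selected = set()   # currently selected indices

    def click(self, index, shift=False):
        if shift and self.anchor is not None:
            # The fix: shift-click selects the whole range from the anchor.
            low, high = sorted((self.anchor, index))
            self.selected = set(range(low, high + 1))
        else:
            self.anchor = index
            self.selected = {index}


class ShiftClickTest(unittest.TestCase):
    def test_shift_click_extends_selection_from_anchor(self):
        # Written first, and expected to fail before the fix above exists.
        model = SelectionModel()
        model.click(2)
        model.click(5, shift=True)
        self.assertEqual(model.selected, {2, 3, 4, 5})

    def test_plain_click_resets_selection(self):
        # Guards the existing behaviour so the fix does not regress it.
        model = SelectionModel()
        model.click(2)
        model.click(5, shift=True)
        model.click(1)
        self.assertEqual(model.selected, {1})
```

Saved as a module, this can be run with `python -m unittest`. The point of writing the failing test first is that the agent's fix is then checked against a statement of the bug that a human has read and agreed with.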

pron 13 hours ago [ - ]

People both underestimate and overestimate what LLMs can do. LLMs have shown very different results when autonomously writing a small program for personal use and autonomously writing production software that needs to be evolved for years.

jyounker 9 hours ago [ - ]

By "non-workable" I think people mean that it won't compile Hello World.

YZF 11 hours ago [ - ]

GCC has only like a billion man hours in it?

Assembler and linker are not part of a compiler. They are separate tools. They are also generally much simpler.

SirHumphrey 14 hours ago [ - ]

Most software is much simpler than a c compiler.

pron 14 hours ago [ - ]

A workable C compiler is a ~10-50KLOC program, and a fairly simple one at that (batch, with no concurrency or interaction). That Anthropic's swarm of agents wrote 100KLOC before failing is a symptom of the problem. It's certainly possible that many programs are in the sub 5KLOC range, but it's definitely not "most software". Plus, almost no software has this level of detailed spec, ready-made tests, and a selection of existing implementations of the same spec.

My first thought when reading Anthropic's description of the experiment was that it is unrealistically easy. It's hard to come up with realistic jobs in the 10-50KLOC range that would be this easy for an LLM. That it failed only shows how much further we still have to go.

quantumleaper 14 hours ago [ - ]

A bit off topic, but see how Anthropic publicity stunts went from "Claude C Compiler" with 100K LOC to the recent Bun Rust rewrite with 1M LOC (10x!) in just 3 months.

I get that it's "novel" creation vs porting, but given that they reported that the C compiler cost them $20k in API costs, the Bun rewrite must be at least $200k, maybe even closer to a million. Pure madness.

gmueckl 13 hours ago [ - ]

Asking an LLM to change the programming language of an implementation is completely different from asking it to code from spec. It's orders of magnitude simpler in practice. I converted some 60kloc of Java to C++ and it works. There were some issues where the Java implementation used runtime reflection, because that needs creative workarounds, and not all of the C++ translations worked on the first try. And that was my first serious attempt at a task with an LLM. I could likely do better now. An important task simplification here is that a well designed codebase can be converted in small pieces and then joined back together. So the total amount of code converted becomes an irrelevant metric.

pron 14 hours ago [ - ]

Yes, the task is very different, but also it will be months to a year until we know the results of the bun experiment.

quantumleaper 14 hours ago [ - ]

I don't know how it could fail - Bun loses popularity among devs? Is it an objective metric? From what I understand, Node.js remains dominant across the industry as a whole, with Deno and Bun mostly used by startups.

Anthropic can always fire the Opus/Mythos token machine gun on any problem (bugs, features, security) to ensure PR success, and there would be plenty of AI-sphere startups already drinking the kool-aid that would consider the whole vibe-coding thing to Bun's benefit.

pron 13 hours ago [ - ]

> Anthropic can always fire the Opus/Mythos token machine gun on any problem (bugs, features, security) to ensure PR success,

Can they, though? They tried and failed to do it in their C compiler experiment. The experimenter wrote: "I tried (hard!) to fix several of the above limitations but wasn’t fully successful. New features and bugfixes frequently broke existing functionality."

eudamoniac 13 hours ago [ - ]

It could fail due to maintenance burden. There is a lot of code now that no one wrote.

t_mahmood 12 hours ago [ - ]

Are we assuming, all tests pass == software done?

Does Firefox not have tests? Then how were over 200 CVEs found?

Are we going to be comfortable running a piece of software that has 1M lines, with who knows how many zero-days in it?

Yes, sure they are going to use LLMs to find the CVEs, and so will the hackers. You need a day or two to fix the security issue; a hacker just needs to put it to use.

And good luck debugging a million line code base.

1M LOC == already failed.

rowanG077 13 hours ago [ - ]

The compiler that claude made went way beyond workable. It could compile the full linux kernel afaik. That is much further even beyond standard C.

pron 13 hours ago [ - ]

People who independently tried to use it reported that it is very much not workable:

- "CCC compiled every single C source file in the Linux 6.9 kernel without a single compiler error (0 errors, 96 warnings). This is genuinely impressive for a compiler built entirely by an AI. However, the build failed at the linker stage with ~40,784 undefined reference errors."(https://github.com/harshavmb/compare-claude-compiler)

- Overall it’s an interesting experiment, and shows the current bleeding edge of Claude’s Opus 4.6 model. However the resulting product is also a clear example of the throwaway nature of projects generated almost entirely by AI code agents with little human oversight. The prototype is really impressive, but there is no real path forward for it to be further developed. It can build the Linux kernel [for RISC-V], which is impressive. It can also build other things… if you are lucky, but you really cannot rely on it to work. (https://voxelmanip.se/2026/02/06/trying-out-claudes-c-compil...)

Anthropic themselves said that the codebase was effectively bricked and that their agents could not salvage it.

rowanG077 10 hours ago [ - ]

Well then as you say a 10-50KLOC C compiler is workable. Could you show me the C compiler that does manage to compile a modern Linux kernel that is of that size?

spc476 8 hours ago [ - ]

TCC did several years ago. It could boot Linux from source in under 10 seconds. It wasn't that big of a C compiler. It's in the 50,000 lines of code range.

rowanG077 6 hours ago [ - ]

This was 20 years ago from what I can find. Beside that Linux now is a vastly different codebase than it was 20 years ago. That effort also did not compile Linux unmodified, it required several changes: https://bellard.org/tcc/tccboot_readme.html.

binary0010 14 hours ago [ - ]

Not really.

I can make a c compiler in a couple weeks just by looking up open source libraries and copying them.

I can't make any software that people will pay me money to use without taking months/years of development, research, experimentation and iteration.

Just because the original people who invented compilers had to be geniuses, doesn't mean anyone has to spend much time or thought in copying that work now.

YZF 12 hours ago [ - ]

I built a compiler for a simpler language as part of my compilers course in a CS degree. It was a non-trivial exercise well beyond the majority of software applications. What open source libraries did you have in mind and what are you copying?

If you can truly write a C compiler in weeks then kudos to you. How many compilers have you written so far for how many languages?

I work for big tech and I would say a large % of developers are incapable of producing a working C compiler on any reasonable time scale, certainly not weeks, even with looking at open source. I'm sure they can download one and run it. Most developers today don't even know C or assembler. They don't know how to approach the C language spec. The top 5-10% of developers/engineers can do it but even for them it's non-trivial.

thayne 2 hours ago [ - ]

> It was a non-trivial exercise well beyond the majority of software applications

Maybe if you include every application ever written, including every variation of "hello world", but if you are claiming that most serious production quality software could be written by a CS student who is simultaneously working on other classes, I'm gonna have to disagree with you.

binary0010 8 hours ago [ - ]

I'd copy and paste from all the thousands of open source ones, what do you mean?

There are plenty of open source compilers that I can copy and paste whatever I need to. I don't get why you think this would have any level of difficulty?

Of course I couldn't make a brand new compiler that was better than what's out there...

Just like a game engine, I could clone one of the thousands of engines out there pretty easily - making something better or novel would be difficult. Just making a bare bones clone of what already exists by referencing documentation and pre-existing code is relatively easy now.

Yeah, when I made a mediocre 3d game engine 20 years ago, it was brain breaking difficult work. I can make one infinitely better in a micro fraction of the time now because most of the hard stuff is done and can just be looked up now.

Do you not agree?

YZF 7 hours ago [ - ]

If you copy and paste an entire compiler you didn't make anything. If you copy pieces from different compilers they won't work together. So I'm not sure how you "make" a compiler by copying and pasting from open source compilers. Are you saying you'll take one file from clang, one from gcc, and another from another compiler?

Sure. You can clone gcc and build it. You can clone a game engine and use it.

pron 12 hours ago [ - ]

> It was a non-trivial exercise well beyond the majority of software applications

That depends on how you count. By number of programs that may well be right, but that's not what matters in terms of impact on the industry, as software value roughly corresponds to the number of people working on a particular piece of software (or lines of code, if you wish). By number of people/LOC most software is not in the "simpler than a C compiler" category.

tardedmeme 2 hours ago [ - ]

I do think being able to write a compiler is a milestone indicator of your computer science knowledge. Most developers probably don't understand pointers either, because "most developers" are people who did a React bootcamp.

virgilp 11 hours ago [ - ]

I wonder how knowledgeable in compilation was the engineer that attempted this. I'm pretty confident that I could produce a decent C compiler in a few weeks (or less), if given Opus 4.7 + unlimited tokens + a good test suite. (and this is not blind unsubstantiated belief in AI, I've recently rewritten a somewhat sophisticated interpreter in a week with AI; and have worked on several C++ compilers in the past, including a GCC port to a custom DSP, so I have a bit of an idea about what this would take).

But yeah, this is not a "one shot" project, none of it is. One shot doesn't work even with humans - after all, this is exactly what killed waterfall as a methodology.

pron 11 hours ago [ - ]

> I'm pretty confident that I could produce a decent C compiler in a few weeks (or less), if given Opus 4.7 + unlimited tokens + a good test suite.

Of course. The point is that a full, detailed spec isn't enough (even in the rare situations it does exist, like for a C compiler). At least for the moment, you need expert humans to supervise and direct the agents.

Vibe coders usually also let the agents write the tests, which means that the only independent human validation of the software is some cursory manual inspection. That also obviously isn't enough to validate software.

> One shot doesn't work even with humans - after all, this is exactly what killed waterfall as a methodology.

You can one-shot a C compiler with humans. LLMs' software development ability is impressive and helpful, but it is not human-level yet, even if at some tasks the agents are better than most human programmers. And while many waterfall projects failed, many succeeded (although perhaps not as efficiently as they could have). So far I don't believe agents have been able to produce any non-trivial production software autonomously.

zem 11 hours ago [ - ]

yeah, the key part is that there be a human in the loop, directing and course-correcting the ai while it produces code in reasonably small and well defined stages.

juanre 13 hours ago [ - ]

I completely agree. It's more than 40 years since I wrote my first program, and I've never seen software that was first specified and then written and all was good.

The most difficult part of any non-trivial engineering is understanding the problem, and the first versions of a piece of software are how you reach that understanding.

That's why I do not think that AI-powered "software factories" will ever work. It's waterfall development all over again. An architect writing UML diagrams and handing them off to the team of programmers to do the essentially mundane task of implementing... the wrong thing.

AI is, however, very good at helping you go fast from the wrong first version to the less wrong second one. But you need to remember that your main task is to understand the problem that you are trying to solve.

daxfohl 9 hours ago [ - ]

Yeah and any detailed design is still likely to skip over "obvious" things like "only admin users can use admin features". Both the PM and the engineering team will understand this implicitly. But with AI, you never can tell if it's going to make that inference, or just create admin users and admin APIs with no relation between them. These are also the bugs that can most easily slip through, because the reviewer wouldn't even think to look for it.
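
To make that failure mode concrete, here is a minimal sketch of what the "obvious" rule looks like once someone writes it down. Everything in it is hypothetical (`User`, `Forbidden`, `require_admin`, `delete_account`); it only illustrates tying the admin flag to the admin operation and checking the unhappy path explicitly, not any particular framework's API.

```python
from dataclasses import dataclass
from functools import wraps


class Forbidden(Exception):
    """Raised when a non-admin user calls an admin-only operation."""


@dataclass(frozen=True)
class User:
    name: str
    is_admin: bool = False


def require_admin(func):
    """Decorator that links the admin role to the admin feature."""

    @wraps(func)
    def wrapper(user, *args, **kwargs):
        if not user.is_admin:
            raise Forbidden(f"{user.name} is not an admin")
        return func(user, *args, **kwargs)

    return wrapper


@require_admin
def delete_account(user, account_id):
    # Hypothetical admin-only operation.
    return f"account {account_id} deleted by {user.name}"


def non_admin_is_rejected():
    """The check a reviewer 'wouldn't even think to look for'."""
    try:
        delete_account(User("mallory"), 42)
    except Forbidden:
        return True
    return False
```

Without the decorator, the admin user type and the admin function would both exist and every happy-path test would still pass, which is exactly why this kind of gap slips through review.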

Philip-J-Fry 12 hours ago [ - ]

I don't agree.

I regularly get pieces of work some product guy has thought up in an afternoon. They only care about the happy path, and sometimes only part of the happy path. I work for a global company that has to abide by rules and regulations in each country we operate in. The product guy thinks up some feature, we implement the feature, then we're told "actually, we legally aren't allowed to do this in 90% of the markets we operate in". Cool, so we add an ability to disable it in those markets. Then they come back "We can do this in some of those markets if it's implemented with [regulatory bureaucracy], so can you do that please".

Then we have to hack away at the solution because the deadline is right around the corner.

This is not software engineering! None of this is related to the software. The job of a software engineer is to take a list of requirements and figure out the way we accomplish those requirements. Requirements gathering is NOT a software engineering problem. Software is implementation, product is behaviour. That's the split. The behaviour of the thing we're building needs to be known before we even try to seriously build it.

If someone had just held back for a week and did their due diligence, we would have been able to architect a solution that is scalable, extensible, easy to maintain and can make the future easier.
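
As an illustration of the difference, here is a sketch of how the per-market rules might be modelled if all three cases (allowed, not allowed, allowed only with the regulatory step) were known up front. The market codes, the `Availability` enum, `MARKET_POLICY` and `feature_allowed` are invented for this example and stand in for whatever the real rules are.

```python
from enum import Enum


class Availability(Enum):
    ENABLED = "enabled"
    DISABLED = "disabled"
    REQUIRES_COMPLIANCE = "requires_compliance"


# Hypothetical market codes and rules, for illustration only.
MARKET_POLICY = {
    "AA": Availability.ENABLED,
    "BB": Availability.DISABLED,
    "CC": Availability.REQUIRES_COMPLIANCE,
}


def feature_allowed(market, compliance_done=False):
    """Return True if the feature may be shown in the given market.

    Unknown markets are denied by default, so a newly added market has to be
    reviewed before the feature appears there.
    """
    policy = MARKET_POLICY.get(market, Availability.DISABLED)
    if policy is Availability.ENABLED:
        return True
    if policy is Availability.REQUIRES_COMPLIANCE:
        return compliance_done
    return False
```

Bolting a boolean "disabled in this market" flag on first and then discovering the third state later is the kind of rework the comment describes; a table like this only works if the requirement is known before the first version ships.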

nuancebydefault 11 hours ago [ - ]

> Requirements gathering is NOT a software engineering problem. Software is implementation, product is behaviour. That's the split.

That's a theory but I've never seen this work in practice. A piece of software is unique. If it weren't, we'd just use the cp command.

What usually happens is you get a set of requirements that looks simple. Then you start thinking about a design and see 10 different possibilities, each corresponding to a slightly different interpretation of the requirements set. You iterate a few times, reviewing the designs with whoever set the requirements and a few peers, and see more possible variations to the requirements. You need to double check its parent requirements up to the master requirements. Then you need to make time/feature/quality tradeoffs, affecting the fulfillment of requirements.

Once starting to implement, you see dependencies to other software (framework, sdk, drivers, language features,...) and understand that other software is not what you thought, or has bugs. Or you see an issue with performance or see that one particular feature becomes unfeasible.

That's where all the complexity goes. AI doesn't change that, but can make prototyping iterations and bug hunting faster, as long as someone holds it on a leash and understands its decisions.

marcus_holmes 5 hours ago [ - ]

I think this was TFA's point about "engineers have been begging to be involved earlier in the process forever". Which is absolutely true.

It has to be someone's job to push back on the Product Guy's stupid idea and answer all the awkward questions about the not-so-happy path with it. Unfortunately, because of the way we've ended up with this process, that person is often the engineer tasked with building it, without any effective political power to challenge the design process.

sarchertech 7 hours ago [ - ]

You realize that we were making software for decades before Product Managers existed right?

My senior year software engineering class had a whole section on requirements gathering.

ajam1507 9 hours ago [ - ]

This seems more like a failure of management and process than a problem inherent to autonomy.

jimbokun 3 hours ago [ - ]

A previous iteration of my company had a CEO with a simple management idea that I believe worked really well: treat each product team as a mini startup.

That means EVERY role needed to develop the product was in that team. No separate corporate wide QA function, infrastructure and operations function, sales function, project management function, or domain expertise function. All the people performing those functions for that project were part of the project team.

Now this is somewhat hyperbolic, since if there is no sharing of resources whatsoever you don’t really have a single corporation.

But the idea is clarifying and helps to eliminate silos and tighten communication and feedback loops.

I miss that style of working. Although I try to break those barriers where I can as an individual contributor by just figuring out who needs to talk to who to make things happen and opening those channels of communication.

jwilliams an hour ago [ - ]

Agree.

We have a bunch of tools - specs, code, tests. All of these really are models of the end outcome we're trying to capture.

You could just build something, see if you're right and then build it again. If that seems ridiculous, what makes a spec special that it can work first time?

The reason we've not done this historically is that code is annoying and (was) relatively expensive. You can rough out a spec document and get feedback from a wide variety of stakeholders -- after all, they can all read a document.

If you can use AI to explore a problem space and get feedback directly, that's definitely a whole new tool in the kit.

jmalicki 4 hours ago [ - ]

This is also the part that AI speeds up the most for me, maybe 100x productivity.

I start with something like this prompt:

"This is a research project around <vague statement>. What do competitors, like <x>, <y>, <z> do around this, are there any blog posts or tech talks?

Are there any academic approaches or recent papers around the topic?

Can you survey any related open source projects? I know of <x> and <y>. Please include analysis of activity, github stars, number of downloads on npm/pypi/crates, and search the web for reviews or complaints or positive or negative blog posts from developers.

All claims should have links to the original sources, preferably with quoted text where appropriate.

We are going to write a research plan for how to produce this report.

The implementation of the plan will spawn subagents to survey breadth, then spawn subagents for each depth topic in detail"

harrall 14 hours ago [ - ]

Trying to figure out the best way to solve vague requirements is why I got into engineering.

If I got detailed specs, I’d just be a coding robot. I push that work off onto juniors.

hnthrow0287345 12 hours ago [ - ]

If they can't at least imagine the golden path themselves and write it down, they shouldn't be in charge of the product, because they are unlikely to understand any other in-depth conversations about it. And I have no idea how they'd be having coherent conversations with anyone above them either. They're also unlikely to use AI well or to identify bad-out-of-the-gate solutions. It is of course different if they're just gathering opinions or want a PoC or exploratory work done, but those aren't requirements to me.

Developers are unlikely to be doing only development these days. There's ops and support to do as well, so more back and forth means less time for those things and for development.

We need to meet in the middle about requirements otherwise developers will end up doing someone else's job for them.

Cthulhu_ 14 hours ago [ - ]

I'm seeing decision-makers / people who write requirements starting to use AI as well in my day to day. As before, my job is to read, understand and test those requirements against the real world as I understand it. The same goes for code. Software engineering for (at least) the past 20 years has had a core focus of "don't trust anyone"; this hasn't changed, and it still takes a lot of time and effort.

Terr_ 13 hours ago [ - ]

The problem is that instead of trying to figure out what they really want/need, now we're trying to figure out what they really wanted or needed before it got obfuscated by the babble-machine.

rerdavies 2 hours ago [ - ]

Nailed it. I suspect the OP is a waterfall guy (despite the token references to agile). All the references to documentation are a big clue. When I see "documentation" and "development" running in parallel, as if that's an extraordinary thing, I mentally cross out "documentation" and replace it with "input from stakeholders", which, in an agile world is... YES!! Of course those run in parallel. That's the whole point of agile.

How do you translate "send an email to users" as a feature without a Document? ... also an incredibly waterfall thing. We Don't Do That Anymore. Thank goodness. Because it is incredibly inefficient (and not any less error-prone). And the chances that Some Guy who wrote the Document six months ago really understood the actual problem are... practically zero.

One of my favorite waterfall stories: a friend of mine who does contract programming for <big company> said that her projects were always delivered exactly on time, so you never had to apply the "double the estimates" rule.

"So your projects always finish exactly on the delivery date originally given?!" Incredulity!

"Oh no. They usually take twice as long, but the difference is that, first we deliver what they asked for (which arrives exactly on the original schedule date, but is completely unusable); and then we charge them 3 times as much to deliver what they actually wanted (which takes twice as long)."

getnormality 5 hours ago [ - ]

> Now most of the friction comes from alignment and coordination with other teams.

Then I see a solution! Why don't we simply put the entire company on one big team?

necovek an hour ago [ - ]

You joke but that's pretty much what cross functional teams are.

The only other observation is that as you grow teams, the number of communication channels grows quadratically (n(n-1)/2 pairs for n people), and beyond 6-8 people communication starts breaking down.
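
A rough sketch of that arithmetic, assuming every pair of people counts as one potential channel (the usual n(n-1)/2 count). The function names are made up for illustration, and the second function deliberately ignores links between teams, which a real org would still need a few of:

```python
def channel_count(team_size: int) -> int:
    """Number of distinct person-to-person pairs in a team of the given size."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


def split_channel_count(team_sizes: list[int]) -> int:
    """Channels inside each team only; links between teams are ignored (assumption)."""
    return sum(channel_count(size) for size in team_sizes)


if __name__ == "__main__":
    # Pairwise channels for a single team of each size.
    for size in (3, 6, 8, 12, 24):
        print(f"{size} people: {channel_count(size)} channels")

    # One team of 24 versus four teams of 6.
    print("one team of 24:", channel_count(24))
    print("four teams of 6:", split_channel_count([6, 6, 6, 6]))
```

Under these assumptions a single team of 24 has 276 possible channels, while four teams of 6 have 60 between them, which is roughly the case for small "companies" plus a few shared rules.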

So instead you make small "companies" and set a few ground rules that the software they build needs to follow, and you are back at a working org producing complex software.

jimbokun 2 hours ago [ - ]

Putting everyone responsible for some function of a product on one team, instead of having separate departments for separate functions, can do wonders for actually shipping and iterating on software.

thisisnotmyname 6 hours ago [ - ]

Not to mention that AI lets the domain experts create and test proof-of-concept implementations themselves. This alone has been a revelation for us and saves a tremendous number of design cycles.

thisisit 10 hours ago [ - ]

> we should reduce coordination overhead and empower individuals and teams to make decisions and execute on them.

Improved collaboration, says every new CEO and manager. The notion that this is ever going to be solved, especially with different experience, views, agendas, etc., needs to die too. AI is surely not going to help, and with that roadblock iterating faster doesn’t help either, because then people want to try things just for the sake of trying.

stingraycharles 14 hours ago [ - ]

Yeah I agree, such a fundamental aspect of software engineering is translating ambiguous “asks” into specific requirements. We now have a tool to convert those requirements directly into code.

And yes, architecture and how to actually implement the designs are also part of the requirements.

The code is just the implementation, the actual problem that needs solving is one abstraction level higher.

BloondAndDoom 8 hours ago [ - ]

We will see smaller and smaller teams where all this overhead is minimized by a handful of people, and more than ever we will see 2-5 person teams creating great software.

mmcnl 13 hours ago [ - ]

This is true, but the funny thing is: it was also true before AI.

ModernMech 14 hours ago [ - ]

It's UML and outsourcing all over again: If only we can write the perfect UML diagrams representing the ideal class hierarchy, we can just put that in an email, send it to India, then we'll get back exactly the program we wanted, no mistakes!

gedy 13 hours ago [ - ]

> Trying to figure out what a vague, title only, feature request actually means.

> My take is that to accelerate processes we should reduce coordination overhead and empower individuals and teams to make decisions and execute on them.

This is funny because it's exactly what the agile/scrum training taught me 20 years ago.
