My career path is suprisingly similar to the author's. Weirdly enough, what he takes as the first pillar to fall is the one I see most undamaged currently.

LLMs routinely fail at our business specifics: Local tax regulations, particularities of the accounting process, specifics of our ledger implementations. They're great at refactoring, translating between languages, tracing bugs on existing code even, but there is always many things subtly wrong iterating and expanding our domain.

This might be because the companies I worked for happen to be tackling complex domains precisely for moat-building reasons. They stay in business explicitly because there's not a book out there you can read to build a clone, the knowhow stays inside.

Also, a fintech whose managers recommend speeding up design docs with AI sounds way too careless to be in the money handling business. It's way, way too easy to end up with millions incorrectly allocated, particularly if you deal with high volumes of small transactions. These bugs are always a bitch to deal with because correcting the logic is just step one, you then have to correct all the wrongly calculated data in immutable DBs, move around the red tape and client comms, and your fix is bound to become a gotcha that new features and observability have to take into account ("remember that there's a bump in the data in february 2 because we had incident X".)

This. Once you're building something that genuinely hasn't been built before, LLMs cannot be trusted with any architectural decisions. I'm building a product based around various physics simulations, so it's purely first principles, but without active research, thinking, and challenging, it produces computational code literally hundreds of orders of magnitude slower WHILE implementing absurd fallbacks and shortcuts that effectively result in a useless calculation.

This is the case perhaps 95% of the time.

Oversight is very important, and architectural thinking cannot yet be outsourced, only execution.

I have had similar when trying it too. I couldn't even drive Claude Opus 4.7 to get PETsc to compile properly (with all the optional dependencies)

I find that too, Claude Code is constantly trying to break the architecture patterns and do hacky stuff

Like its only focused on solving the local problem as easy as possible

[dead]

LLMs routinely fail at our business specifics: Local tax regulations, particularities of the accounting process, specifics of our ledger implementations.

This is domain expertise - software engineers are not needed for that. Ofc often senior sws are expert in it, but they aren't necessary.

Traditionally its been useful for frictionless production to have engineers to be able to do maybe 90% of their work without consulting the business experts but this is the whole crux of the moment TFA discusses - "tradition" is over.

In this new world its now the job of a senior engineer not to have this domain expertise themselves, but to know how to ensure the agents have it, or can acquire it and it be verifiably correct.

Senior engineers who hang on to the idea that their advanced business domain expertise makes them safe will soon be as dead in the water as juniors who haven't pivoted.

I can't even get Claude or GPT-5 to consistently produce good flows for common use cases, much less domain-specific shit. They have deep vocabulary though, which makes them sound better informed than they are.

They are very good at writing code and debugging visible errors- but that's like 50% the harness.

> LLMs routinely fail at our business specifics: Local tax regulations, particularities of the accounting process, specifics of our ledger implementations.

Would a skill which forces you and LLM to reach a shared understanding of the product features and the regulations those features are supposed to capture be of help here? The main idea is we provide documents to the LLM and it asks lot of questions which clear ambiguity and possible misconceptions the LLM might have. I would suggest please take a look at skills. They are really helpful.

https://www.youtube.com/watch?v=6BB6exR8Zd8

> The main idea is we provide documents to the LLM and it asks lot of questions which clear ambiguity and possible misconceptions the LLM might have

This kind of works but the difficulty is that you have to be very explicit about everything. It was mentioned in a spec document that a particular excel file is treated as a source of truth throughout the whole company and it is treated as an append only database. The agent still decided to add a check to see if a previous row was modified. It pushed back on its decision when asked why it decided to do so. "What if someone entered it wrong and had to correct it"? Valid question but it's not my teams responsibility to check for it

This check makes sense from a traditional development view point and that's why the agent did it. I would say it's good practice too but it's beyond the scope of the project it was working on. If what you are doing is beyond the norm you have to watch out for things like this

Sure but finding their shortcomings and patching them with skills takes real trial and error. They are incapable of identifying their own shortcomings for you.

>> LLMs routinely fail at our business specifics: Local tax regulations, particularities of the accounting process, specifics of our ledger implementations.

My company also deals with a lot of complex regulations and domain-specific system implementations, which AIs used to struggle with. We were able to solve the problem with well-organized claude.md/agents.md files. On top of that we also implemented supermemory.ai, so newly made decisions are always recalled by AI agents when starting new sessions.