Can't agree more on 5. I've repeatedly found that any really tricky programming problem is (eventually) solved by iterative refinement of the data structures (and the APIs they expose / are associated with). When you get it right, the control flow of the program becomes straightforward to reason about.

To address our favorite topic: while I use LLMs to assist on coding tasks a lot, I think they're very weak at this. Claude is much more likely to suggest or expand complex control flow logic on small data types than it is to recognize and implement an opportunity to encapsulate ideas in composable chunks. And I don't buy the idea that this doesn't matter since most code will be produced and consumed by LLMs. The LLMs of today are much more effective on code bases that have already been thoughtfully designed. So are humans. Why would that change?
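A toy sketch of what I mean (the interval example is mine, not from the article): the first version spreads the logic across call sites as control flow; the second encapsulates the same idea in a small composable type.

```python
from dataclasses import dataclass

# Control-flow-heavy style: interval logic re-derived at every call site.
def overlaps(a_start, a_end, b_start, b_end):
    if a_start <= b_start:
        return b_start < a_end
    else:
        return a_start < b_end

# Data-first style: encapsulate the idea once, then compose it.
@dataclass(frozen=True)
class Interval:
    start: int
    end: int  # half-open [start, end)

    def overlaps(self, other: "Interval") -> bool:
        return self.start < other.end and other.start < self.end

    def merge(self, other: "Interval") -> "Interval":
        # Assumes the two intervals overlap or touch.
        return Interval(min(self.start, other.start), max(self.end, other.end))
```

Once the type exists, downstream code stops branching on raw endpoints and just asks the interval questions directly.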

Agreed, in my experience rule 5 should be rule 1. I've also heard it paraphrased as "show me your code and I'll be forever confused; show me your database schema and everything will become obvious".

Having implemented my share of highly complex, high-performance algorithms in the past, I've found the key was always figuring out how to massage the raw data into structures that let the algorithm fly. It requires both a decent knowledge of the algorithm options available and the flexibility to see that the data could be presented a different way to get the same result orders of magnitude faster.
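A minimal (made-up) example of that kind of massaging: the same neighbor query, first scanning the raw edge list on every call, then after a one-time restructuring into an adjacency index.

```python
from collections import defaultdict

edges = [(0, 1), (1, 2), (0, 2), (2, 3)]

# Raw-data version: every query rescans the whole edge list, O(E) per query.
def neighbors_slow(node):
    return [b for a, b in edges if a == node]

# Restructured version: build an adjacency index once (O(E)),
# then every query is a direct lookup.
def build_adjacency(edge_list):
    adj = defaultdict(list)
    for a, b in edge_list:
        adj[a].append(b)
    return adj

adj = build_adjacency(edges)
```

Same answers either way; the second shape is what lets a graph algorithm that does millions of neighbor lookups actually fly.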

I think you are referring to:

"Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious." -- Fred Brooks, The Mythical Man Month (1975)

I have seen a huge decline in data-first design over the past decade-plus, maybe related to more pragmatic training where code-first work and abstraction helped you go faster earlier. But I definitely came of age starting with the schema, and there are an awful lot of problems & systems that are essentially UI and functions on top of the schema.

UI + functions on top of schema if you've designed the schema well. Otherwise, it's a whole other thing.

I don't think they were ever meant to be in order of importance.

> refinement of the data structures (and the APIs they expose / are associated with)

I think rule 5 is often ignored by a lot of distributed services, where you have to make several calls, each with its own HTTP, DB, and "security" overhead, when one would do. Then each of these ends up with a caching layer because they are "slow" (in aggregate).

If you're doing it right, you start with a centralized service; get the product, software architecture, and data flows right while it's all in one process; and then distribute along architectural boundaries when you need to scale.

Very few software services built today are doing it right. Most assume they need to scale from day one, pick a technology stack to enable that, and then alter the product to reflect the limitations of the tech stack they picked. Then they wonder why they need to spend millions on sales and marketing to convince people to use the product they've built, and millions on AWS bills to scale it. But then, the core problem was really that their company did not need to exist in the first place and only does because investors insist on cargo-culting the latest hot thing.

This is why software sucks so much today.

>> If you're doing it right, you start with a centralized service; get the product, software architecture, and data flows right while it's all in one process; and then distribute along architectural boundaries when you need to scale.

I'll add one more modification if you're like me (and apparently many others): go too far with your distribution, then pull it back to a sane number (i.e., a small handful) of distributed services, hopefully before you get too far down the implementation...