I would categorize this in the "expertise that people internalize but never figure out how to verbalize" department, and that is a department we have no way to teach an LLM because if nobody is writing out those unspoken, subconscious rules then the LLM has nothing to read about them in its training data.

This is often called tacit knowledge. https://en.wikipedia.org/wiki/Tacit_knowledge

My favorite example of this is knowing how to untangle a big pile of cables. There are now robots that can untie a single knotted cable, but I don't think any can handle a whole pile of cables yet. https://www.youtube.com/watch?v=vp-94rsherE

> and that is a department we have no way to teach an LLM because if nobody is writing out those unspoken, subconscious rules then the LLM has nothing to read about them in its training data.

I think the contrary is true: LLM providers accumulate huge logs of interactions with their users. Those sessions elicit that tacit knowledge, the providers mine it, and humans cooperate willingly in order to solve their tasks. Just imagine the corpus of sessions for scientific research, education, or software development; it is probably the largest such collection ever to exist. Trillions of human-in-the-loop (HITL) tokens per day flow into those logs, carrying our perspectives, choices, original ideas, and tacit knowledge. I call this the "human-AI experience flywheel". It's the new Stack Overflow: the next model generation is based on interaction data from the previous one.

Good point. The same probably applies to code as well: coders rarely tell us why they wrote the code the way they did. And if they have comments in their code, those are highly untrustworthy, because nobody fixes comments if the code works.