I'm just playing devil's advocate here.

Yes, but "how to cook" is still not atomic. It involves knowing how to move stuff, how to measure, what "cooked" looks like in different environments (e.g. different lighting) or with variations in ingredients, and how to recover from specific failures (e.g. a good cook can fix accidentally adding too much salt by counter-balancing with an ingredient that absorbs the extra salt). And this is only one skill.

It's a bit like how deep image neural nets work: simply detecting shape primitives is not enough, because the net is also the connections and relations between those primitives.

Even saying the AI should just have the "cooking" or "coding" skill trivializes the problem.

> Humans spend 99% of their life on boring repeating tasks

But we are also unconsciously learning about the world non-stop, from the analog stream of inputs and from seeing the immediate result/feedback. Even looking at a static picture is like over-training on a specific dataset.

Just boiling water would be difficult. Do I just add heat until I see bubbles? Or should I have a world model in which I understand that the boiling point varies with altitude and with the liquid being boiled?

Because if the recipe just says "boil for 10 minutes" but the thing being cooked really needs 10 minutes at 212°F, it isn't going to be cooked if the water never actually reaches 212°F for those 10 minutes.
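
To put rough numbers on that, here is a minimal sketch. It assumes the common kitchen rule of thumb that water's boiling point drops about 1°F for every 500 ft of elevation, which is an approximation and not a proper physical model. The function names and the Denver altitude figure are just mine for illustration.

```python
# Rough sketch: can "boil for 10 minutes" actually reach the temperature
# a recipe silently assumes? Uses a rule of thumb, not real thermodynamics.

SEA_LEVEL_BOILING_F = 212.0
FEET_PER_DEGREE_F = 500.0  # assumed approximation: ~1 °F drop per 500 ft


def boiling_point_f(altitude_ft: float) -> float:
    """Approximate boiling point of plain water (°F) at a given altitude."""
    return SEA_LEVEL_BOILING_F - altitude_ft / FEET_PER_DEGREE_F


def boil_reaches(target_f: float, altitude_ft: float) -> bool:
    """True if boiling water at this altitude gets at least as hot as target_f."""
    return boiling_point_f(altitude_ft) >= target_f


if __name__ == "__main__":
    # Hypothetical kitchens: one at sea level, one at roughly Denver's elevation.
    for name, altitude_ft in [("sea level", 0.0), ("about 5280 ft", 5280.0)]:
        temp = boiling_point_f(altitude_ft)
        ok = boil_reaches(212.0, altitude_ft)
        print(f"{name}: water boils near {temp:.1f} °F, reaches 212 °F: {ok}")
```

So "add heat until I see bubbles" gives the same visual signal in both kitchens, but only one of them is actually at the temperature the recipe had in mind. A cook (or an AI) without that world model has no way to know the difference from the bubbles alone.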