I think most organizations move through a progression like this:

llm -> prompt -> result

llm -> prompt + prompt encoded as skill -> result

llm -> prompt + deterministic code encoded as skill -> result

I do think prompting the model to generate code early can shortcut the path to deterministic code, but we're still essentially embedding deterministic code in a non-deterministic wrapper. In many cases a layer of determinism is missing, and that layer is what actually makes long-horizon tasks successful. We need deterministic code outside the non-deterministic boundary as well, in the form of an agentic loop or framework. That leaves the non-deterministic decision making sandwiched between layers of determinism:

deterministic agentic flows -> non-deterministic decision making -> deterministic tools
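
To make the sandwich concrete, here is a minimal Python sketch, not a real framework. Everything in it is a made-up illustration: `run_agent` plays the deterministic outer flow (step budget, validation, stop condition), `make_fake_model` is a seeded random stand-in for an LLM call, and `TOOLS` holds two toy deterministic tools. The task (reach a target integer) is only there so the loop has something to check.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Inner deterministic layer: tools are plain functions with fixed behavior.
TOOLS: Dict[str, Callable[[int], int]] = {
    "increment": lambda x: x + 1,
    "double": lambda x: x * 2,
}


@dataclass
class RunResult:
    state: int
    done: bool
    trace: List[str] = field(default_factory=list)


def make_fake_model(seed: int) -> Callable[[int, int], str]:
    """Hypothetical stand-in for an LLM call (assumption, not a real API).

    It picks a tool name at random and sometimes returns one that does not
    exist, which is the kind of output the outer layer has to tolerate.
    """
    rng = random.Random(seed)
    options = ["increment", "double", "delete_everything"]
    return lambda state, target: rng.choice(options)


def run_agent(
    start: int,
    target: int,
    choose: Callable[[int, int], str],
    max_steps: int = 50,
) -> RunResult:
    """Outer deterministic layer: bounded loop, validation, stop condition."""
    state = start
    trace: List[str] = []
    for _ in range(max_steps):
        if state == target:
            return RunResult(state, True, trace)
        # Middle layer: the only non-deterministic step is this decision.
        choice = choose(state, target)
        if choice not in TOOLS:
            # Unknown tool names never reach execution.
            trace.append(f"rejected:{choice}")
            continue
        candidate = TOOLS[choice](state)
        if candidate > target:
            # Deterministic guard: refuse moves that overshoot the target.
            trace.append(f"rejected:{choice}")
            continue
        state = candidate
        trace.append(choice)
    return RunResult(state, state == target, trace)
```

Calling `run_agent(0, 3, lambda s, t: "increment")` finishes in three accepted steps, while a chooser that only ever returns `"delete_everything"` burns its step budget without changing the state. The model can be wrong as often as it likes; the outer loop bounds the damage and the tools do exactly what they say.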

This has been a very powerful pattern in my experiments and it gets even stronger when the agents are building their own determinism via tools like auto-researcher.