I think the difference is that with a human you can say something ambiguous like "handle error cases" and they will put thought into which errors can actually come up. The LLM will just translate those tokens into if statements that do some validation and check return values after calls. The depth of thought is very different.
But that is just a difference of degree, not of kind.
There is a difference between a human and an AI, and it is more than a difference of degree, but filling in gaps with something that fits is not a very significant part of it. That can be done perfectly mechanistically.