None of what has been described is a "skill issue". The problem is that an identical prompt produces poor results once the context grows past 200k tokens or so.