This. This is the most important thing to consider: the corpus the model was trained on. Remember that LLMs are inferring code. They don't "know" anything at all about code's axiomatic workings; they just know what "looks right" and what "looks wrong". Agentic workflows and RL are about to make this philosophy obsolete on a grand scale, but the signs still don't look good for improving how much they can "hold in their head" when inferring which token to spit out next from the vector embedding, tho.
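To make that last bit concrete, here's a toy sketch in numpy of what "inferring the next token from the vector embedding" boils down to. Everything here is made up for illustration (toy vocab, random stand-in weights, not a real model): the point is just that the model scores every token against a context vector and samples whichever "looks right".

```python
import numpy as np

rng = np.random.default_rng(0)

vocab = ["def", "return", "x", "+", "1", "\n"]
d_model = 8                                       # embedding width (toy size)

# Hypothetical stand-ins for learned parameters of a real trained model.
unembed = rng.normal(size=(d_model, len(vocab)))  # projects hidden state to vocab logits
hidden = rng.normal(size=d_model)                 # context vector after the transformer stack

logits = hidden @ unembed                         # one score per vocabulary token
probs = np.exp(logits - logits.max())
probs /= probs.sum()                              # softmax: "how right does each token look?"

next_token = vocab[int(rng.choice(len(vocab), p=probs))]
print(next_token)                                 # no axioms consulted, just a distribution
```

There's no step in there where the code's semantics get checked, which is the whole point: it's a distribution over plausible-looking continuations, and the context vector is the only "memory" it has to work with.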