There's plenty of evidence that good prompts (prompt engineering, tuning) can result in better outputs.
Improving LLM output through better inputs is neither an illusion nor as easy as learning how to google (entire companies are being built around improving LLM outputs and measuring that improvement).
Sure, but tricks and techniques that work with one model often don't translate to others, or are actively harmful, especially when you compare models from today with those from 6 or more months ago.
Keep in mind that the first reasoning model (o1) was released less than 8 months ago, and Claude Code was released less than 6 months ago.
Yes, though that just means the probability of success is a function of not only user input but also the model version.
Slot machines, on the other hand, are truly random: success is luck-based, with no priors (the legal ones in the US, anyway).
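A minimal sketch of the point this thread lands on: for a fixed (prompt, model version) pair there's a stable underlying success probability you can estimate empirically, and the prompt that wins on one model version can lose on the next. Everything here is hypothetical (the model names, `call_model`, and the pass/fail outcome stand in for a real API and a real grader):

```python
import random

def pair_success_prob(model: str, prompt: str) -> float:
    # Derive a stable "true" success probability for this (model, prompt)
    # pair from a seeded RNG -- purely illustrative, not a real model.
    return random.Random(f"{model}|{prompt}").uniform(0.3, 0.9)

def call_model(model: str, prompt: str) -> str:
    """Hypothetical stand-in for one sampled LLM call, graded pass/fail."""
    return "PASS" if random.random() < pair_success_prob(model, prompt) else "FAIL"

def success_rate(model: str, prompt: str, trials: int = 200) -> float:
    """Empirical success rate: same prompt, same model, repeated sampling."""
    wins = sum(call_model(model, prompt) == "PASS" for _ in range(trials))
    return wins / trials

prompts = {
    "terse": "Summarize the report.",
    "step_by_step": "Think step by step, then summarize the report.",
}
models = ["model-2024-06", "model-2025-01"]  # hypothetical version strings

for model in models:
    rates = {name: success_rate(model, p) for name, p in prompts.items()}
    print(model, rates)  # the winning prompt can flip between versions
```

Unlike a slot machine's fixed odds, `success_rate` here is a measurable property of the (prompt, model) pair, which is why a prompting technique benchmarked on one version needs re-measuring on the next.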