I think your recommendation falls within
> all of them will have some strengths and weaknesses
Sometimes a higher-parameter model with less quantization and a small context will be the best, sometimes a lower-parameter model with some quantization and a huge context will be the best, and sometimes a high parameter count + lots of quantization + medium context will be the best.
It's really hard to say one model is better than another in a general way, since it depends on so many things like your use case, the prompts, the settings, quantization, quantization method and so on.
If you're building (or trying to build) anything that depends on LLMs in any capacity, the first step is coming up with your own custom benchmark/evaluation that puts your specific use cases under test. Don't share it publicly (so it doesn't end up in the training data), and run it to figure out which model is best for that specific problem.
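
To make that concrete, here is a rough sketch of what such a private eval could look like. Everything in it is made up for illustration: `Case`, `run_eval`, the two prompts, and the stub "models" are placeholders, not part of any real library. In practice each entry in `MODELS` would wrap whatever inference setup you actually use (a specific model + quantization + context size), and the checks would come from your real use case.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: a "model" is any function from prompt text to completion text.
# Swap the stubs below for wrappers around your real inference stack.
Model = Callable[[str], str]


@dataclass
class Case:
    prompt: str
    passes: Callable[[str], bool]  # task-specific pass/fail check


def is_json_with_ok(text: str) -> bool:
    """Pass only if the output parses as a JSON object with an 'ok' key."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and "ok" in parsed


def run_eval(models: Dict[str, Model], cases: List[Case]) -> Dict[str, float]:
    """Return the fraction of cases each model passes."""
    scores = {}
    for name, generate in models.items():
        passed = sum(1 for case in cases if case.passes(generate(case.prompt)))
        scores[name] = passed / len(cases)
    return scores


# Hypothetical cases; keep the real ones private.
CASES = [
    Case("What is 2 + 2? Answer with the number.", lambda out: "4" in out),
    Case("Reply with a JSON object containing the key 'ok'.", is_json_with_ok),
]


# Stand-ins for two different model/quantization/context configurations.
def stub_a(prompt: str) -> str:
    return "The answer is 4."


def stub_b(prompt: str) -> str:
    return '{"ok": true}' if "JSON" in prompt else "4"


MODELS = {"stub_a": stub_a, "stub_b": stub_b}

if __name__ == "__main__":
    for name, score in run_eval(MODELS, CASES).items():
        print(f"{name}: {score:.0%} of cases passed")
```

The point of keeping it this small is that you can rerun the same cases every time you change the model, the quantization, or the settings, and compare the numbers for your own problem rather than relying on general rankings.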