> that’s good enough as far as I’m concerned

But in that case, why use an LLM at all? If we want question-answering machines to be reliable, they need basic skills, and counting is about the most basic example there is.

The purpose of the LLM would be to translate natural language into computer language, not to do the calculation itself.
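
To make that split concrete, here is a rough sketch of what I mean. Everything in it is made up for illustration: `fake_llm_translate` is a hard-coded stand-in for the model (a real LLM would generate the expression), and `run_count_expression` is a hypothetical, deliberately tiny executor that only accepts one shape of expression. The point is just that the counting happens in ordinary code, not in the model.

```python
import ast


def fake_llm_translate(question: str) -> str:
    """Hypothetical stand-in for the LLM: natural language in, code out.

    A real model would generate the expression; here one example is
    hard-coded so the sketch runs without any model.
    """
    canned = {
        "How many times does the letter r appear in strawberry?":
            '"strawberry".count("r")',
    }
    return canned[question]


def run_count_expression(expr: str) -> int:
    """Run only expressions of the form "<text>".count("<text>").

    Anything else is rejected, so generated code cannot do arbitrary things.
    """
    node = ast.parse(expr, mode="eval").body
    is_count_call = (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == "count"
        and isinstance(node.func.value, ast.Constant)
        and isinstance(node.func.value.value, str)
        and len(node.args) == 1
        and not node.keywords
        and isinstance(node.args[0], ast.Constant)
        and isinstance(node.args[0].value, str)
    )
    if not is_count_call:
        raise ValueError("unsupported expression")
    # The actual counting is done by Python, not by the model.
    return node.func.value.value.count(node.args[0].value)


def answer(question: str) -> int:
    """Translate with the (fake) LLM, then compute deterministically."""
    return run_count_expression(fake_llm_translate(question))


print(answer("How many times does the letter r appear in strawberry?"))
```

The model can still get the translation wrong, of course, but a wrong translation is something you can inspect and test, which is not true of a number the model produces directly.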