Hello, lead author here. First: you are right! A surrogate model is a fancy interpolator, so at best it will be as good as the model it is trying to mimic, not better. The piece that probably got lost in translation is that the codes we are mimicking have accuracy settings which you sometimes can't push to maximum because of the computational cost. But with the kind of tools we are developing, we can push those settings when we create the training dataset (as this is much cheaper than running the full analysis). In this way, the emulator can end up more accurate than the original code with "standard settings", because it has been trained on outputs produced with more accurate settings. This claim of course needs to be checked: if I include an effect that changes the final answer by only 0.1% but the surrogate has an emulation error of order 1%, the claim clearly would not hold.
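To make the check concrete, here is a minimal toy sketch (not our actual pipeline): `simulator` is a stand-in for an expensive code whose accuracy is set by the number of series terms kept, and the emulator is just a scikit-learn Gaussian process. The claim above only survives if the emulation error comes out smaller than the error the "standard settings" make relative to the accurate ones.

```python
# Toy illustration, assuming a made-up "simulator" with a tunable accuracy knob.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def simulator(x, n_terms):
    """Stand-in for an expensive code: a truncated series, more terms = more accurate."""
    k = np.arange(1, n_terms + 1)
    return np.sum(np.sin(np.outer(x, k)) / k**2, axis=1)

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 3.0, size=50)
x_test = rng.uniform(0.0, 3.0, size=500)

# Training set generated once with the accurate (expensive) settings...
y_train_accurate = simulator(x_train, n_terms=200)
# ...while a full analysis would normally only afford the standard settings.
y_test_standard = simulator(x_test, n_terms=5)
y_test_accurate = simulator(x_test, n_terms=200)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), normalize_y=True)
gp.fit(x_train.reshape(-1, 1), y_train_accurate)
y_emulated = gp.predict(x_test.reshape(-1, 1))

# The emulator beats the standard-settings run only if the first number is
# smaller than the second.
emulation_error = np.max(np.abs(y_emulated - y_test_accurate))
settings_error = np.max(np.abs(y_test_standard - y_test_accurate))
print(f"emulation error:         {emulation_error:.2e}")
print(f"standard-settings error: {settings_error:.2e}")
```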