That's a counterargument to a different thing.
Iteratively measuring loss is a way to reconstruct values. That's trivial to show for a single value: if 5 gives you a loss of 2 and 9 gives you a loss of 2, then you know the missing value is 7.
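A toy sketch of that reconstruction, assuming an absolute-error loss; the hidden value and probe points are made up for illustration:

    HIDDEN = 7.0  # the value we pretend we cannot see directly

    def loss(guess):
        # absolute-error loss against the hidden value
        return abs(guess - HIDDEN)

    a, b = 5.0, 9.0
    la, lb = loss(a), loss(b)  # both come back as 2.0

    # each measurement leaves two candidates; their intersection is the hidden value
    candidates = {a - la, a + la} & {b - lb, b + lb}
    print(candidates)  # {7.0}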
A model with enough parameters can memorise the training set in a similar manner. Technically the model hasn't seen that data by direct input either, but the mechanism provides the means to determine what the data was. In that respect it is reasonable to say the model has seen the data.
Performing well on examples not in the training set is doing something else.
Any attempt to characterise that as having been seen before negates any distinction between taking in data and reasoning about that data.
Yeah, because "seeing" is also tweaking the parameters, which this example is doing manually.
So I don't understand how anyone can make the claim that the model has not seen it, because the internal transformation is similar.
You are going to have to be more specific, because that reads like nonsense.
By what mechanism do you propose the model observed the test set?
>By what mechanism do you propose the model observed the test set..
By explicitly setting the model parameters.
What happens when a model is trained? We tweak the model parameters by some feedback.
In both cases, you affect the model parameters. Only the method is different. So both are equivalent to "model observing the test set".
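A toy sketch of what I mean, with a made-up one-parameter model (y = w * x); whether w is nudged into place by loss feedback or written in by hand, the parameter ends up the same:

    TARGET_W = 3.0  # the "generalising rule": y = 3 * x

    def train(w, data, lr=0.1, steps=200):
        # tweak the parameter by feedback: gradient of squared error
        for _ in range(steps):
            for x, y in data:
                grad = 2 * (w * x - y) * x
                w -= lr * grad
        return w

    data = [(1.0, 3.0), (2.0, 6.0)]  # examples consistent with y = 3x

    w_trained = train(0.0, data)     # parameter reached via training feedback
    w_explicit = TARGET_W            # parameter set by hand, no training

    print(round(w_trained, 4), w_explicit)  # both ~ 3.0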
I still do not see any causal link from the test set. When was this observed, how, and by whom?
Are you trying to say that the person who entered the parameters had access to the test set? I find it more likely that they encoded the generalising rule than observed every instance of its use.
>I find it more likely that they encoded the generalising rule..
Look, I am saying that during training the model ends up "learning" the generalising rule from training data, but here it was explicitly entered into it, without any training.