This is what GP eludes to, the original dataset has many similar reference images (i.e. the common mode is the same), and the LatentCSI model is tasked to reconstruct the correct specific instance (or a similarly plausible image in case of the test/validation set)