An interesting idea for sure, but why only evaluate it on 28x28 pixel images? Why is their flow matching so much worse in some cases? The analysis is missing, and what they do say about it explains nothing:
> For CIFAR-100 with one-hot embeddings, NoProp-FM fails to learn effectively, resulting in very slow accuracy improvement
In general, any real analysis is impossible because the results carry so little signal. Fig 5 tells me nothing when the whole span is 99.58 to 99.46 percent accuracy, a spread of 0.12 percentage points.
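
To put a rough number on that, here is a back-of-the-envelope check of how big the 99.58 vs 99.46 gap is relative to sampling noise. It is only a sketch: the test-set size of 10,000 is my assumption (not a number taken from the paper), the function name is made up, and it treats the two accuracies as independent binomial proportions, which ignores that both models are scored on the same images.

```python
import math


def accuracy_gap_in_standard_errors(acc_a, acc_b, n_test):
    """Gap between two accuracies, in units of the standard error of their difference.

    Simplification: each accuracy is treated as an independent binomial
    proportion measured on n_test examples.
    """
    se_a = math.sqrt(acc_a * (1 - acc_a) / n_test)
    se_b = math.sqrt(acc_b * (1 - acc_b) / n_test)
    se_diff = math.sqrt(se_a**2 + se_b**2)
    return abs(acc_a - acc_b) / se_diff


# n_test=10_000 is an ASSUMED test-set size, used only for illustration.
z = accuracy_gap_in_standard_errors(0.9958, 0.9946, n_test=10_000)
print(f"gap = {z:.2f} standard errors")
```

Under those assumptions the gap works out to roughly 1.2 standard errors, well short of the usual threshold of about 2, so it could plausibly be noise.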