I've seen this approach applied to spectrograms. Convolutions make reasonable sense there.