With 9M params, it just repeats the joke from its training dataset.