vocabulary*

  *In the code above, we collect all unique characters across the dataset