Given it's an overfitted transformer, isn't it still terribly inaccurate, or liable to add random bytes here and there?

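If I were only using a transformer, that would be true, but we use arithmetic coding alongside our transformer to fix those mistakes (in layman's terms). The transformer only supplies probabilities for the next symbol, so a bad prediction costs extra bits rather than producing wrong bytes. You can read about arithmetic coding; it's a pretty cool topic.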
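
To make that concrete, here is a toy sketch of the idea, not our actual code. `model` is a hypothetical stand-in for the transformer (a hard-coded, deliberately imperfect predictor over a three-symbol alphabet), and exact fractions replace the fixed-precision integer arithmetic a real coder would use. The point it illustrates: as long as the encoder and decoder query the same deterministic model, decoding is exact, and poor predictions only make the output longer.

```python
from fractions import Fraction
import math

SYMBOLS = "ab$"  # toy alphabet; '$' is an end-of-message marker (assumption)

def model(context):
    """Hypothetical stand-in for the transformer: P(next symbol | context).
    It guesses that 'a' and 'b' alternate, which is often wrong."""
    if context.endswith("a"):
        return {"a": Fraction(1, 5), "b": Fraction(7, 10), "$": Fraction(1, 10)}
    return {"a": Fraction(7, 10), "b": Fraction(1, 5), "$": Fraction(1, 10)}

def encode(message):
    """Narrow [low, low + width) once per symbol, then emit the shortest
    binary fraction that lies inside the final interval."""
    low, width = Fraction(0), Fraction(1)
    context = ""
    for sym in message + "$":
        probs = model(context)
        cum = Fraction(0)
        for s in SYMBOLS:
            if s == sym:
                break
            cum += probs[s]
        low, width = low + width * cum, width * probs[sym]
        context += sym
    k = 1
    while True:
        n = math.ceil(low * 2**k)
        if Fraction(n, 2**k) < low + width:
            return format(n, f"0{k}b")
        k += 1

def decode(bits):
    """Replay the same model and pick, at each step, the symbol whose
    sub-interval contains the encoded value."""
    value = Fraction(int(bits, 2), 2 ** len(bits))
    low, width = Fraction(0), Fraction(1)
    out = ""
    while True:
        probs = model(out)
        cum = Fraction(0)
        for s in SYMBOLS:
            if low + width * cum <= value < low + width * (cum + probs[s]):
                break
            cum += probs[s]
        low, width = low + width * cum, width * probs[s]
        if s == "$":
            return out
        out += s

for msg in ["abab", "bbbb"]:
    bits = encode(msg)
    print(msg, "->", bits, "->", decode(bits))
```

Both messages decode back exactly. The model predicts "abab" well and "bbbb" badly, so the second one just takes more bits; no bytes get corrupted.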