I guess, in theory, the prior distribution of language would allow for improved performance in some cases, especially where input quality is low.

This is already used in OCR, tesseract uses that.