To be more specific, practically all online streaming today is built on segmented video, where the video is already split into regular X-second chunks. If you hardsubbed only the typesetting and kept the dialogue softsubbed (which could then be offered in a simpler subtitle format where necessary), you would only need multiple copies of the segments that actually contain typesetting. You would then build one playlist per subtitle variant, each pointing at the correct segment variants and sharing the untouched segments with every other playlist, and this would work basically everywhere.
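As a rough illustration of the playlist idea, here is a minimal sketch that writes one HLS media playlist per language. HLS is only one example of a segmented format. The file naming scheme (`seg00001.en.ts`, `seg00001.clean.ts`), the `build_playlist` helper, and the segment durations are all made up for this example. It also assumes the typeset variants are encoded with the same parameters and timestamps as the clean segments, so a player can switch between them without a discontinuity marker.

```python
import math


def build_playlist(durations, typeset_segments, lang):
    """Return the text of an HLS media playlist for one subtitle language.

    durations: segment durations in seconds, in playback order.
    typeset_segments: set of segment indices that contain typesetting.
    lang: language tag used in the (hypothetical) variant file names.
    """
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        # Target duration must be an integer >= every segment duration.
        f"#EXT-X-TARGETDURATION:{math.ceil(max(durations))}",
        "#EXT-X-MEDIA-SEQUENCE:0",
        "#EXT-X-PLAYLIST-TYPE:VOD",
    ]
    for i, dur in enumerate(durations):
        if i in typeset_segments:
            # Hardsubbed copy that exists once per language.
            name = f"seg{i:05d}.{lang}.ts"
        else:
            # Clean segment shared by every playlist.
            name = f"seg{i:05d}.clean.ts"
        lines.append(f"#EXTINF:{dur:.3f},")
        lines.append(name)
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


# Hypothetical episode: five 6-second segments, signs in segments 1 and 3.
durations = [6.0] * 5
typeset_segments = {1, 3}
playlists = {
    lang: build_playlist(durations, typeset_segments, lang)
    for lang in ("en", "de")
}
```

With these made-up numbers, the two languages need 3 shared clean segments plus 2 typeset segments each, so 7 files on disk rather than the 10 that two fully hardsubbed encodes would take.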
You could use the same kind of segment-based playlist approach on Blu-ray as well, though in theory you should be able to use the Blu-ray Picture-in-Picture feature instead: store the typesetting in an entirely separate, partially transparent video stream that is then overlaid on top of the clean video during playback.