There are APIs, such as Jina AI's Reader API, that do this pretty well. Its output isn't as clean as Sosumi's for Apple docs, but it's free and does a decent job.

https://jina.ai/reader
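
For anyone who wants to try it, here's a rough sketch of how I understand the Reader to be called: you prepend `https://r.jina.ai/` to the full URL of the page and get Markdown-ish text back. The prefix is my assumption about the current public endpoint, and the helper names (`reader_url`, `fetch_markdown`) and the User-Agent string are made up for illustration, so check the page linked above before relying on any of it.

```python
import urllib.request

# Assumed public prefix for the Reader endpoint; verify against the Jina docs.
READER_PREFIX = "https://r.jina.ai/"


def reader_url(target_url: str) -> str:
    """Build the Reader URL by prepending the prefix to the full target URL."""
    return READER_PREFIX + target_url


def fetch_markdown(target_url: str, timeout: float = 30.0) -> str:
    """Fetch the Reader's text rendering of target_url (performs a network call)."""
    request = urllib.request.Request(
        reader_url(target_url),
        headers={"User-Agent": "reader-example/0.1"},  # hypothetical UA string
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example usage (needs network access, so left commented out):
# text = fetch_markdown("https://developer.apple.com/documentation/swiftui")
# print(text[:500])
```

This only uses the standard library; rate limits or API keys, if any apply, aren't handled here.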