That's precisely what I'm doing. I split by sentences, and then any sentence that's still too long gets split at natural breakpoints like colons, semicolons, commas, dashes, and conjunctions. If any of /those/ pieces are still too long, I break them up by greedy-filling words. Then I do some fun manipulation on the raw audio tensors to maintain flow.
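
For anyone who wants to see the shape of it, here is a rough sketch of the three-level text splitting in Python. This is not my actual code: the character limit, the regexes, the conjunction list, and the names `split_text` and `greedy_fill` are all made up for illustration, and it leaves out the audio side entirely.

```python
import re

# Hypothetical values: the real limit and breakpoint rules may differ.
MAX_CHARS = 200

# Level 1: split after sentence-ending punctuation.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")

# Level 2: split after a colon/semicolon/comma, before a spaced dash,
# or before a (small, illustrative) set of conjunctions.
BREAKPOINT = re.compile(
    r"(?<=[:;,])\s+"
    r"|\s+(?=-{1,2}\s)"
    r"|\s+(?=(?:and|but|or|so|because)\b)"
)


def greedy_fill(words, limit):
    """Level 3: pack words into chunks, adding words until the limit is hit."""
    chunks, current = [], ""
    for word in words:
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit or not current:
            # A single word longer than the limit is kept whole, not cut.
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def split_text(text, limit=MAX_CHARS):
    """Split text by sentence, then by breakpoint, then by greedy word fill."""
    chunks = []
    for sentence in SENTENCE_END.split(text.strip()):
        if not sentence:
            continue
        if len(sentence) <= limit:
            chunks.append(sentence)
            continue
        for piece in BREAKPOINT.split(sentence):
            if not piece:
                continue
            if len(piece) <= limit:
                chunks.append(piece)
            else:
                chunks.extend(greedy_fill(piece.split(), limit))
    return chunks


if __name__ == "__main__":
    sample = "Short one. This is a long sentence, with a comma and more words here."
    for chunk in split_text(sample, limit=20):
        print(repr(chunk))
```

Note that this version emits every breakpoint piece separately once a sentence is over the limit; merging adjacent short pieces back together up to the limit would be a reasonable refinement.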