Thanks! No, we built the full pipeline from the ground up (with some help) using the EDGAR archives. Cleaning up the SEC data was a nightmare, but we eventually got there.

Fascinating. The EDGAR API looked too good to be true. Sounds like it was.

From what I gather, there is an "official" EDGAR API that is operated not by the SEC but by a company it authorized. It costs tens of thousands per year and, by some accounts, is not that great.

Then you have a bunch of third-party providers offering APIs that return cleaned-up SEC filings and charge a few thousand dollars a year. The downside is the latency they add.

Yeah, it's an SCP/SFTP service managed by a consulting company:

https://www.sec.gov/search-filings/public-dissemination-serv...