This is cool. If you don’t mind me asking, are you using the EDGAR API or something else?

Thanks! No, we built the full pipeline from the ground up (with some help) using the EDGAR archives. Cleaning up SEC data was a nightmare, but we eventually got there.

Fascinating. The EDGAR API looked too good to be true. Sounds like it was.

From what I gather, there is an "official" EDGAR API that is operated not by the SEC but by a company it authorized. It costs tens of thousands of dollars per year and, by some accounts, is not that great.

Then you have a bunch of third-party providers that offer APIs returning cleaned-up SEC filings and charge something like a few thousand dollars a year. The downside is the latency they add.

Yeah, it's a SCP/SFTP service managed by a consulting company:

https://www.sec.gov/search-filings/public-dissemination-serv...