As a (former) reporter, site monitoring is a big part of what I do on a daily basis and I used many, many such services.
I can attest that, at least from the landing page, this seems to be a very good execution of the concept, especially the text-based diffing to easily spot what changed and, most importantly, how.
The biggest hurdle for such apps however are 'js-based browser-rendered sites' or whatever they're called nowadays. How does Site Spy handle such abominations?
Thanks, that’s a really good question. Site Spy uses a real browser flow, so it generally handles JS-rendered pages much better than simple HTML-only polling tools. In practice, the trickier cases tend to be sites with aggressive anti-bot protection or messy login/session flows rather than JS itself. I’m trying to make those limitations clearer so people don’t just hit a vague failure and feel let down
Curious how you're thinking about getting around anti-bot protection. I scrape a lot and I've noticed many highly trafficked sites investing in anti-bot measures recently, with the rise of AI browsers and such. Still, cool idea, congrats on the launch.
I'm planning to add proxy rotation across different regions to help with geo-restricted content and rate limiting. Anti-bot is an arms race though — some sites just can't be monitored without solving a captcha, which isn't something I'm trying to do. Focused on making the common cases work well rather than promising to bypass everything.
JS-rendered websites are sometimes even better, they usually have some sort of internal API that you can access directly instead of relying on the website styling which may change at any moment.
I'm a fellow reporter who needs to keep tabs on some websites. I used various tools, including running my own Klaxon[1] instance, but these days I find it easier to just quickly vibe-code a crawler and use GitHub Actions to run it periodically. You can make it output an RSS feed, email you, archive it with archive.today, take a screenshot, or trigger whatever action you want.
1: https://github.com/themarshallproject/klaxon
What do you do now if you don’t mind sharing?