Hacker News

I’ve spent most of my career in Job Tech and HR Tech, so I can give you some insight.

LinkedIn lost a lawsuit in which it was specifically trying to prevent scraping of its job content. This was a big deal at the time, as all the major job sites were scraping jobs, either because a customer lacked the technical capabilities to supply them or to add jobs that were not already part of the corpus. It’s a routine part of how the industry works, in the same way that not many people complain about Google scraping the Internet.

Secondly, and actually more importantly, a lot of the industry exchanges jobs using feeds. This is where CPC job distribution generally exists. These feeds can contain “free” jobs, which do not get any CPC credit but would let you build something like this site without any scraping infrastructure. You can ask some major job boards for just the free jobs if you wanted that for some reason. Most job-specific scraping services like Aspen and Feedonomics will deliver scraped jobs in your preferred feed format.
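
For illustration, here is a minimal sketch of pulling only the "free" jobs out of a feed. Feed formats vary by partner, so the XML layout, the `cpc` element, and the rule "missing or zero CPC means free" are all assumptions made up for this example, not any real board's schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: element names and structure are invented for this sketch.
FEED_XML = """\
<jobs>
  <job>
    <title>Backend Engineer</title>
    <url>https://example.com/jobs/1</url>
    <cpc>0.45</cpc>
  </job>
  <job>
    <title>Warehouse Associate</title>
    <url>https://example.com/jobs/2</url>
    <cpc>0</cpc>
  </job>
  <job>
    <title>Nurse Practitioner</title>
    <url>https://example.com/jobs/3</url>
  </job>
</jobs>
"""


def free_jobs(feed_xml):
    """Return jobs whose CPC is absent or zero (assumed meaning of 'free')."""
    root = ET.fromstring(feed_xml)
    result = []
    for job in root.iter("job"):
        # Treat a missing or empty <cpc> element as a CPC of zero.
        cpc_text = (job.findtext("cpc") or "").strip() or "0"
        if float(cpc_text) == 0.0:
            result.append(
                {"title": job.findtext("title"), "url": job.findtext("url")}
            )
    return result


for job in free_jobs(FEED_XML):
    print(job["title"], job["url"])
```

A real feed would need its own field mapping, and whether a job counts as free is something to confirm with the feed provider rather than infer from a field.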

In practice, where I’ve worked we just stopped scraping a site when we got a complaint, and we respected robots.txt. It was rare for someone to complain since we were good sources of traffic wherever I worked. I am not a lawyer, but my understanding is that as long as you’re not otherwise breaking the law, for example by respecting legitimate takedown notices, then scraping is fair use.
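
As a rough sketch of what "respecting robots.txt" can look like in code, Python's standard library ships `urllib.robotparser`. The robots.txt contents, the `example-job-bot` user agent, and the URLs below are hypothetical; a real crawler would fetch the file from the target site (for instance with `set_url()` and `read()`) instead of parsing an inline string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, inlined so the example runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

USER_AGENT = "example-job-bot"  # hypothetical crawler name


def build_parser(robots_txt):
    """Build a parser from robots.txt text instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Check each URL before requesting it; skip anything that is disallowed.
for url in ("https://example.com/jobs/123", "https://example.com/private/x"):
    print(url, parser.can_fetch(USER_AGENT, url))

# Crawl-delay is a non-standard directive; crawl_delay() returns None if absent.
print("crawl delay:", parser.crawl_delay(USER_AGENT))
```

This only covers the robots.txt half of the comment. Handling complaints and takedown notices is a process question, not something a parser can decide.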

xitang a day ago

Thanks for sharing the industry insights! Very cool to learn about CPC job distribution, etc.

I have also read about the LinkedIn vs. hiQ Labs lawsuit. The ruling is significant because the court found that scraping public data did not violate the CFAA, though it did violate LinkedIn's ToS. LinkedIn ultimately won in the end because hiQ went bankrupt. One of my takeaways is to scrape responsibly by rate limiting requests, not overloading the server, etc.
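
One simple way to rate limit requests is to enforce a minimum gap between them. The sketch below is only an illustration: `MinIntervalLimiter` and its parameters are names made up here, and the right interval depends on the site (a Crawl-delay in robots.txt, if present, is a reasonable starting point).

```python
import time


class MinIntervalLimiter:
    """Ensure at least `min_interval` seconds pass between calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the limiter can be tested
        self._sleep = sleep    # without actually sleeping
        self._last = None      # time of the previous request, if any

    def wait(self):
        """Block until the next request is allowed, then record its time."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    limiter = MinIntervalLimiter(min_interval=1.0)
    for url in ("https://example.com/jobs/1", "https://example.com/jobs/2"):
        limiter.wait()  # call this before every request to the same host
        print("would fetch", url)  # placeholder for the actual HTTP request
```

A fixed interval per host is the bluntest option. Backing off when the server returns errors or slows down goes further toward "not overloading the server".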