> scraping LinkedIn profiles
is this legal? last time I checked, linkedin.com/robots.txt does not allow scraping without explicit approval from LinkedIn
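For anyone who wants to check what a robots.txt file says about a given path, here is a minimal sketch using Python's standard `urllib.robotparser`. The `is_allowed` helper and the `SAMPLE_RULES` text are made up for illustration and are not LinkedIn's actual file. It only reports what robots.txt requests of crawlers; it says nothing about whether scraping is legal.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules for illustration only; NOT LinkedIn's real robots.txt.
SAMPLE_RULES = """\
User-agent: *
Disallow: /
"""

if __name__ == "__main__":
    # "my-crawler" and the example.com URL are placeholder values.
    print(is_allowed(SAMPLE_RULES, "my-crawler", "https://example.com/in/someone"))
```

To check a live site instead, `RobotFileParser` also offers `set_url()` followed by `read()`, which downloads the file before you call `can_fetch()`.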
If it is publicly available information it is legal to scrape it, regardless of what robots.txt says.
See: https://www.webspidermount.com/is-web-scraping-legal-yes/
As an attorney (and this is not legal advice), I don't think it's quite that simple. The court held that the CFAA does not proscribe scraping of pages to which the user already has access and in a way that doesn't harm the service, and thus it's not a crime. But there are other mechanisms that might impact a scraper, such as civil liability, that have not been addressed uniformly by the courts yet. And if you scrape in such a way that does harm the operator (e.g. by denying service), it might still be unlawful, even criminal.
There's a relevant footnote in the cited HiQ Labs v. LinkedIn case:
"LinkedIn’s cease-and-desist letter also asserted a state common law claim of trespass to chattels. Although we do not decide the question, it may be that web scraping exceeding the scope of the website owner’s consent gives rise to a common law tort claim for trespass to chattels, at least when it causes demonstrable harm."
They also said: "Internet companies and the public do have a substantial interest in thwarting denial-of-service attacks and blocking abusive users, identity thieves, and other ill-intentioned actors."
It's a good idea to take legal conclusions from media sites with a grain of salt. Same goes for any legal discussion on social media, including HN. If you want a thorough analysis of legal risk--either for your business or for personal matters--hire a good lawyer.
Smart
Or run your legal questions through a frontier model and then have a lawyer verify the answers. You can save a lot of money and time.
Yes, all LLM caveats apply. Do your due diligence. But they are quite good at this now.
Have you actually tried this approach? I’m curious as to the result, especially when you took it to your lawyer. Not a contract review but a business practice risk evaluation.
Some context from coverage of GPT 5:
https://legaltechnology.com/2025/08/08/openai-launches-gpt-5...
https://www.artificiallawyer.com/2025/08/08/gpt-5-tops-harve...
Remember when "asking for a friend" was a thing?
Today's expression is "I asked a friend". You can try that when talking to your lawyer about your latest ChatGPT — they might still believe you.
Hmm this is a good idea too
what nonsense. they explicitly say "do not scrape us, unless we approve". they put up paywalls and captchas. their service is literally selling access to users' data.
now you are scraping it. this is a direct violation and direct harm to their business, despite their explicit statements telling you to stop.
you lose the case, it is clear as day.
what nonsense. this is the online equivalent of "sovereign citizens". go and try it, and get yourself into jail.
Do not confuse strong language with strong argument. Yours is the former not the latter.
LinkedIn has an API. So why scrape?
because they are pulling what they are not supposed to. they are doing it illegally. that's why.
> they are doing it illegally.
ToS aren't real laws, mate.
Edit: oops, just saw a message from the creator of this thing saying he gets the data in the most illegal possible ways. They have no salvation.
It is possible to do what they propose legally, though; the "agent" is just the user's computer.
ToS are legally binding contracts. they are there for a reason.
contracts are not laws themselves. but a correctly done ToS (and I bet LinkedIn's is) holds very real legal power.
We get our data from third party data vendors who we assume have gotten explicit approval from LinkedIn!
You assume! Such due diligence!
Unfortunately not able to get into their codebase
Or yours...
What would you like to see?
Can tell you :)
you're building a tool that is designed to sink its tentacles into peoples' most personal accounts and take unsupervised automated actions with them, using a technology that has serious, well known, documented security issues. you haven't demonstrated any experience with, awareness of, or consideration for the security issues at hand, so the ideal amount of code to share would likely be all of it.
Fair enough, it makes sense not to trust us!
We like to believe we're pretty trustworthy, and do our best to make everything secure.
i actually really like your product for what it's worth. don't listen to the haters. hackers build things.
i just won't use it, and nobody should, unless they can understand exactly how it works and reason for themselves about the risks they are taking. you clearly work hard and care deeply about what you are building, and it will be very useful. but it has the potential to cause widespread harm, no matter how trustworthy you are, how much you care about it, or what your intentions are.
with respect to user security and privacy, doing your best is not much better than yolo security. the minimum standard should be to research the threat landscape, study the state of the art in methods to mitigate those threats, implement them, and test them thoroughly, yourselves and through vendors. iterate through that process continuously, alongside your development. it will never end. or, you can open source it and the internet does this for you for free. build something people love, grow traction, convert that to money. THEN figure out how to make money from them, not the other way around. or, more likely, some combination of all of the above.
someone else linked you to Simon Willison's lethal trifecta page, i would absolutely start there, and read everything linked as well. Pangea and SpecterOps both do good work in the llm pentesting space, i'm sure there are more.