A big ongoing challenge is how to automate human actions on websites, specifically those that don't offer any API for the data needed and make it as hard as possible to scrape them. Almost every data job I've had has had part-time projects like this going on internally, typically "How do we automate data extraction from [x] site?", where the websites either refuse to provide any services, or simply can't (don't have the resources). And up until now it has been some sort of RPA / Robotic Process Automation problem.
I'm not talking about nefarious motives for doing this, either. For our part it's just a tedious task where humans spend too much time filling in forms, clicking on UI components, and doing the download manually.
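To make that concrete, here's a minimal sketch of the kind of flow I mean, in Python with Playwright; the URL, selectors, and field values are all made up for illustration:

    # Hypothetical form-fill-and-download flow; the site, selectors and
    # field values are invented, not a real target.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto("https://example.gov/reports")      # hypothetical site
        page.fill("#date-from", "2024-01-01")         # the form a human fills in
        page.fill("#date-to", "2024-12-31")
        page.select_option("#region", "north")        # dropdown choice
        with page.expect_download() as dl:            # "Export" triggers a download
            page.click("text=Export CSV")
        dl.value.save_as("report.csv")                # save it where a human would
        browser.close()

Every line in there is a click or keystroke that someone currently does by hand, week after week.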
So while letting agents run wild on problems like this can (and surely will) lead to abuse, it will likely also free up a lot of time for actual work for the people currently stuck doing these tasks.
> or simply can't (don't have the resources). And up until now it has been some sort of RPA / Robotic Process Automation problem.
Wouldn't it be about half the cost to your organisation if you did it for them? If govt agency xyz doesn't have the resources to build this, offer to make it, get access to the source, plug it into a dead simple API, get your data, and everybody's happy.
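By "dead simple API" I mean something on the order of this sketch (FastAPI here, with an invented fetch_report as a stand-in for however the data actually gets pulled):

    # Minimal read-only API over the extracted data; all names are hypothetical.
    from fastapi import FastAPI

    app = FastAPI()

    def fetch_report(region: str, year: int) -> list[dict]:
        # Stand-in for the actual extraction step (a scrape, an RPA run,
        # or a direct query if the agency grants source access).
        return [{"region": region, "year": year, "value": 42.0}]

    @app.get("/reports/{region}")
    def get_report(region: str, year: int = 2024):
        return {"rows": fetch_report(region, year)}

Run it with uvicorn and consumers hit /reports/north?year=2024 instead of clicking through the site.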
I've never held a data analysis position, so I have no idea whether such an org would be open to it. If not, it sounds more like the former issue (gatekeeping and unwillingness) than the latter (inability or lack of resources).
No, because then they would realize their data is worth money.
The person above said they were already approaching these places, so by that point at the latest they'd have realised that.
The site won't give you a dead simple API.
I'm proposing to make that, not to be given it. I'm not sure we're understanding each other.