I think these things are mainly based on cookies and fingerprinting these days - the checkbox is largely there for show. Companies like Cloudflare and Google get to see a big chunk of browsing activity for the entire planet, so they can tell whether the activity coming from a given IP/browser looks "bot-like" or not.

I have never used ChatGPT, so I have no idea how its agent works, but if it is driving your browser directly then it will look like you. Even if it is coming from some random IP address on a VM in Azure or AWS, the activity probably does not look "bot-like", since it is doing agentic things and therefore acting much like a human, I expect.

Agentic user traffic generally does not drive the user's browser and does not look like normal user traffic.

In our logs we can distinguish agentic user flow, real user flow, and AI site-scraping bot flow quite clearly. The scraping-bot flow is presumably there to grow their document corpus for continued pretraining or whatever, but we absolutely see it. ByteDance is the worst offender by far.
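For what it's worth, a lot of that triage can be done straight off the User-Agent, since many of these crawlers announce themselves (Bytespider is ByteDance's token, GPTBot is OpenAI's scraper, ChatGPT-User is their user-triggered fetcher). A rough Python sketch - the tokens are real, the log handling is invented:

```python
# Crude log triage by User-Agent token. These tokens are real crawler
# identifiers, but real classification would also lean on IP ranges and
# behaviour; treat this as a sketch, not production logic.
AI_SCRAPERS = ("GPTBot", "ClaudeBot", "Bytespider", "CCBot")
AGENT_FETCHERS = ("ChatGPT-User",)  # requests made on behalf of a live user

def classify(user_agent: str) -> str:
    if any(tok in user_agent for tok in AI_SCRAPERS):
        return "ai-scraper"
    if any(tok in user_agent for tok in AGENT_FETCHERS):
        return "agentic-user"
    return "maybe-human"

print(classify("Mozilla/5.0 (compatible; Bytespider; ...)"))  # ai-scraper
```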

It might look like you initially, but some sites may block you after you've had a few agent runs. I hit something like this after a couple of local browser-use sessions. I think simple differences in interaction - natural cursor movements versus direct DOM selections - can matter quite a bit to these bot detectors, roughly like the sketch below.
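As a Python/Playwright sketch of the contrast ("#submit" is a made-up selector, and the timings are my guesses, not anything a detector has published):

```python
import random
import time

from playwright.sync_api import Page

def robotic_click(page: Page) -> None:
    # Direct DOM-driven click: no cursor travel, fires as fast as the
    # driver can issue it. Easy for a behavioural detector to spot.
    page.click("#submit")

def humanlike_click(page: Page) -> None:
    # Walk the cursor to a random point inside the element, emitting
    # intermediate mouse-move events, then hesitate briefly and click.
    box = page.locator("#submit").bounding_box()
    if box is None:  # element not visible/attached
        return
    x = box["x"] + box["width"] * random.uniform(0.3, 0.7)
    y = box["y"] + box["height"] * random.uniform(0.3, 0.7)
    page.mouse.move(x, y, steps=random.randint(15, 40))
    time.sleep(random.uniform(0.1, 0.4))
    page.mouse.click(x, y)
```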

Very likely. I suspect a key indicator for "bots" is interaction speed - e.g. if clicks and keypresses arrive "instantly" (every few milliseconds, or always exactly 10 ms apart, etc.) then that looks very unnatural.

I suspect that an LLM would be slower and more irregular, since it is processing the page and all that, versus a DOM-selector-driven bot that will just machine-gun its way through in milliseconds.
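A toy version of that heuristic (my own sketch, not anyone's actual detection logic): flag an event stream whose gaps are either implausibly fast or implausibly metronomic.

```python
from statistics import mean, stdev

def looks_scripted(ts_ms: list[float], min_gap=50.0, min_jitter=15.0) -> bool:
    """Flag click/keypress timestamps no human hand would produce."""
    gaps = [b - a for a, b in zip(ts_ms, ts_ms[1:])]
    if len(gaps) < 2:
        return False  # not enough signal to judge
    if mean(gaps) < min_gap:  # machine-gun input, faster than human reaction
        return True
    return stdev(gaps) < min_jitter  # metronome input, near-zero variance

print(looks_scripted([0, 10, 20, 30, 40]))        # True: every 10 ms exactly
print(looks_scripted([0, 340, 980, 1430, 2210]))  # False: human-ish rhythm
```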

Of course, Cloudflare's and Google's captchas can't see the clicks/keypresses within a given webpage unless their script is running on it - otherwise they only get to see the requests.
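So at that level the raw material is just request metadata: volume, timing, ordering per client. A toy per-IP rate check over a fabricated log:

```python
from collections import defaultdict

# Fabricated (client_ip, unix_time) request log; real detectors fold in
# far more signal (TLS fingerprints, headers, reputation) than this.
LOG = [
    ("203.0.113.7", 1000.00), ("203.0.113.7", 1000.05), ("203.0.113.7", 1000.10),
    ("198.51.100.9", 1000.00), ("198.51.100.9", 1004.70),
]

def req_rates(log):
    by_ip = defaultdict(list)
    for ip, ts in log:
        by_ip[ip].append(ts)
    # Requests per second over each client's observed window.
    return {ip: len(t) / (max(t) - min(t)) if max(t) > min(t) else float("inf")
            for ip, t in by_ip.items()}

for ip, rate in req_rates(LOG).items():
    print(ip, f"{rate:.1f} req/s", "-> suspicious" if rate > 5 else "-> plausible")
```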