Scrape once, spam forever.
I think it's pretty clear that commits need an anonymization scheme: the address on each commit should link back to your GitHub account while the real email address stays private.
Privacy-centric companies like Apple do this, offering users hashed emails on a per-login basis.
I'm sure this would not work in a world of scraping, but having that kind of ability to figure out bad actors would be nice. You could require authenticated users for certain kinds of requests, and block user information from non-authenticated requests.
They already do [0]. The noreply address includes a unique ID which survives account renames, and the name of the GitHub account at the time.

[0] https://docs.github.com/en/account-and-profile/reference/ema...
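For illustration, here is a minimal Python sketch of what such an address looks like. It assumes the `ID+USERNAME@users.noreply.github.com` shape, plus the ID-less `USERNAME@users.noreply.github.com` form that older accounts may still have. The function names are hypothetical and the regex is deliberately loose, so treat it as a sketch rather than GitHub's own validation.

```python
import re

NOREPLY_DOMAIN = "users.noreply.github.com"

# Optional numeric ID and '+', then the username, then the noreply domain.
_NOREPLY_RE = re.compile(
    r"^(?:(?P<id>\d+)\+)?(?P<name>[^@+\s]+)@users\.noreply\.github\.com$"
)


def noreply_address(user_id: int, username: str) -> str:
    """Build an ID-based noreply address (hypothetical helper)."""
    return f"{user_id}+{username}@{NOREPLY_DOMAIN}"


def parse_noreply(address: str):
    """Return (user_id or None, username), or None if not a noreply address."""
    m = _NOREPLY_RE.match(address)
    if m is None:
        return None
    user_id = int(m.group("id")) if m.group("id") else None
    return user_id, m.group("name")
```

The numeric ID is the part that stays stable across renames; the username part only records what the account was called when the commit was made.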
How does the spammer get through this then?
If used/implemented correctly, they don't.
* This is an optional feature via git config, with a further GitHub setting to reject commits pushed using your personal email address.
* If the GitHub setting is disabled, some GitHub-generated commits/fixups use your personal email (e.g. squash merge in the GitHub UI).
* I use my personal email in file-level copyright headers, even if the commits use the GitHub noreply email.
* I have my personal email on my GitHub profile, visible to logged-in users.
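A rough way to check the first two points on your own repository is to list the author emails in its history and flag anything that is not a noreply address. The Python sketch below assumes `git` is on your PATH; `exposed_emails` and `author_emails_in_repo` are hypothetical names. It only looks at commit author fields, so it will not catch emails in copyright headers or on your profile.

```python
import subprocess

NOREPLY_SUFFIX = "@users.noreply.github.com"


def exposed_emails(author_emails):
    """Return the sorted, de-duplicated addresses that are not GitHub noreply ones."""
    found = set()
    for raw in author_emails:
        email = raw.strip().lower()
        if email and not email.endswith(NOREPLY_SUFFIX):
            found.add(email)
    return sorted(found)


def author_emails_in_repo(path="."):
    """Collect the author email (%ae) of every commit reachable from any ref."""
    result = subprocess.run(
        ["git", "-C", path, "log", "--all", "--format=%ae"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    for email in exposed_emails(author_emails_in_repo(".")):
        print(email)
```

Anything this prints is already in the public history of a pushed repository, so a scraper only needs to clone it once.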
They don't. It's an optional process, and many users don't change their git config to use the provided email.