I want to stick my neck out and say that this has the potential to be very bad for science.

Imagine saying "no" to a researcher with a big social media profile. Imagine 4chan coming at you with style-detection and deanonymization tools simply because their favorite racist or anti-vaxxer got their nonsense rejected and sent their followers after you. And it is not just me feeling this way - quoting myself from a previous comment: according to the ACL's 2019 survey [1], "female respondents were less likely to support public review than male respondents" and "support for public review inversely correlated with reviewing experience".

A measure that women ~~and inexperienced researchers~~[2] do not support is a measure that favors only those who are already part of the club.

[1] Original here (currently offline): http://acl2019pcblog.fileli.unipi.it/wp-content/uploads/2019..., summary here: https://www.aclweb.org/adminwiki/images/f/f5/ACL_Reviewing_S...

[2] This part has been correctly pointed out as being wrong.

> Imagine saying "no" to a researcher with a big social media profile

"The identity of the reviewers will remain anonymous, unless they choose otherwise — as happens now."

(Also "support for public review [being] inversely correlated with reviewing experience" means inexperienced reviewers are more likely to support it. Not less.)

You are correct about the second point - I'll strike it through once I find out how.

As for the anonymous part, that's why I wrote "with style-detection and deanonymization tools". If the Internet could find Shia LaBeouf's flag in a day [1], could they find a reviewer based on their writing?

[1] https://www.dailydot.com/unclick/4chan-shia-labeouf-secret-l...

The difference is that as a scientific reviewer you are not hiding a physical location, and what you need is plausible deniability, which would still exist. In addition, actively attempting to deanonymise your reviewers is a level of scientific misconduct that your employer and professional organisation should consider taking disciplinary action over. I am not arguing that this makes it entirely safe to publish anonymised reviews, or that it will not affect reviewer behaviour (maybe for the better in some cases, as "one-sentence reviews" will be something in the public record), but it is in stark contrast to the example that you bring up.

Plausible deniability will satisfy who exactly? The Sherlock Holmes of Reddit did a great job with the Boston marathon bombing.

You don’t have to actively deanonymize your reviewers; you just have to prime your audience ahead of time that your ~~election~~ publication was stolen and they’ll do the rest.

This is giving tinfoil hat vibes - I would guess style detection has such a high false positive rate as to be near useless. Also there's nothing stopping people from publishing reviewer comments today and letting the Internet run wild with "style detection" (or doing it themselves).

This is a concern I have as an "anonymous" HN account - I put it in quotes despite never revealing my name or strong PII. But the language I use as a reviewer is pretty different, even from my language as a writer. I suspect this will be harder due to small sample sizes, but then again, high noise could help your point. Mobs need targets more than accuracy.

Though, we do eventually need to have a conversation about the deanonymization of online accounts. That's not something we want to be so easy to do.
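
For concreteness, here is a minimal sketch of what a stylometric attribution attempt typically looks like (character n-gram TF-IDF plus a linear classifier, via scikit-learn). The corpus and author names are made up; with only a handful of writing samples per candidate, the probabilities it spits out are noisy, which is exactly the false-positive worry above.

```python
# Toy stylometric attribution: character n-grams + logistic regression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical corpus: known public writing per candidate reviewer.
samples = [
    "In this paper we propose a novel approach to parsing ...",
    "Honestly, the evaluation section here leaves a lot to be desired ...",
]
authors = ["reviewer_A", "reviewer_B"]

model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(samples, authors)

# Score an anonymous review against the candidates. With samples this
# small the output is close to meaningless - a mob does not care.
print(model.predict_proba(["The contribution is marginal at best ..."]))
```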

Translate into $RANDOM_LANG and then back to English. The perfect prose obfuscator.
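
Something like this, as a sketch; `translate` below is a stand-in, not a real library call, so wire it to whichever machine-translation backend you trust:

```python
# Round-trip ("back") translation as a prose obfuscator.

def translate(text: str, source: str, target: str) -> str:
    """Placeholder - plug in an actual MT service here."""
    raise NotImplementedError("wire up an MT backend")

def obfuscate(text: str, pivot: str = "de") -> str:
    """English -> pivot language -> English; each hop sheds stylistic tells."""
    return translate(translate(text, "en", pivot), pivot, "en")
```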

Or get chatGPT to reword it in a different style of writing
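
For instance, with the OpenAI Python client (the model name and prompt here are placeholders, and it assumes OPENAI_API_KEY is set in the environment):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def reword(review_text: str) -> str:
    """Ask a chat model to restyle the text while keeping its meaning."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder: any capable chat model works
        messages=[
            {"role": "system",
             "content": "Rewrite the user's text in a plain, generic style. "
                        "Preserve the meaning exactly."},
            {"role": "user", "content": review_text},
        ],
    )
    return resp.choices[0].message.content
```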

Have there been recent developments in the style-detection and deanonymization tools you mentioned? I would assume many would not work well given the high usage of LLMs nowadays.

What is the alternative?

Or are you just in favor of creating classes of people who can't be critiqued under any circumstances?

This kind of sounds like, 'Won't anyone think of the grifters?!'

I know every area is different, but the "grifters" in the area of Computational Linguistics (the ACL) are "any volunteer[1] whose paper has been accepted at least once", meaning anyone from PhD students to professors and industry researchers.

Not all academia is Elsevier.

[1] This policy has been altered recently, though, and now submitting a paper comes with reviewing duties.

We are struggling badly with review quality in natural language processing, though, most likely due to the unprecedented expansion of the field over the last ten or so years. Reviewers suffer under review loads far exceeding what one can reasonably manage mentally (it used to be two to three papers per reviewer; now five would be considered rather generous). Authors and area chairs suffer from worse reviews due to reviewer inexperience and overload, not to mention deteriorating reviewer/author correspondence, with author and area chair comments frequently ignored by the reviewers. To me, the last holdout of good peer review in the field is Transactions of the Association for Computational Linguistics (TACL), but there the acceptance bar is sky high compared to ACL Rolling Review (ARR), for better or worse.

The ACL leadership and senior members of the field are very much aware of this and are trying their best (ARR being an attempt to improve the situation, though I am unsure how much better it really is than the old system of conference reviewing now that we are a few years in). But there appear to be no easy fixes for a complicated, distributed system such as peer review. Every discussion I have with said leadership and other senior members ends with us agreeing on the problems, and likewise agreeing that, despite considerable mental effort, we are failing to come up with solutions.

Returning to the main topic: Nature is worthy of praise for making their peer review transparent, and I say this as a massive Nature critic. It is a move I loved seeing from NeurIPS (then NIPS) and ICLR over a decade ago, as it helps younger researchers see what good (and bad) communication looks like, and shows that even papers they now know are greatly appreciated received a fair amount of criticism (sometimes unwarranted). I have argued for ACL to introduce the same thing for nearly a decade at this point, but we still do not have it, and I have never heard a solid argument as to why not (the best argument was the technological effort, but OpenReview, with all its flaws, makes this even easier than Softconf did; not that it would have been that hard with Softconf either).