Points 2 and 3 are irrelevant.

Point 1 is saying results may not generalise, which is not a counter claim. It’s just saying “we cannot speak for everyone”.

Point 4 is saying there may be other techniques that work better, which again is not a counter claim. It’s just saying “you may find better methods.”

Those are standard scientific statements giving scope to the research. They in no way contradict the study’s findings. To contradict those findings, you would need similarly rigorous work that covers those scenarios.

Not pushing an opinion here, but if we’re talking about research then we should be rigorous and rational by posting counter evidence. Anyone who has done serious research in software engineering knows the difficulties involved and that this study represents one set of data. But it is at least a rigorous set and not anecdata or marketing.

I for one would love a rigorous study that showed a reliable methodology for achieving generalised productivity gains with the same or better code quality.