A very prolific coworker who fully embraced Claude has inflicted a flood of AI-generated PRs on the team. About six months later, he frequently bemoans at standup that his PRs don't get reviewed and languish in inattention. I don't think anyone - including myself - _intentionally_ avoids his PRs. It's just that he doesn't make it easy for the team to look at.
This single headline perfectly captures what I have been thinking. It's not that I reject AI content, but it takes _effort_ to review and weed out any mistakes. When your thoughtful reviews that take an hour (because the PR is typically large, and you want to be _right_ when you're pointing out a hallucination) get an AI-generated response with AI-generated amendments, it doesn't feel _nice_. I feel dismissed and it has continuously trained me to subconsciously avoid his PRs. After all, the team is fully onboarded with AI, so it's not like there is a lack of PRs to review.
It looks like the sentiment isn't isolated to me.
As someone who pushed ~4x the median PRs on my team before LLMs were a thing, I kind of think the problem here is PRs as a concept. Code review doesn't scale to prolific humans, it definitely can't scale to agents.
And the exact same things you would need to safely give up on PRs for human developers (auto-formatters, linters, comprehensive end-to-end tests, continuous deployment pipelines, etc), are also things that place meaningful guardrails on LLMs, and help them maintain a reasonable quality bar.
> Code review doesn't scale to prolific humans
If that's genuinely your attitude then your org has a problem.
Code review is slow and less fun, for the average sw eng. But for high quality work it's indispensable. So treat code reviews as a scarce resource. Optimize for code reviewer time and attention. Have your PRs the right size? Are they well described? Do you give context? Do they fit in the bigger story? Do you mix in unrelated drive-by fixes? How easy is it to deal with you once you have received comments? Do you address them promptly? Do you give your reviewers credit (if not praise) for their help? Do you give back by doing code reviews yourself with high quality feedback? There are a lot of things you can do to streamline things and give code reviews the place in a team's workflow that they deserve.
It's clear they consider code review a personal activity rather than a team activity, in the sense that they think "code review is a gate before my code can be merged" rather than "code review is a process where the team discusses, understands and improves the code".
And that's not rare in teams. Lots of teams and developers do code review wrong.
I even hear other people complain that I "block" their code review. I mean, if there are issues in your code, of course I am going to flag them, what do you think the purpose of code review is?
> Lots of teams and developers do code review wrong
In this sense, I'm not sure I've ever seen a team that does code review "right".
In the before times, most PR feedback was stylistic, with the occasional bug identified. Now that we have ubiquitous auto-formatters/linters/CI, most PR review falls into either "you misunderstood the spec", or "I disagree with your architectural choices" - and my personal feeling is that your process ought to catch both of those well before the PR stage.
> most PR feedback was stylistic, with the occasional bug identified.
I think that only speaks for your own experience. I have definitely seen more than a few PRs that needed significant work.
Yeah, that's fair. I have spent most of my career on high-pressure teams within FAANG, where we aggressively managed-out anyone who wasn't making the grade. And now in the startup world, we apply a very aggressive hiring bar.
I'm not sure how much I'd enjoy working on teams who were routinely producing PRs that were in bad shape.
How many teams did you see?
On your original claim, I have seen engineers put up 5x more PRs simply because they paid less attention to the quality or put less thought into each one of them.
I have seen people put up 5x more quality PRs too. But as long as they follow the good practice of doing a code review for every PR they put up (or 2 if you require 2 per PR), they got their stuff through quickly as well.
> your process ought to catch both of those well before the PR stage
We have multiple points where mistakes of any sort can be caught, and code review is one of them.
Yes, most architectural issues should be caught earlier, but some will only become evident in code: some by the dev themselves, others by reviewers.
This is only a problem if you mostly catch architecture issues at code review phase.
Not my experience. Especially for juniors, reviews were an excellent tool to learn and get mentored.
> But for high quality work [a code review is] indispensable.
The argument here is that all code reviews are done with attention and care, but the quality of a code review is highly dependent on the reviewer and the team’s review process, and in the real world the quality of reviews pretty much follows the same distribution curve as, say, agile project management: for the time invested in reviewing, a handful of teams get excellent utility from them, most teams get little benefit, and a sad few actually cause harm.
If most code reviews provide only a little benefit at base for most teams, recommending that most teams should also delay shipping quality work is going to sound a lot like bad advice.
> Have your PRs the right size?
I’ve noticed that large PRs aren’t just a problem for human reviewers: they’re a problem for AI reviewers too.
If I submit a 100 line PR I’m likely to get some useful comments back from both humans and LLMs. In fact the LLM is likely to come back with so much feedback it gets down to the nitpicky/annoying level.
If I submit 1000+ lines in my PR, the humans either don’t have time and/or get scrolling blindness, and the AI reviewer is likely to give me a response that amounts to, “<<slaps roof>> Looks good to me bro: ship it!”
I guess they have a limited token budget for reviews so you can bamboozle them simply by blowing most or all of that budget.
The flip side of this tends to be that if 1,000 lines of code need to happen, filling the review queue up with 10x PRs each of 100 lines isn't exactly great either. The author spends a bunch of extra effort producing a raft of atomic PRs, and the reviewers get to context-switch a whole bunch (and may not end up with a clear picture of the feature end-to-end).
I think the ultimate answer to this is a stacked PR workflow (which we had at Meta), where I can cheaply maintain/review a 1,000 line PR as a stack of 10 incremental PRs. But unfortunately GitHub et al are still not quite there on this one.
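To make the stacked idea concrete, here is a minimal sketch of how a 1,000-line change could be laid out as a chain of 100-line increments, where each PR targets the branch of the one before it. The `StackedPR` type, the `plan_stack` helper, and the branch names are all hypothetical and only illustrate the shape of a stack; this is not how Meta's tooling or any particular forge implements it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StackedPR:
    branch: str  # hypothetical branch name for this increment
    base: str    # branch this PR targets: trunk for the first, the previous PR after that
    lines: int   # size of this increment only, not the cumulative diff


def plan_stack(total_lines: int, max_lines: int,
               trunk: str = "main", prefix: str = "feature/part") -> List[StackedPR]:
    """Split a change into increments of at most max_lines, chained base-to-branch."""
    if max_lines <= 0:
        raise ValueError("max_lines must be positive")
    stack: List[StackedPR] = []
    base = trunk
    remaining = total_lines
    index = 1
    while remaining > 0:
        size = min(max_lines, remaining)
        branch = f"{prefix}-{index}"
        stack.append(StackedPR(branch=branch, base=base, lines=size))
        base = branch  # the next increment is reviewed against this one
        remaining -= size
        index += 1
    return stack


stack = plan_stack(total_lines=1000, max_lines=100)
for pr in stack:
    print(f"{pr.branch} -> {pr.base} ({pr.lines} lines)")
```

Each reviewer sees only one increment's diff, while the chain of bases preserves the end-to-end order of the feature.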
I agree about how you can reciprocate for a good code review, but I'd just add that for me, code review is also fun — when done for a fellow human who I might be teaching.
It is definitely very grunt-like for an LLM.
Most orgs have a problem with quality unless it is enforced by government requirements for certifications and such.
Code reviews, documentation, static analysis, only retrieving deps from internal repos, unit tests, integration tests, ....
Especially in domains where shipping software is not the main product, and a plain cost center to the main business of physical goods.
Gently, as long as you work with humans, you should consider yourself working _for_ those humans. Everyone needs shared state to work from, and that's just the cost of doing business.
That said, sometimes low-trust environments are the issue, not PRs. In a higher trust environment, PR review is a helpful thing you usually desire, not dread.
> In a higher trust environment, PR review is a helpful thing you usually desire, not dread
Respectfully, in a high-trust environment, feedback should be delivered well before the PR stage. If you've let someone write a whole bunch of code without having a shared understanding of how the solution should work, you may have earlier process issues that PRs are papering over
You cannot deliver feedback on something that doesn't exist. If you mean a review in the style of "all of this is wrong and needs to be rewritten differently" then yes, that's something to be discussed beforehand. But I don't imagine this is what people think of when discussing a review.
Depends on how PRs function within teams. For some, the PR is a lightweight thing that is the preferred method of communication. It sounds like you are imagining a case where face to face communication, or communication over chat, is preferred for early stages, with the PR being a nearly final artifact. But it doesn't have to work like that.
I think that's a valuable point. Especially as LLMs bring the cost of prototyping down (and reduce emotional investment in code written), it may be more viable to use PRs as proposals/sketches of a solution.
With human reviewers, I find that by the time someone has churned out enough of a solution to post a PR, they are already quite invested in specifics of the solution, and it makes it emotionally costly (to both author and reviewer) when someone says "hey, I'm not a fan of this whole approach, let's start over and do it this other way".
I have seen many a PR where it is obvious it is an exploratory work: e.g. figuring out how to use an external dependency that is imperfectly or incorrectly documented, etc. (You can claim this should be done ahead of time, but experience tells me you need to code it to learn it)
The emotional toll there is real, but this is exactly the moment when you expose the knowledge of that external dependency to the unbiased party that is the reviewer.
I like combining approvals to satisfy the urge for completion and closure, with a request for fast-follow refactor to better match the newly discovered model of interaction. (The worst code review experience I have seen is when a reviewer accepts it as-is and does a fast follow refactor themselves, depriving the author of the opportunity to learn and remain an expert in that area)
A discussion ahead of the implementation can also bias the two parties to that discussion and have them overlook the same implementation issue: many things you only understand once you start implementing.
If you have these parties review each other's code, I agree that rarely brings much value.
I think the best way to understand our experience with reviews is to stop and say: in a few sentences, what do you expect out of a quality code review? (sounds like nothing in your case, but I am curious)
Agreed. But those things are not mutually exclusive.
Agree. All the subtleties of how a high trust environment works are hard to enumerate.
Depends on the change. Certainly most PRs don't need feedback before the PR is ready - the task is too obvious, and there's little to feed back on before there's any code.
For bigger changes, of course you need feedback on designs. But that could easily be in the form of draft PRs.
I definitely would push back on anything that required feedback before PRs. That's way too much process. Just going to slow you down for no benefit.
> Code review doesn't scale to prolific humans
I've worked with people who consider themselves 'prolific humans'. Someone always has to tidy upp later, and its never them
> I've worked with people who consider themselves 'prolific humans'. Someone always has to tidy upp later, and its never them
I run both infrastructure and security - that means a lot of relatively self-contained PRs to infrastructure-as-code and dependency management systems. I'm also the team lead, which makes me responsible for a lot of throwaway prototyping, as well as cleaning up anyone else's mess...
Yes, the prolific-but-damaging engineers are all too common in corporate. But particularly in startup land, you tend to find your high-performers wearing a lot of hats at once.
My experience is that it's even worse: they've already produced enough code that the codebase matches their taste and theirs alone.
So in essence you have one guy working at 4x and e.g. four other getting just 0.7x - net effect is still positive, but everyone save for that one person is miserable.
Mind you, the 4x dev doesn't necessarily have to be particularly talented - they only need to get their foot in the door before anyone else.
Back during the ZIRP days you could immediately tell that this is the case in a team by staff rotation alone. Nowadays people understandably cling to their jobs, so you might not know until it's too late.
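For what it's worth, the "net effect is still positive" arithmetic above checks out if you take the numbers at face value. A tiny sketch, assuming a five-person team with a baseline of 1x each (the baseline is my assumption; the 4x and 0.7x figures are from the comment):

```python
# Baseline assumption: five developers each producing 1x.
baseline = 5 * 1.0

# Scenario from the comment: one developer at 4x, four others slowed to 0.7x.
skewed = 1 * 4.0 + 4 * 0.7

net_gain = skewed - baseline
print(f"baseline={baseline:.1f} skewed={skewed:.1f} net_gain={net_gain:+.1f}")
```

So total output goes up even though four of the five people are individually worse off, which is exactly why the problem can stay invisible from above.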
> … and its never them
IME, it’s because they lack the experience to have the Taste one develops as a senior engineer. “This works, and is somewhat understandable” is as far as they get. Little to no understanding of how this solution could fit better in the codebase.
There's also those that burn themselves out, and John Carmack!
That's such bullshit.
I've managed some incredibly prolific developers and some very slow ones, and the prolific ones are pretty much always the ones more available, more willing to fix things, more willing to take feedback.
And also: they make fewer mistakes because their skills are sharp. This anecdote comes to mind: https://austinkleon.com/2020/12/10/quantity-leads-to-quality...
If you have to constantly rationalize performance differences by demeaning others, this says more about you than the prolific people.
I've worked with both types. Some prolific devs really do care, and are just really good at their job.
Others are just trying to get code done, and don't care about quality. These are the types that are upset that their code gets rejected because their goal is advancement and money, and not doing a good job.
FWIW, it's okay to care about both. But if you don't care about doing a good job, you're going to drive everyone around you insane.
Prolific bad coders are a bane on the company, and AI is only going to make them worse.
Sure but if PRs get rejected, nobody has to "tidy upp (sic) later".
That's not prolific, that's just producing slop, with AI otherwise.
I'm just tired of developers pretending that low output is some sort of silver bullet for quality, and high-output is automatic slop. Neither is true. In 99% of cases, low output doesn't correlate with anything positive. High-output can naturally go either way, but slop doesn't make one "prolific".
I have been championing this mindset since well before LLMs. It is an admittedly controversial opinion, but one I hold strongly.
Code reviews are a productivity tax. No truly effective team would rely on them. The fact that so many software teams view them as indispensable just shows how few effective software teams there are in our industry.
They are akin to a quality check step in manufacturing. Part of what Deming did in revolutionizing manufacturing was eliminating the step in favor of a holistic quality metric owned by all participants and enforced with rigorous statistical process controls. As you say, we in the software industry have all the pieces (autoformatters, tests, benchmarks, etc) to operate this way, but it seems our organizational and management dynamics combat this shift at every turn.
Relevant: When this conversation comes up at work, I like to share Avery Pennarun's post about the review tax: https://apenwarr.ca/log/20260316
> owned by all participants
How does this work in practice? In my experience, any metrics owned by a group inevitably languish and are largely ignored.
Anything you want to improve needs a DRI.
Well, it's either:
1. Your skills are >2 standard deviations above everyone else's.
2. You're fast at producing a lot of half-baked garbage, and your coworkers are too shy to confront you, so they just try to ignore it.
(one of these scenarios is much more likely)
As someone who often submits significantly more PRs (without using AI) than teammates, it's not exactly a skill delta. Yes that helps but it's often only a piece of the puzzle. The other ingredients include motivations and culture. In such cases, something else is the driving force, such as posturing for promotion, stability, etc. My current team is massively low performing. Management pays some lip service to all the problems, but also runs things in a way that discourages high performance. It's not a good fit for me, as I want to tackle challenges head on, improve the environment, be productive, embrace change. I'm also very comfortable with the code base as well as the code review process, but I'm surrounded by "seniors" who do not know how to code review, and who are happy to drag their feet and spin their wheels for months before pushing out small PRs that hurt my brain. How can that little work be shown after months, barely functional at best?
We had better management for a few months, and many on the team were actually quickly closing the skill gap with me, but we had another shuffle and things are stupid once more.
So I'd offer that's option 3. (There's always a third option to any suggested either-or fallacy.)
It could also potentially be that GP is making atomic PRs, while everyone else is just making 5000-line PRs with multiple responsibilities that just get merged with "LGTM".
But of course HN has to go with the most uncharitable interpretation.
Are PRs honestly helping with either case? Either you severely rate-limit your high-performers, or you drown everyone else in review, and both outcomes are bad for the overall team
The latter has an easy fix: the perpetrator is not allowed to take new work while there are pending review comments left unaddressed.
By perpetrator you mean the person postponing performing a code review?
Right? Right?!
Otherwise you place all burden on high performers to not only push PRs but babysit the rest of the team.
It's not an easy fix, especially with AI letting people cosplay as high performers.
> you place all burden on high performers
If their PRs don't get merged they don't perform. It is trivial to overload your coworkers with secondary tasks due to your "high performance".
> If their PRs don't get merged they don't perform. It is trivial to overload your coworkers with secondary tasks due to your "high performance".
We're all aware that a huge portion of the busywork that makes a team successful is not actually reflected in their upwards-facing deliverables (increasing test coverage, improving infra, adopting new tools/methodologies, preemptive security patching, etc). Your actual high performers, if you have any, are doing all that stuff in addition to their regularly-scheduled duties.
If management weren't at least tacitly on board with this arrangement, your high performers would go work somewhere else. So my experience is that good managers don't tend to see this your way.
Yeah I agree. I was trying to make the point that it is quite easy to make yourself blocked by others and it is a deep skill to get other stuff done while blocked anyway, like say cleanups and tests etc.
To make myself clear:
Reviewers have comments which were not addressed by the PR author - author not allowed to do other work.
No such comments, especially no reviews - author can do other work.
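Spelled out as a rule, the two cases above could look something like this. It is only a sketch: the `PullRequest` type and its `unaddressed_review_comments` field are hypothetical, and a real team would have to decide how "addressed" is tracked.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PullRequest:
    author: str
    # Hypothetical field: reviewer comments the author has not yet responded to.
    unaddressed_review_comments: int


def may_take_new_work(author: str, open_prs: List[PullRequest]) -> bool:
    """The author is blocked only while reviewers are waiting on the author.

    A PR with no reviews at all has zero unaddressed comments, so an author
    who is waiting on reviewers stays free to pick up other work.
    """
    return all(
        pr.unaddressed_review_comments == 0
        for pr in open_prs
        if pr.author == author
    )


prs = [
    PullRequest(author="alice", unaddressed_review_comments=3),  # reviewers are waiting on alice
    PullRequest(author="bob", unaddressed_review_comments=0),    # bob is waiting on reviewers
]
print(may_take_new_work("alice", prs))
print(may_take_new_work("bob", prs))
```

The point of the asymmetry is that the rule never punishes an author for a slow reviewer, only for ignoring feedback that has already been given.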
As a prolific PR author, I've found how I communicate has a major factor on how well and quickly people respond to PRs. I've recorded my lessons at https://epage.github.io/dev/pr-style/.
I have always thought Kent Beck understood this best: the way to scale code review as release timeframes shrink is to pair program. That brings down the number of people reviewing the code, but it also increases the reviewer's understanding. Comprehensive end-to-end tests are more a replacement for manual quality assurance for regressions.
I am not sure there is a good analogue for reviews in the AI world. The human operating the AI should obviously review everything produced, but that is clearly not as good as a second pair of human MK1 eyeballs from pair programming.
No need to pair program, you can always send a message to your colleague about the design of the upcoming code, especially if it’s going to impact them or if it’s an area that they’re more familiar with. Waiting till a PR for feedback is wrong IMO.
Code review is not for feedback; it’s for ensuring quality (many eyes on the output) and for having a shared involvement in the evolution of the code. The time for feedback is earlier, once you have an idea of the solution.
Comprehensive end-to-end tests and CI can only attest to correctness, most engineers worth their salt won't review code only in regards to that aspect though.
In the bad old days before auto-formatters and linters, PRs were heavily used to enforce style guidelines. If we can enforce both style and correctness in our CI pipeline, what is actually left?
The functionally correct code could be rejected in PR for many reasons other than style:
1. Solution under-engineered/over-engineered. 2. Code is hard to read or comprehend. 3. Design/Architecture lacking. 4. Principles decided upon by team not adhered to.
These are just some of the reasons I've rejected functionally correct code before.
To summarize, in any software engineering course you learn that there are metrics other than correctness used to evaluate code (maintainability, readability, scalability, portability, efficiency etc.)
As said already: readability and maintainability of the code (closely related) are two most important values a code review can get you.
If the correctness check was vibecoded there's a good chance it was cheated. So maybe that, on top of the, you know, code review (see the sibling comment).
While PRs may have been used to correct style, that shouldn't have been their only or even main purpose. That's on whoever was using it that way, not on the concept of reviews.
Code architecture and technical design. You can have a solution that works fine, but is too complex or will impede future changes. Maybe your code solves a problem that has already been solved, or your variable names are too generic. Maybe your modules are messy and your data structures are not modeled well.
vibe check
> Code review doesn't scale to prolific humans, it definitely can't scale to agents.
Then don't review the code. Ask agents to review and merge it, and shift the responsibilities to the AI agents as well.
If you think human is a bottleneck, then either optimize for humans, or remove humans. What's the problem?
> If you think human is a bottleneck, then either optimize for humans, or remove humans. What's the problem?
Sadly, in my case, it is the auditor. Our SOC2 documents have this lovely "every change has been reviewed by at least one other human", and it's going to be a fun battle to get that reworded
I think the "and merge it" is the problem in the above comment.
If a coworker is creating a ton of AI-made PRs, I think the first step should always be to run an AI against them with the "assume this is low quality code and find all problems, big and small" text that was suggested in a comment here, and let that be the first line of defense.
To keep the dev on their toes, each dev should come up with their own prompt for AI PR review and they can switch off who reviews it each time, until there are no problems remaining.
Then a human can start to review it.
It will quickly show the low quality code being produced and the massive waste of time it is for everyone, not to mention all the money spent on tokens for the whole process.
Or it'll work, and everyone will have their way, and only have to review code that's pretty decent.
You have some assumptions here
> first step should always be to run an AI against them
What if they write an agent which takes the feedback and resolves it with a new commit? That again doesn't do anything other than offload more onto the humans who are reviewing.
> each dev should come up with their own prompt for AI PR review and they can switch off who reviews it each time, until there are no problems remaining.
This assumes AI reviews are correct most of the time. If so, why do we even need humans? Why not have a repository-level code reviewer which is run immediately after code has been created?
Regardless of where you move it, there is still a bottleneck: humans.
If you don't remove them, you will just pass the ball between agents, and at the end of the day a human still needs to review it.
> Sadly, in my case, it is the auditor.
Change your auditor and compliance. SOC2 was created for trust between organizations employing humans. If you think agents can own things, lead the way and introduce a new compliance standard; if companies sign up for it, you will be the first to remove the human bottleneck.
>As someone who pushed ~4x the median PRs on my team before LLMs were a thing, I kind of think the problem here is PRs as a concept. Code review doesn't scale to prolific humans
Prolific humans should scale to the review/test/QA/staging backpressure - not just push to have whatever they produce accepted.
Prolific is not a badge of honor, and "lines of code" is not a quality metric.
Either you were a head above the rest of the team and had the intellect to produce high-quality, value-adding work, or you were the "move fast break things" type of guy producing a lot of extra liability and work for others.
Even before AI, I've worked with people who would produce a huge wall of code and ask for review, and sometimes that code was completely off base or needed a significant rework.
I would always feel bad in those cases, because it's clear they spent a lot of time, and I'm going to have to say "no" and they will feel like they wasted a ton of effort.
The thought process around this has started shifting for me in the last few weeks. I'm a lot more comfortable saying "no" with a list of concerns when I suspect the code is AI-generated, and I see others doing the same. CLs that would be sitting around for days because no one wants to be the first to say, "this is bad, don't do this" now get quicker feedback.
The good thing is this feedback doesn't feel like as big a deal as it used to because people are less personally attached to code they generated in 30 minutes vs. code they hand crafted over a week. I had at least 2 LLM-generated PRs that were complete, correct, tested, and pre-reviewed by me, but I got feedback that they were going in the wrong direction. This would have been 8 hours of wasted effort a year ago, but now it's just an extra 30 minutes to rework the direction with LLM assistance.
> I would always feel bad in those cases, because it's clear they spent a lot of time, and I'm going to have to say "no" and they will feel like they wasted a ton of effort.
I get this feeling, too. I do however think the onus is on the developer to make something reviewable by their team members if they want a speedy review. Stacked PRs, scoping things down, properly structuring commits so you can review commit-by-commit for example.
I also think that "I spent a bunch of time on this" is not a valid reason for expecting an approval. It should hurt if you've produced a bunch of code that is way off target, even if it ends up implementing the feature. That's how I learned at least.
A proper way to go about large projects, in my opinion, is the same as with software development at large. Fail fast if possible. Draw up a crude boxes and arrows sketch or just discuss how you want the code to integrate with whatever already exists and invite the team to comment. If no one has anything to say, well then they can't complain later when you implement that approach. But if anyone cares then most likely valuable input will come that makes the end result better.
When I felt like that, I'd often ask questions about it, like "How does it deal with [situation]?" When it's obvious that it doesn't deal with the situation, they either answer "it doesn't" and then I point them to the ticket they didn't read well enough that points that out, or we have a conversation about thinking beyond the ticket, or they actually realize themselves that they didn't do it right and go back to it. I don't actually have to say "you did a bad job" and they don't have to hear it from anyone but themselves.
If they continue to do that, then someone has to tell them they're doing a bad job.
And some of them never did improve, and got fired for it.
I think slowly opening their eyes to the actual scope of the ticket is a lot easier on them than saying "no".
If they put effort into the code, they will put effort into guiding the reviewer through it.
Like: here is the ticket, this was the goal. I set out by beginning here, but encountered problems x, y, z, so I then refactored to accomplish it. Finally..
You just don't drop a blob from orbit.
Ironically, AI could generate that quite well from existing documentation (ticket, tasks and prompts) + https://marketplace.visualstudio.com/items?itemName=vsls-con....
It’s good that clankers are not afraid of throwing away code. The biggest problem with code generation (that is version controlled) is maintenance. It’s better to throw away questionable code rather than say eh, we don’t quite understand this part (and our agents can’t make a compelling story about it) but we spent a lot of effort on it and it apparently works so we better keep it.
.. only if you know what the code is doing, though. Often the requirements get scattered and lost to the winds and the code is the only record of its own idiosyncratic behavior. And yes, someone's depending on the bugs in it.
Fight fire with fire: point copilot/claude/codex to review their PRs. Prompt "Review the PR#XYZ which is vibe coded and presumably low-quality. Find all problems, big and small. Team guidelines at docs/conventions/styleguide.md, docs/conventions/architecture.md, docs/conventions/principles.md. Post inline comments to github".
Run several rounds of such reviews until the clanker fails to find problems.
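A rough sketch of that loop, for anyone who wants to script it. The prompt text and guideline paths are the ones from the comment above; everything else is an assumption. In particular `review` and `address` are hypothetical callables you would have to wire up to whatever tool you use (no real copilot/claude/codex API is shown here), and the `max_rounds` cap is my addition so the loop cannot run forever.

```python
from typing import Callable, List, Tuple

GUIDELINES = [
    "docs/conventions/styleguide.md",
    "docs/conventions/architecture.md",
    "docs/conventions/principles.md",
]


def build_prompt(pr_ref: str) -> str:
    """Assemble the review prompt quoted above for a given PR reference."""
    return (
        f"Review the {pr_ref} which is vibe coded and presumably low-quality. "
        "Find all problems, big and small. "
        f"Team guidelines at {', '.join(GUIDELINES)}. "
        "Post inline comments to github"
    )


def review_until_clean(
    pr_ref: str,
    review: Callable[[str], List[str]],    # hypothetical: runs the AI review, returns findings
    address: Callable[[List[str]], None],  # hypothetical: hands findings back to the PR author
    max_rounds: int = 5,                   # assumption: stop eventually even if never clean
) -> Tuple[bool, int]:
    """Repeat the review until it finds nothing; return (clean, rounds_run)."""
    prompt = build_prompt(pr_ref)
    for round_no in range(1, max_rounds + 1):
        findings = review(prompt)
        if not findings:
            return True, round_no
        address(findings)
    return False, max_rounds


# Stub reviewer for demonstration only: finds something twice, then nothing.
canned = [["unused helper"], ["missing test"], []]
handed_back: List[str] = []
result = review_until_clean(
    "PR#XYZ",
    review=lambda prompt: canned.pop(0),
    address=handed_back.extend,
)
print(result, handed_back)
```

Only once this returns clean would a human start looking at the PR.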
And what do you do if that works?
Because the problems AI causes are fundamentally problems of good design. It has the same problems of large teams, but less politics. Do your design well ahead of time, and AI review, or a large team, will amplify what you can do. Potentially by a lot.
Do it badly (or like most companies: do it with bad knowledge of the problem or just don't do it at all) and both team and AI will make a mess of things. If the team is made up of inexperienced programmers, they won't even complain, in fact I've seen teams that like this to be happening. At least in AI reviews I've always seen "grumbling" (in the sense of what you might call mean comments)
It sounds like one potential interpretation of his behavior is that he values his own time more than your time.
I wonder if that's occurred to him.
Everybody values their own time more than other's.
The fix, imho, is for the reviewers to also use ai to review the code. However, the ultimate responsibility for the outcome(s) should be on the committer - you commit it, you own it, so to speak. If there's an incident, they need to be the one paged in the middle of the night. Bugs resulting from it will land on their desk.
The reviewers aren't a shield/safety net.
Speak for yourself. I highly value other people’s time, to the extent that I should probably value my time higher than I do for my own sake.
Doing something that wastes other people’s time or makes more work for them than necessary makes me feel awful.
I’ve always worked in a way that respects other people’s time and I always tried to make sure I did everything I could to minimize the work I’m asking someone to do for me.
Well, it's obviously infeasible, as at the time of the incident it is not yet known what is wrong and who caused it.
Is it even actually good to get to a point of blaming someone for an incident?
> Everybody values their own time more than other's.
This is false, you’re just oblivious to people who grew up in conditions that would make them that way.
AI and companies reward sociopathic behavior. When he eventually complains to his boss that his work isn't being merged and it's been done for days/weeks/months that will filter up and look bad on the people holding him up.
At that point then disable merge checks and let them merge without a review. If there is a problem it's on them
This is my current strategy, it's working great. Half the team has been fired for slop and the other half got fired for not doing anything.
I'm sure this person's manager knows that having trouble getting PRs reviewed can be (but isn't always) a signal of a deeper problem. It could be that no one on the team knows the domain, it could be that no one likes the person, but most likely it's that the PRs are frequently bad and no one wants to bother.
Or, I might say, why review the PR. Get Claude to do it? Why do I need to spend my time and attention and this person does not?
Well, what's the solution here, he should ship less stuff?
The solution is that he spends more time scoping the size of the PR so that it’s reviewable and understands the code he’s submitting well enough to have discussions about it. And that he does so human to human so that they can come to mutual understanding.
Less WIP is better for the throughput. If you saturate all the review bandwidth you're just wasting your time creating more PRs, the time would be better spent helping others get their PRs merged.
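To put rough numbers on the WIP point: a minimal sketch using Little's law (average time in the queue = items in the queue / throughput). The function name and the figure of 4 reviews per day are made up for illustration, and it assumes the team's review capacity stays fixed no matter how many PRs are opened.

```python
def steady_state_review_wait(open_prs: int, reviews_per_day: float) -> float:
    """Little's law: average days a PR waits = open PRs / reviews completed per day."""
    if reviews_per_day <= 0:
        raise ValueError("reviews_per_day must be positive")
    return open_prs / reviews_per_day


# Hypothetical team that can complete 4 reviews per day in total.
for open_prs in (4, 12, 40):
    days = steady_state_review_wait(open_prs, reviews_per_day=4)
    print(f"{open_prs} open PRs -> about {days:.1f} days from opening to merge")
```

Under those assumptions, opening more PRs does not raise the number merged per day; it only lengthens how long each one sits.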
> Well, what's the solution here, he should ship less stuff?
The solution is in the title - he wants human attention, he needs to demonstrate human effort.
He isn’t shipping anything. Asking for code review is not shipping.
This is the complaint:
> he doesn't make it easy for the team to look at.
He has traded readability for volume. The lack of readability is causing him to ship less. This was a bad trade because the readability is the bottleneck not the code creation. He should improve readability.
>> the readability is the bottleneck not the code creation. He should improve readability.
See this is where I think LLMs can actually improve software engineering. Use them to write better code not more code. The most useful LLM at work so far is the code review bot that occasionally finds things that I missed even with a careful self review and good test coverage.
We should be prompting the LLMs to review our hand written code for security, correctness, style, maintainability, etc., and then use human review for good design and sanity checking. The bots can do things like hold all the C++ correctness rules in their context and apply them sometimes better than even a human expert.
The solution is to merge more of his PRs on the condition that he takes at least partial responsibility for any resulting problems.
That's not how anything works. Even if he says he's going to take responsibility, when the customer call comes in at midnight you're going to be the one fixing his problems.
The reviewer gets to merge the PR so their name appears on all the great new features and they are credited for them. That would end his unfair behaviour of dumping effort onto other people.
OR - he gets a review for every review he does.
I often hear people say lately, "why should I bother to read this, if you didn't even think it was worth writing?"
I've been thinking about this in art. Is it the end result that matters, or the process of creating it?
I once saw a hideous sculpture. Didn't like it at all. Then the video zoomed and I saw that the whole thing (quite massive) had been hand-built out of individual toothpicks, and suddenly I thought it was amazing.
Perhaps an even better example: I read a story of a man in India who carved a passage through a mountain, so there would be a shorter route from his remote village to the city. He did it by hand and it took him 20 years. We seem to have an instinctive admiration for heroic effort.
In business, generally only the end result matters. Although, the end result also includes the client's perception of how the product was made... (see also: fake fairtrade etc.) In a meaningful way, the perception, the story, is reality.
Part of art is the process of creating it. It’s not just the physical artifact, nor even just the completion of the final product. The inspiration, subject matter, the consideration of form, the initial concepts, the redesigns, the meaning or emotion the artist tries to impart, the beauty of the thing, the skills employed and further developed during the process, the choice of materials, the use of perspective, and how the work is presented are all part of the art.
I don't think it's a matter of process vs end result. I just want to feel that a human with taste judged that it was worth my attention.
If a human put some effort into it, that's a signal.
This is mostly what it is for me too. We're all awash in an information deluge, and we need heuristics to keep from drowning. Human effort, proof-of-work if you will, is a heuristic that helps with the AI-generated part of the deluge.
Your boss cares only about the end result. Good engineers care about the process too
> Is it the end result that matters, or the process of creating it?
One of the main reasons that art is valuable is in its ability to communicate emotions. Good art has the ability to serialize emotions within the artist and deserialize them within the mind of the viewer. It's not just "wow, this is a pretty picture", it's "wow, this is how another person sees the world, and now that I understand that, I feel an intimate connection with them".
> Is it the end result that matters, or the process of creating it?
I think this comment misses the point. Let's forget about AI and assume that there are three developers: A, B, and C. Now, A is supposed to make a PR, but instead they describe it to B, and B writes the code. C reviews the PR and gives feedback. A passes the feedback and the responses between B and C.
As you see, this is not easy for either B or C, and A is totally useless in this scenario. When you replace B with an LLM that doesn't get tired or bored, only C complains about the process.
I like this rule of thumb: Spend more effort producing the work than it takes for someone else to consume it.
I like this rule and hopefully adhere to it myself often enough.
I can't imagine working for a place that has a big bucket of PRs that either get reviewed or languish for some amount of time based on who feels like reviewing them. I'm not saying there's anything wrong with it, just that everywhere I've ever worked, there are expected features with priorities and timelines and some project manager or product person breathing down your neck to get them out the door.
In big software teams, the bottleneck is team communication. I've run big and small teams. If I want to speed things up, I remove people from the team. Everything gets easier. This has worked amazingly well every time I've done this over the past decades. Removing people doesn't have to mean firing them necessarily. Splitting teams is a good reflex. But of course the people you remove from a team are typically not the best performers. I was discussing this with a friend of mine who runs a small company. Exact same thing. He reduced the team size by 1 and the velocity went up almost instantly. This person was a bottleneck in the team and was slowing down people around him. After identifying the problem, solving it unblocked the rest of the team.
This was true long before AI. With AI the difference is just a lot bigger. It exposes team inefficiencies quite mercilessly. We have a big glaring issue with the current AI tools not being suitable for use by multiple users. All interactions are one on one. Which means handoffs between tools and people are bottlenecked on people communicating with each other. So any issues there with people delaying, gatekeeping, etc. become very visible.
The sentiment of pushing back on AI is understandable but probably not a productive reflex. We need to find more effective ways of staying on top of massive amounts of changes. It's not going to slow down, and insisting on manually reviewing all code is not going to be a long-term sustainable way of developing software. It simply does not scale. I'd question the added value of manual PR reviews at this point. Are they finding real issues? Are we valuing those issues correctly? Could we come up with automated ways to find and fix those same issues? There are a lot of open questions about how we are going to do this. But no question about the notion that we need to up our game on this front.
Efficiency is not magic. It's bounded. Above and below the limits the environment can sustain, systems will destabilize. If all the great white sharks magically got more efficient at hunting overnight, the ecosystem would collapse. Individuals and teams have never scaled at this speed to the levels they have. And there is no signal at the system-wide level that a sustainable limit has been crossed. So people will happily believe things are getting more efficient at individual/team scale while at system scale things get more fragile. This is why we ended up with central banks deciding interest rates and controlling money supply. Before that, anyone could print cash. They all thought they were great efficient geniuses. The chimp troop is not prepared for stuff that affects the entire system.
I’ve been making Codex and Claude get their work reviewed by most recent best performing model of their own family, and each other’s, for months.
On top of that, we have been running multi-model AI reviews on every PR through their respective GitHub integrations (Codex, Gemini, Copilot, Greptile, CodeRabbit).
They never fully overlap, and yet they somehow usually all miss the same things. The most significant improvement came from having agents commit their plan along with their work.
On the upside, it means I get to focus my reviews on different things.
Honestly, we should make a world that is enjoyable and productive for humans. Not relentlessly optimizing for agents.
> I'd question the added value of manual PR reviews at this point.
Yeah, why not reduce the team size to zero while you are at it?
These generalizations about software engineering have never been useful, IMO. Context is everything, there is no flow chart for building a perfect software process.
Although, I'd say you are absolutely delusional if you think we are universally beyond the point where manual review of pull requests is required.
Make the team size one person. That's the fastest you can work. Zero means no work, and not doing anything is the quickest solution.
We can also slow down (or keep old pace) and still ship quality.
A bit sick and tired of arguments like yours
> About six months later, it is his frequent bemoaning at the standup that their PR don't get reviewed, languishing in inattention
What irks me the most with this new trend is when people don't review the code themselves thoroughly enough and you're pointing out obvious flaws that you know that they should be aware of. LLMs can be such a great tool, but it's unfair to make people review your code before you've even seemingly looked at it yourself.
I think we're too nice sometimes. If a coworker has been sending stuff to review that's taking me more time than for them to create, surely that's an opportunity to discuss this?
I wonder if there is a tool that could equally waste their time. Like the world's most pedantic code review bot that just gets the PR-raising bot to spin its wheels forever.
That might teach those people a lesson.
An interesting question to him and management might be what his own role is now and whether he's still needed. If he's not doing any reviews then you could yourself directly prompt the code and review.
The question I’ve seen here is responsibility. If you submit a PR that means that it was your best effort, and you’re willing to stand behind it to some degree. With AI, some people, when the scathing review comes back, just say “haha look at that stupid AI.” The reviewer might just as well run his own AI to do the review, but it may make huge errors as well. In that scenario, who is held accountable when there is a big bug or it degrades the quality of the code base?
Ultimately what it means to be a professional is that you are responsible for your work. That’s why you get a salary instead of being paid by the token.
Have you spoken to him about this? If he's clueless enough to send AI responses to human messages, he's probably clueless enough to not realise why people don't do that.
Better yet, get Claude to speak to him about it.
Fight fire with fire. Ask Fable to conduct an adversarial /ultareview of their PR and send the same wall of text back to them. If there are excessive defects, ask them in standup if they actually reviewed the PR themselves before sending it. If there aren’t, maybe they are on to something. I think, like in law, the human submitting the work is responsible for its quality, not the AI.
> Ask Fable to conduct an adversarial /ultareview of their PR and send the same wall of text back to them.
This won't help. Your wall of text will just get fed right back into the LLM.
This is the point where you decide. It used to be low stakes and easy to care about the job you did for other people.
Do you want to put the same effort into your job when nobody else does, or should you reserve your thoughts and just feed it back into the LLM?
The LLMs are being advertised as output increasers, but companies so far are using them as excuses to fire people instead of creating previously unbelievable things. It might be better to feed your coworkers' output back in and use your thoughts to start the company you thought you never had time for.
It will help if your wall of text costs fewer tokens than theirs: they will run out before you do if you have the same company quota per person.
I'm not sure what the right vocabulary would be to describe this, but this sounds more like the calculations behind nuclear war than a healthy collegiality or cooperative work relationship. This sets up a competition to determine a loser based on resource scarcity, not a way to achieve mutual goals to advance the organization's goals.
You are thinking of "game theory" and it's what happens when your coworkers don't give a shit. And all it takes is one, both because they can degrade product quality faster than you can gate it or fix it and because the performance assessment techniques are about 3 years behind the state of LLMs and if they play, you have to also or you'll get shit on from such a height you won't even know what hit you.
And once you start playing the game, then one day - it doesn't take long - you wake up and ask yourself if this is how you want to spend 8 hours of your life Monday through Friday. I think a lot of us are saying no but now need to figure out where our money is going to come from. I don't have the answers.
In a previous job, we had this saying, "killing penguins", that we used when referring to throwing more computing resources (more GNU/Linux instances) than necessary at a problem. In today's landscape of indiscriminate AI spending, I bet we could repurpose the term to mean "actually negatively impacting Antarctic biodiversity".
We are all throwing penguins at each other.
When someone submits PRs fully made by Claude, any "cooperative work" is out the door.
“Token Standoff.” The most efficient token consumer wins. This mutually assured time efficiency destruction is driven by management support of aggressive use of AI in an attempt to, in some combination, increase productivity and constrain labor costs.
AI isn’t making developers more productive – it’s making them busier - https://leaddev.com/ai/ai-isnt-making-developers-more-produc... - June 11th, 2026
https://en.wikipedia.org/wiki/Brandolini%27s_law
Mutually Assured Distraction.
Caveman consume fewest token win office token war.
Also, make sure your wall of text prompts Claude to be extra verbose to really burn through that quota of theirs.
Now I'm wondering how hard it'd be to zipbomb their context window?
(And _now_ I'm wondering how hard it'd be to forkbomb their agentic workflow?)
More like they will climb even higher on the lighting-dollars-on-fire leaderboard.
Try to automate the adversarial PR review-rebuttal loop "for productivity", so the back-and-forth between the AIs can run over night.
What I don’t understand is what value is the person adding to this equation? Put another way, what’s the difference between them feeding the wall of text to the LLM, and you feeding the wall of text to the LLM, bypassing them in the process entirely?
The role of the person in the equation is to take personal responsibility for the proposed change and review the changes prior to PR submission. You can't put AI on a PIP. It's acceptable to use AI as a coding assistant in 2026, but if a human is not reviewing what they submit and taking responsibility, their value is on par with a ChatGPT subscription.
Peer review, in this case, “did you use AI to review your change and address its feedback”.
It helps in that it offloads the code review burden you'd otherwise be doing.
As a last resort, do the code-review with a live pair programming session.
If they can't explain their own code then it is by default a bad pull request.
At the end of the day, everyone's time is being wasted on tokens and on the increasing cognitive complexity of AI generated code.
So if they say "idk Claude did it", what would you write in the PR review box?
REJECTED: Engineer does not understand what they wrote.
> Engineer does not understand what they wrote.
"""""wrote"""""
Feels like the title of a blog post someone will write
Ah, like this one? https://crabby-rathbun.github.io/mjrathbun-website/blog/post...
The same as if they said it was copied from stack overflow, or if it’s wrong; “I think there’s a problem here, it’s XYZ”. If your peer ignores you and you were wrong, it was their call to make. If you were right - take it to them or the manager depending on how many times it’s happened.
A teammate that can't write (or at least, can't explain) "their own code"
Actively drags down the morale and productivity of their team (because everyone is getting flooded with AI slop PR's)
AND costs far too much money relative to everyone else doing actual work? (token usage)
By god they sound like management material
"Author of this pull request has not yet reviewed code and does not understand it. This PR was submitted prematurely, probably by accident.
Please, check whether you accidentally submitted other unreviewed code - and close such PRs for now and reopen once reviewed."
Don’t ever write this in a professional environment. It’s childish and achieves nothing other than pissing off the person it’s targeted at and probably the manager who now has to deal with a shitty behaviour complaint.
> Ask Fable to conduct an adversarial /ultareview of their PR and send the same wall of text back to them.
Not necessary. Use Haiku.
The response doesn't need to be good, it just needs to be substantial. Presumably the goal here is basically DoS of the problematic colleague through token limits.
Use DeepSeek or MiMo. You get best bang for the buck on your response.
I mean frankly this should just be part of the standard process. By the time any person is looking at it there's no reason it should not have gone through an AI review.
It's not always feasible of course but I think there is real, worthwhile discipline in trying to get change requests small and it matters more with agents. It's very easy to let it balloon into gazillions of files and lines.
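One way to make "keep it small" checkable before opening a PR: a rough sketch that reads the tab-separated output of `git diff --numstat` (added, deleted, path per line, with `-` for binary files) and flags changes over a size budget. The function names and the 15-file / 400-line limits are arbitrary placeholders, not a recommendation; pick whatever your reviewers can actually digest.

```python
def summarize_numstat(numstat_text: str) -> tuple[int, int]:
    """Return (files_changed, lines_changed) from `git diff --numstat` output."""
    files = 0
    lines = 0
    for row in numstat_text.splitlines():
        parts = row.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed rows
        added, deleted = parts[0], parts[1]
        files += 1
        # Binary files show "-" for both counts, so they add a file but no lines.
        if added.isdigit():
            lines += int(added)
        if deleted.isdigit():
            lines += int(deleted)
    return files, lines


def is_reviewable(numstat_text: str, max_files: int = 15, max_lines: int = 400) -> bool:
    """True if the change fits the (hypothetical) size budget."""
    files, lines = summarize_numstat(numstat_text)
    return files <= max_files and lines <= max_lines


# Made-up sample of what `git diff --numstat main...HEAD` might print.
sample = "10\t2\tsrc/app.py\n-\t-\tassets/logo.png\n350\t120\tsrc/generated.py\n"
print(summarize_numstat(sample))  # (3, 482)
print(is_reviewable(sample))      # False
```

A check like this only measures size, not whether the author understands the change, so it complements the discipline rather than replacing it.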
I improved a similar issue by writing custom instructions for copilot that give it enough context to do PR reviews that are only 30% BS.
I asked other team members to run my custom instructions to perform a review with copilot before they submit...
Of course no one is doing it. It looks like the PRs I get are still straight from copilot. So I tend to run my review prompt. Cut out the 30% BS issues it "finds" and the rest is good.
why leave comments intended for your human colleague when they will only forward them to the bot?
why not speak directly to the bot yourself instead? then you can drop pretenses and get to the point
I find this to be a new variant of the old behavior where a colleague comments on a typo in a PR, and the team later moans about laborious back and forth for small nitpicks, instead of simply editing the typo right there (and perhaps leaving a note that they did so)
yeah I have this happen to me. I occasionally get screenshots of claude sent to me!
I had this happen to me twice. The first time I ignored it; the second time I responded with “I could have asked ChatGPT myself but I asked you”. Never happened again.
"why are you such a drag on team morale?", "why are you invalidating your colleagues learning experiences?" "Next time you do this, HR will have to step in" etc etc.
There's no justice in this world.
If you’re not willing to stand your ground and have a direct conversation with your coworker, then there’s no solution to it.
Because it doesn't matter what you say to the bot. You might as well have a conversation with yourself about the PR.
The bot isn't making decisions. It's not choosing to submit extensive PRs with bad code. The colleague is the one who needs to actually learn something here, and the problem is that confronting him about it directly is widely considered to be bad form. This is, of course, a deeply unhealthy aspect of our corporate culture. We need to be more open to honest communication, even when it's either uncomplimentary of one of the people involved, or counter to the prevailing opinions within the company.
let's take the two stories to management:
"I'm writing tons of code, and the process is stumbling where the guy whose job it is to review code isn't reviewing it."
"I'm not reviewing code."
Sometimes I wonder: how does someone go and think so much about their coworkers, and never once think about how they themselves look?
Even if I sympathize with the people complaining about their poorly chosen GitHub-based workflow - whose purpose is to let pull requests languish, for the most part - and how they stumble when overwhelmed with solutions. It's obvious to me, that the people who complain the loudest about the anti-sociality of LLM authored code in their precious harmonious low-effort workplace status quo: they are projecting.
Imagine you are a restaurant reviewer. Your job is unquestionably to go to restaurants, order and eat food, and write a review. The restaurant's job is to provide you food to eat and review.
You go to a new restaurant, and order some dishes, and one of the plates your server brings out is a big ol pile of dog shit.
Who's being anti-social in this situation? The restaurant is doing its job and all they're asking is that you do yours. On the other hand, you have certain expectations about what you order from the restaurant and they're not being met. Who's anti-social?
He's not bringing you a pile of dog shit. He's bringing you some food he went to the restaurant next door to get. How do you review it?
I cannot think of a single actual food critic that would consider it acceptable for a restaurant to serve a dish for review that they went to the restaurant next door to get. If the critic wanted to eat at/review that restaurant they would simply have gone there instead.
His point, exactly.
what is the point? this whole restaurant analogy is completely fictitious and happens nowhere, and the scenario i'm describing is happening all the time... why not just talk about the not imaginary scenario?
So he’s redundant. You call Uber Eats and you don’t pay a salary for that.
The person who "writes" code is also supposed to review their own work, and answer for that. If they won't do that - well - they should be fired. But if you have weak or uninvolved leadership, then the team's only rational recourse is to shun them.
It’s much more effort to verify that code is correct than it is to produce it. This is the case even for human-written code, and now that we face a torrent of ok-looking probably-usable AI generated code, the problem is compounded infinitely.
If someone’s using AI to generate a large quantity of actually-tested, actually-good code then that’s one thing. If they’re generating a fire hose of slop and demanding that others do the actual human time-consuming work of validating that code then that person is the problem. It’s hard to tell which is the case here.
why not just approve the PRs with little more than a cursory glance?
One of two things will happen:
1. Things start breaking, proving AI generated code sucks and the individual spamming these PRs is incompetent.
2. The code works fine and reviews are unnecessary for anything other than liability concerns.
Some of us actually take the "engineering" in "software engineering" seriously.
That includes taking responsibility and accountability so that the software doesn't become a sad and dangerous mess.
If we want to be an engineering discipline, just yoloing in production is not going to cut it.
This no longer works when bad faith actors will push code straight from LLMs with little review, and respond to your comments with LLM responses. They will constantly leave you with the responsibility of verifying the output. You are the human in their loop. This is a brutal asymmetry. In the past, at least you knew a person probably spent more time handwriting code than you will spend reviewing it. This no longer applies, now the reviewer can easily spend more time than the author.
Oh but it does.
The thing that makes it scale is to default to "no" and require the other party to convince you of "yes". Just put the burden of proof where it belongs. If they don't manage, then that's their problem.
Communicating this in a way that is viable for a business scenario certainly comes with its own difficulties, but that is a solvable problem.
In fact, you can use AI to stress test your communication there. Just throw what you want to say at the AI but don't tell it that it is you who wrote it. Then tune the input until it stops saying that you're the problem and starts agreeing with you.
Highly recommend. It's a perfect emotion-driven cargo-culting normie simulator that never calls HR on you.
Did you not read what I said? They will use LLMs to spam proof onto the human reviewer. Just endless replies with LLM generated answers until you yield and approve the PR.
Because we're all on call for the service, and tragedy of the commons exists. That coworker isn't paying the cost, everyone else is paying a fraction of it, and it builds over time.
just fire him lol sounds like a nightmare
Human PR review is a process smell
This would sound crazy in 2025 or prior, but I'm on board.
It's silly to have humans reviewing code that a human didn't even write.