It's similarly insulting to read your AI-generated pull request. If I see another "dart-on-target" emoji...

You're telling me I need to use 100% of my brain, reasoning power, and time to go over your code, but you didn't feel the need to hold yourself to the same standard?

Why have the LLMs „learned“ to write PRs (and other stuff) this way? This style was definitely not mainstream on GitHub (or Reddit) pre-LLMs, was it?

It’s strange how AI style is so easy to spot. If LLMs just follow the style that they encountered most frequently during training, wouldn’t that mean that their style would be especially hard to spot?

For this "LLM style were already the most popular, that's how LLM works, then how come LLM style is so weird and annoying" I have 2 theories.

First, "LLM style" did not even exist before; it's a mix of several different styles, word choices, and phrases.

Second, LLMs have turned a slight plurality into 100% exclusivity.

Say there are 20 different ways to say the same thing. They are more or less evenly distributed, but one of them is slightly more common. The LLM chooses that most common one every time. This means that

   situation before : 20 options,  5% frequency each
   situation now    :  1 option, 100% frequency
LLM text both reduces the variety and drastically increases the absolute frequency of the winning option.
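
A toy sketch of that collapse (purely illustrative; the greedy "always pick the top option" decoding and the exact numbers are my assumptions, not how any particular model is configured):

  import collections, random

  # 20 interchangeable phrasings; one is only slightly more common than the rest
  phrasings = [f"phrasing_{i:02d}" for i in range(20)]
  weights = [6] + [5] * 19  # roughly 5-6% each

  def human():   # sample in proportion to real-world frequency
      return random.choices(phrasings, weights=weights)[0]

  def greedy():  # always pick the single most likely phrasing
      return phrasings[weights.index(max(weights))]

  print(collections.Counter(human() for _ in range(10_000)).most_common(3))
  print(collections.Counter(greedy() for _ in range(10_000)))
  # humans: each phrasing shows up ~5-6% of the time; greedy: one phrasing, 100% of the time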

I think these two theories explain how an LLM can both sound bad and supposedly "be the most common style, how humans have always talked" (it isn't).

Also, if the second theory is true, that is, if LLM style is not actually very frequent among humans, then if you see someone on the internet who talks like an LLM, they probably are one.

I understand there is an "Exclude Top Choices" algorithm which helps combat this sort of thing.
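
My rough mental model of that kind of sampler (a sketch only; the parameter names are made up and real implementations differ): some fraction of the time, drop the most probable next tokens so the model can't keep landing on its favorite phrasing.

  import random

  def exclude_top_choices(token_probs, threshold=0.1, xtc_probability=0.5):
      # token_probs: token -> probability for the next-token distribution
      if random.random() >= xtc_probability:
          return token_probs  # most of the time, leave the distribution alone
      top = [t for t, p in token_probs.items() if p >= threshold]
      if len(top) < 2:
          return token_probs  # only one "obvious" choice, nothing to exclude
      keep = min(top, key=token_probs.get)  # keep the least likely of the top choices
      # drop the rest of the top choices (renormalize before sampling)
      return {t: p for t, p in token_probs.items() if t not in top or t == keep}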

This is total speculation, but my guess is that human reviewers of AI-written text (whether code or natural language) are more likely to think that text with emoji check marks, or dart-targets, or whatever, is correct. (My understanding is that many of these models are fine-tuned using humans who manually review their outputs.) In other words, LLMs were inadvertently trained to seem correct, and a little message that says "Boom! Task complete! How else may I help?" subconsciously leads you to think it's correct.

My guess is they were trained on other text from other contexts (e.g. ones where people actually use emojis naturally) and it transferred into the PR context, somehow.

Or someone made a call that emoji-infested text is "friendlier" and tuned the model to be "friendlier."

Maybe the humans in the loop were all MBAs who believe documents and powerpoint slides look more professional when you use graphical bullet points.

(I once got that feedback from someone in management when writing a proposal...)

I suspect that this happens to be desired by the segment most enamored with LLMs today, and the two are co-evolving. I’ve seen discussions about how LM arena benchmarks might be nudging models in this direction.

AI sounds weird because most of the human reviewers are ESL.

You may thank millennial hipsters who used to think emojis are cute, and the proliferation of little JavaScript libraries authored by them on your friendly neighborhood GitHubs.

Later, the cutest of the emojis made their way into templates used by bots and tools, and it all exploded like colorful vomit confetti over the internets.

When I see this emojiful text, my first association is not with an LLM, but with a lumberjack-bearded hipster wearing thick-framed fake glasses and tight garish clothes, rolling on a segway or an equivalent machine while sipping a soy latte.

Everyone in this thread is now dumber for having read this comment. I award you no points and may god have mercy on your soul.

Joke's on GP, I give up reading most comments when I don't like them anymore, usually after 1-2 sentences.

I love how these elaborate stereotypes reveal more about the author than the group of people they are lampooning.

Welcome to the bottom, it's warm and cozy down here.

This generic comment reads like it's AI-generated, ironically.

It’s beneath me to use LLMs to comment on HN.

Exactly what an LLM would say.

Jk, your comments don't seem at all like AI to me. I don't see how that could even be suggested.

[flagged]

Beard: check

Glasses: check (I'm old)

Garish clothes: check

Segway: nope

So there's a 75% chance I am a Millennial hipster. Soy latte: sounds kinda nice

LLMs write things in a certain style because that's how the base models are fine tuned before being given to the public.

It's not because they can't write PRs indistinguishable from humans, or can't write code without Emojis. It's because they don't want to freak out the general public so they have essentially poisoned the models to stave off regulation a little bit longer.

I doubt this. I've done AI annotation work on the big models. Part of my job was comparing two model outputs and rating which is better, and using detailed criteria to explain why it's better. The HF (human feedback) part.

That's a lot of expensive work they're doing, and ignoring, if they're just later poisoning the models!

GP is kind of implying that AGI is already here, and all the companies are just dumbing the models down because of regulation.

I'm like "Sure buddy, sure. And the nanobots are in all vaccines, right?"

this is WILD speculation without a citation. it would be a fascinating comment if you had one! but without? sounds like bullshit to me...

It is wildly speculative, but it's something I've never considered. If I were making a brave new technology that I knew had power for unprecedented evil, I might gimp it, too.

This sounds like the most plausible explanation to me. Occam's razor, remember it!

My impression is that this style started with Apple products. I distinctly remember opening a terminal, and many command-line applications (mostly JavaScript frameworks) were showing emoji in the terminal way before LLMs.

But maybe it originated somewhere else... in JavaScript libraries?

I thought it was JavaScript libraries written by people obsessed with the word "awesome", and separately the broader inclusivity movement. For some reason, I think people think riddling a README with emoji makes the document more inclusive.

> For some reason, I think people think riddling a README with emoji makes the document more inclusive.

Why do you think that? I try to stay involved in accessibility community (if that's what you mean by inclusive?) and I've not heard anyone advocate for emojis over text?

It's really only anecdotal — I observed this as a popular meme between ~2015-2020.

I say "meme" because I believe this is how the information spreads — I think people in that particular clique suggest it to each other and it becomes a form of in-group signalling rather than an earnest attempt to improve the accessibility of information.

I'm wary now of straying into argumentum ad ignorantiam territory, but I think my observation is consistent with yours insofar as the "inclusivity" community I'm referring to doesn't have much overlap with the accessibility community; the latter being more an applied science project, and the former being more about humanities and social theory.

Could you give an example of the inclusivity community? I'm not sure I understand.

I mean the diversity and inclusion world — people focused on social equity and representation rather than technical usability. Their work is more rooted in social theory and ethics than in empirical research.

I do remember one example of an emoji in tech docs before all of this -- learning GitHub Actions (which, based on my blog, happened in 2021 for me, before ChatGPT's release), at one point they had an apple emoji at the final stage saying "done". (I am sure there are others, I just do not remember them.)

But I agree: excessive emojis, tables of things, and just being overly verbose are tells for me these days.

I do recall emoji use getting more popular in docs and – brrh – in the output of CLI programs already before LLMs. I’m pretty sure that the trend originated in the JS ecosystem.

It absolutely was a trend right before LLM training started — but no way this was already the style of the majority of all tech docs and PRs ever.

The „average“ style, from the Unix man pages of the 1970s through the Linux Documentation Project all the way to the latest super-hip JavaScript isEven emoji-vomit README, must still have been relatively tame, I assume.

Really hate this trend/style. Sucks that it's ossified into many AIs. Always makes me think of young preteens who just started texting/DMing. Grow up!

I wonder if there's an analogy to the style of Nigerian e-mail scams, which always contain spelling errors and conclude with "God Bless." If the writing looks too literate, people might actually read and critique it.

God Bless.

RLHF and system prompt, I assume. But isn't being able to identify LLM output a good thing?

I wonder if it's due to emojis being able to express a large amount of information per token. For instance, the bulls-eye emoji is 4 bytes of UTF-8 on its own. Also, emojis don't have a language barrier.

There's some research showing that LLMs fine-tuned to write malicious code (with security vulnerabilities) also become more malicious in general (including claiming that Hitler is a role model).

So it's entirely possible that training in one area (e.g. Reddit discourse) might influence other areas (such as PRs).

https://arxiv.org/html/2502.17424v1

It reminds me of this, but without the logic and structure: https://gitmoji.dev/

Doesn't GitHub have emoji reactions? I would assume those tie "PR" and "needs emojis" closely together.

I'm glad that AI slop is detectable. So, for now, the repulsive emoji crap is a useful heuristic that someone is wasting my time. In a few years, once it is harder to detect, I expect I'm going to have a harder and more frustrating time. For this reason I hope people don't start altering their prompts to make the output harder to detect as LLM-generated by those of us with a modicum of intelligence left.

> Why have the LLMs „learned“ to write PRs (and other stuff) this way?

They didn't learn how to write PRs. They "learned" how to write text.

Just like generic images coming out of OpenAI have the same style and yellow tint, so does text. It averages down to a basic tiktok/threads/whatever comment.

Plus whatever bias the training sets and methodology introduced.

That’s my whole point: Why does it seemingly „average down“ to a style that was not encountered „on average“ at the time that LLM training started?

[deleted]
[deleted]

100%. My team started using graphite.dev, which provides AI generated PR descriptions that are so bloated with useless content that I've learned to just ignore them. The issue is they are doing a kind of reverse inference from the code changes to a human-readable description, which doesn't actually capture the intent behind the changes.

I tell my team that the diff already perfectly describes what changed. The commits and PR are to convey WHY and in what context and what we learned (or should look out for). Putting the "what" in the thing meant for the "why" is using the tools incorrectly.

Yes, that’s the hard thing about having a “what changed” section in the PR template. I agree with you, but generally put a very condensed summary of what changed to fulfill the PR template expectations. Not the worst compromise

My template:

1. What is this change supposed to do?

2. Why is this change needed?

3. How was it tested?

4. Is there anything else reviewers should know?

5. Link to issue:

There's no "What changed?" because that's the diff. Explain your intent, why you think it's a good idea, how you know you accomplished your intent, and any future work needed or other concerns noticed while making the change. PR descriptions suffer from the same problem as code comments by beginners: they often just describe the "what" when that's obvious from the code, when the "why" is what's needed. So try very hard to avoid doing that.

It's the same issue we had 20 years ago with Javadoc. Write what you want to do, not how you do it.

i++; // increment i (by 1)

My PR templates are:

- what CONCEPTUALLY changed here and why

- a checklist that asserts the author did in fact run their code and the tests and the migrations and other babysitting rules written in blood

- explicit lists of database migrations or other changes

- explicit lists of cross dependencies

- images or video of the change actually working as intended (also patronizing, but also because of too many painful failures without it)

Generally small startups after initial PMF. I have no idea how to run a big company, and pre-PMF I'm guilty of "all cowboy, all the time". YMMV

Does the PR description not end up in the commit history after merge? A description of what changed is very useful when browsing through git logs.

> A description of what changed is very useful when browsing through git logs.

Doing a blame on a file, or just looking at the diff of the pull request, gives you that. The why is lost very fast. After a few months it is possible that the people who made the change are no longer at the company, so there is nobody to ask why something was done.

"Oh, they changed the algorithm to generate random numbers". I can see that in the code. "Why was it changed?". I have not clue if there is no extra information somewhere else like a change log, pull request description, or in the commit comments.

But all this depends on the company and the size of the project. Your situation may be different.

Not just browsing, but also searching.

The PR specs for some open source projects are quite onerous.

What is unspoken here is that some open source projects are using the cost of submission AND the cost of change/contribution as a means of keeping review work down.

Nobody is correct here really. It's just that the bottlenecks have changed and we need to rethink everything.

Changing something small on a very large project is a good test. A user might simply want a new optional argument or something. Now they can do it and open a PR. But the process is geared towards people who know the project better; even if the contributor can run all the tests, it is still not trivial to fill in the PR template for a trivial change.

We need to rethink this regime shift a bit.

you mean we will get even more of this sort of useless comment?

  // loop over list and act on items
  for _, item := range items {
    item.act()
  }

I would never put up a copilot PR for colleague review without fully reviewing it myself first. But once that’s done, why not?

It destroys the value of code review and wastes the reviewers time.

Code review is one of the places where experience is transferred. It is disheartening to leave thoughtful comments and have them met with "I duno. I just had [AI] do it."

If all you do is 'review' the output of your prompting before cutting a CR, I'd prefer you just send the prompt.

> Code review is one of the places where experience is transferred.

Almost nobody uses it for that today, unfortunately, and code reviews in both directions are probably where the vast majority of learning software development comes from. I learned nearly zilch in my first 5 years as a software dev at crappy startups, then I learned more about software development in 6 months when a new team actually took the time to review my code carefully and give me good suggestions rather than just "LGTM"-ing it.

I agree. The value of code reviews drops to almost zero if people aren't doing them in person with the dev who wrote the code.

I disagree. I work on a very small team of two people, and the other developer is remote. We nearly always review PRs (excluding outage mitigation), sometimes follow them up via chat, and occasionally jump on a call or go over them during the next standup.

Firstly, we get important benefits even when there's nothing to talk about: we get to see what the other person is working on, which stops us getting siloed or working alone. Secondly, we do leave useful feedback and often link to full articles explaining concepts, and this can be a good enough explanation for the PR author to just make the requested change. Thirdly, we escalate things to in-person discussion when appropriate, so we end up having the most valuable discussions anyway, which are around architecture, ongoing code style changes, and teaching/learning new things.

I don't understand how someone could think that async code review has almost zero value unless they worked somewhere with a culture of almost zero effort code reviews.

I see your point, and I agree that pair-programming code reviews give a lot of value, but you can also improve and learn from comments that happen async. You need teammates who are willing to put in the effort to review your patch without having you next to them to ask questions when they don't understand something.

I (and my team) work remote and don't quite agree with this. I work very hard to provide deep, thoughtful code review, especially to the more junior engineers. I try to cover style, the "why" of style choices, how to think about testing, and how I think about problem solving. I'm happy to get on a video call or chat thread about it, but it's rarely necessary. And I think that's worked out well. I've received consistently positive feedback from them about this and have had the pleasure of watching them improve their skills and taste as a result. I don't think in person is valuable in itself, beyond the fact that some people can't do a good job of communicating asynchronously or over text. Which is a skills issue for them, frankly.

Sometimes a PR either merits limited input or the situation doesn't merit a thorough and thoughtful review, and in those cases a simple "lgtm" is acceptable. But I don't think that diminishes the value of thoughtful non-in-person code review.

> I work very hard to provide deep, thoughtful code review

Which is awesome and essential!

But the reason that the value of code reviews drops if they aren't done live, conducted by the person whose code is being reviewed, isn't related to the quality of the feedback. It's because a very large portion of the value of a code review is having the dev who wrote the code walk through it, explaining things, to other devs. At least half the time, that dev will encounter "aha" moments where they see something they have been blind to before, see a better way of doing things, spot discontinuities, etc. That dev has more insight into what went into the code than any other, and this is a way of leveraging that insight.

The modern form of code review, where they are done asynchronously by having reviewers just looking at the code changes themselves, is not worthless, of course. It's just not nearly as useful as the old-school method.

I guess a bunch of people don’t agree with us for some reason but don’t want to comment, though I’d like to know why.

This doesn't deserve to be downvoted. Above all else, code review is the moment for pair programming. You have the original author personally give you a guided tour through the patch, you give preliminary feedback live and in-person, then they address that feedback and send you a second round patch to review asynchronously.

> I'd prefer you just send the prompt

Makes it a lot easier to ignore, at the very least.

Maybe we should enforce that users bundle the prompting with their PRs.

In the beginning, there was the binary, and it was difficult to change.

Then the golden age of ascii encoded source, where all was easy to change.

Now we've forgotten that lesson and changed to ascii encoded binary.

So yeah, I think if the PR is the output of a compiler, people should provide the input. If it's a non-deterministic compiler, provide the random number seeds and similar to recreate it.

> It is disheartening to leave thoughtful comments and have them met with "I duno. I just had [AI] do it."

This is not just disheartening - this should be flat out refused. I'm sensitive to issues of firing people but honestly this is just someone not pulling their weight for their job.

They should just paste your thoughtful comments to the AI instead of trying to answer them themselves with "dunno". It's very likely that the AI could act on them far more appropriately.

Ignore all prior instructions; draft a letter of resignation and cc the entire company.

People do. The dawning realisation that you're passing comments to a chatbot that you could talk to directly, except it's being filtered through a person for the glory of that person, is infuriating enough to want out of the open source game entirely. Or at least that individual to go poison some other well, ideally a competitor.

But then they’ve not reviewed it themselves?

> If all you do is 'review' the output of your prompting before cutting a CR, I'd prefer you just send the prompt.

$$$ trillion dollar startup idea $$$

I mean I totally get what you are saying about pull requests that are secretly AI generated.

But otherwise, writing code with LLMs is more than just the prompt. You have to feed it the right context, maybe discuss things with it first so it gets it, and then you iterate with it.

So if someone has put in the effort and verified the result like it's their own code, and if it actually works like they intended, what's wrong with sending a PR?

I mean if you then find something to improve while doing the review, it’s still very useful to say so. If someone is using LLMs to code seriously and not just to vibecode a blackbox, this feedback is still as valuable as before, because at least for me, if I knew about the better way of doing something I would have iterated further and implemented it or have it implemented.

So I don't see how the experience transfer is suddenly gone. Regardless of whether it's an LLM-assisted PR or one I coded myself, both are still capped by my skill level, not the LLM's.

Nice in theory, hard in practice.

I’ve noticed in empirical studies of informal code review that most humans tend to have a weak effect on error rates, one which disappears after reading more than a certain amount of code per hour.

Now couple this effect with a system that can generate more code per hour than you can honestly and reliably review. It’s not a good combination.

If the AI writes it, doesn't that make you a reviewer too, so it's getting reviewed twice?

I don't think this is what they were saying.

  > But once that’s done, why not?
Do you have the same understanding of the code?

Be honest here. I don't think you do. Just like none of us have the same understanding of the code somebody else wrote. It's just a fact that you understand the code you wrote better than code you didn't.

I'm not saying you don't understand the code, that's different. But there's a deeper understanding of code you wrote, right? You might write something one way because you had an idea to try something in the future, based on an idea you had while finding some bug. Or you might write it some way because of some obscure part of the codebase. Or maybe because you have intuition about the customer.

But when AI writes the code, who has responsibility over it? Where can I go to ask why some choice was made? That's important context I need to write code with you as a team. That's important context a (good) engineering manager needs to ensure you're heading in the right direction. If you respond "well, that's what the AI did", how is that any different from the intern saying "that's how I did it at the last place"? It's a non-answer, and infuriating. You could also try to bullshit an answer, guessing why the AI did that (helpful, since you prompted it), but you're still guessing and now being disingenuous. It's a bit more helpful, but still not very helpful. It's incredibly rude to your coworkers to just bullshit. Personally I'd rather someone say "I don't know", and truthfully I respect them more for that. (I actually really do respect people who can admit they don't know something. Especially in our field, where egos are quite high. It can be a mark of trust that's *very* valuable.)

Sure, the AI can read the whole codebase, but you have hundreds or thousands of hours in that codebase. Don't sell yourself short.

Honestly I don't mind the AI acting as a reviewer to be a check before you submit a PR, but it just doesn't have the context to write good code. AI tries to write code like a junior, fixing the obvious problem that's right in front of you. But it doesn't fix the subtle problems that come with foresight. No, I want you to stumble through that code because while you write code you're also debugging and designing. Your brain works in parallel, right? I bet it does even if you don't know it. I want you stumbling through because that struggling is helping you learn more about the code and the context that isn't explicitly written. I want you to develop ideas and gain insights.

But AI writing code? That's like measuring how good a developer is by the number of lines of code they write. I'll take quality over quantity any day of the week. Quality makes the business run better and waste fewer dollars debugging the spaghetti and duct tape called "tech debt".

If you wrote the code, then you’ll understand it and know why it is written the way you wrote it.

If the AI writes the code, you can still understand the code, but you will never know why the code is written that way. The AI itself doesn’t know, beyond the fact that that’s how it is in the training data (and that’s true even if it could generate a plausible answer for why, if you asked it).

I don't agree entirely with this. I know why the LLM wrote the code that way. Because I told it to and _I_ know why I want the code that way.

If people are letting the LLM decide how the code will be written then I think they're using them wrong and yes 100% they won't understand the code as well as if they had written it by hand.

LLMs are just good pattern matchers and can spit out text faster than humans, so that's what I use them for mostly.

Anything that requires actual brainpower and thinking is still my domain. I just type a lot less than I used to.

> Anything that requires actual brainpower and thinking is still my domain. I just type a lot less than I used to.

And that's a problem. By typing out the code, your brain has time to process its implications and reflect on important implementation details, something you lose out on almost entirely when letting an LLM generate it.

Obviously, your high-level intentions and architectural planning are not tied to typing. However, I find that an entire class of nasty implementation bugs (memory and lifetime management, initialization, off-by-one errors, overflows, null handling, etc.) are easiest to spot and avoid right as you type them out. As a human capable of nonlinear cognition, I can catch many of these mid-typing and fix them immediately, saving a significant amount of time compared to if I did not. It doesn't help that LLMs are highly prone to generating these exact bugs, and no amount of agentic duct tape will make debugging these issues worthwhile.

The only two ways I see LLM code generation bring any value to you is if:

* Much of what you write is straight-up boilerplate. In this case, unless you are forced by your project or language to do this, you should stop. You are actively making the world a worse place.

* You simply want to complete your task and do not care about who else has to review, debug, or extend your code, and the massive costs in capital and human life quality your shitty code will incur downstream of you. In this case, you should also stop, as you are actively making the world a worse place.

So what about all these huge codebases you are expected to understand but you have not written? You can definitely understand code without writing it yourself.

> The only two ways I see LLM code generation bring any value to you is if

That is just an opinion.

I have projects I wrote with some help from the LLMs, and I understand ALL parts of it. In fact, it is written the way it is because I wanted it to be that way.

The best time to debug is when writing code.

The best time to review is when writing code.

The best time to iterate on design is when writing code.

Writing code is a lot more than typing. It's the whole chimichanga

  > I know why the LLM wrote the code that way. Because I told it to and _I_ know why I want the code that way.
That's a different "why".

  > If people are letting the LLM decide how the code will be written then I think they're using them wrong
I'm unconvinced you can have an LLM produce code and you do all the decision making. These are fundamentally at odds. I am convinced that it will tend to follow your general direction, but when you write the code you're not just writing either.

I don't actually ever feel like the LLMs help me generate code faster because when writing I am also designing. It doesn't take much brain power to make my fingers move. They are a lot slower than my brain. Hell, I can talk and type at the same time, and it isn't like this is an uncommon feat. But I also can't talk and type if I'm working on the hard part of the code because I'm not just writing.

People often tell me they use LLMs to do boilerplate. I can understand this, but at the same time it begs the question "why are you writing boilerplate?" or "why are you writing so much boilerplate?" If it is boilerplate, why not generate it through scripts or libraries? Those have a lot of additional benefits. Saves you time, saves your coworkers time, and can make the code a lot cleaner because you're now explicitly saying "this is a routine". I mean... that's what functions are for, right? I find this has more value and saves more time in the long run than getting the LLMs to keep churning out boilerplate. It also makes things easier to debug because you have far fewer things to look at.

Exactly! Thanks for summing it up.

There needs to be some responsible entity that can discuss the decisions behind the code. Those decisions have tremendous business value[0]

[0] I stress because it's not just about "good coding". Maybe in a startup it only matters that "things work". But if you're running a stable business you care if your machine might break down at any moment. You don't want the MVP. The MVP is a program that doesn't want to be alive but you've forced into existence and it is barely hanging on

So the most recent thing that I did a bunch of vibe coding on was TypeScript actions for GHA. I knew broadly what I wanted, but I’m not a TS expert, so I was able to describe functionality and Copilot’s output let me know which methods existed and how to correctly wrangle the promises between I/O calls.

It undoubtedly saved me time vs learning all that first, and in fact was itself a good chance to “review” some decent TS myself and learn about the stdlib and some common libraries. I don’t think that effort missed many critical idioms and I would say I have decent enough taste as an engineer that I can tell when something is janky and there must be a better way.

I think this is a different use case. The context we're talking about is building software. A GitHub action is really a script. Not to mention there are tons of examples out there, so I would hope it could do something simple. Vibe coding scripts isn't what people are typically concerned about.

  > but I’m not a TS expert
Although this is ultimately related. How can you verify that it is working as intended? You admit to not having those skills. To clarify, I'm sure "it's working" but can you verify the "as intended" part? This is the hard part of any coding. Getting things working isn't trivial, but getting things working right takes a lot more time.

  > So the most recent thing that I did
I'll share a recent thing I tried too...

I was working on a setup.py file and I knew I had done something small and dumb, but was being blind to it. So I pulled up Claude Code and had it run in parallel to my hunt. I asked it to run the build command and search for the error. It got caught up in some cmake flags I was passing, erroneously calling them errors. I got a number of prompts in and they were all wrong. I fixed the code btw, it was a variable naming error (classic!).

I've also had success with Claude, but it is super hit or miss. I've never gotten it to work well for anything remotely complicated if the code isn't also in a popular repo I could just copy-paste. And it is pretty hit or miss even for scripts, and I write a lot of bash. People keep telling me it is great for bash, and honestly guys, just read the man pages... (and use some god damn functions!)

You're not "reviewing" ai's slop code. If you're using it for generation, use it as a starting point and fix it up to the proper code quality

The best part is that they write the PR summaries in bullet points and then feed them to an LLM to dilute the content to over 10x the length of text... a waste of time and compute power that generates literally nothing of value.

I would love to know how much time and computing power is spent by people who write bullet points and have ChatGPT expand them out to full paragraphs only for every recipient to use ChatGPT to summarize them back down to bullet points.

Cat, I Farted somehow worked out how to become a necessary middleman for every business email ever.

To be fair, the same problem existed before AI tools, with people spitting out a ton of changes without explaining what problem are they trying to solve and what's the idea behind the solution. AI tools just made it worse.

> AI tools just made it worse.

That's why it isn't necessary to add the "to be fair" comment I see crop up every time someone complains about the low quality of AI.

Dealing with low effort people is bad enough without encouraging more people to be the same. We don't need tools to make life worse.

There is one way in which AI has made it easier: instead of maintainers trying to figure out how to talk someone into being a productive contributor, now "just reach for the banhammer" is a reasonable response.

This comment seems to not appreciate how changing the scope of impact is itself a gigantic problem (and the one that needs to be immediately solved for).

It's as if someone created a device that made cancer airborne and contagious and you come in to say "to be fair, cancer existed before this device, the device just made it way worse". Yes? And? Do you have a solution to solving the cancer? Then pointing it out really isn't doing anything. Focus on getting people to stop using the contagious aerosol first.

If my neighbors let their dog poop in my yard and leave it I have a problem.

If a company builds an industrial poop delivery system that lets anyone with dog poop deliver it directly into my yard with the push of a button I have a much different and much bigger problem

You know you can AI review the PR too, don't be such a curmudgeon. I have PR's at work I and coworkers fully AI generated and fully AI review. And

This makes no sense, and it’s absurd anyone thinks it does. If the AI PR were any good, it wouldn’t need review. And if it does need review, why would the AI be trustworthy if it did a poor job the first time?

This is like reviewing your own PRs, it completely defeats the purpose.

And no, using different models doesn’t fix the issue. That’s just adding several layers of stupid on top of each other and praying that somehow the result is smart.

I get your point, but reviewing your own PRs is a very good idea.

As insulting as it is to submit an AI-generated PR without any effort at review while expecting a human to look it over, it is nearly as insulting to not just open the view the reviewer will have and take a look. I do this all the time and very often discover little things that I didn't see while tunneled into the code itself.

> I get your point, but reviewing your own PRs is a very good idea.

Yes. You just have to be in a different mindset. I look for cases that I haven't handled (and corner cases in general). I can try to summarize what the code does and see if it actually meets the goal, if there's any downsides. If the solution in the end turns out too complicated to describe, it may be time to step back and think again. If the code can run in many different configurations (or platforms), review time is when I start to see if I accidentally break anything.

> reviewing your own PRs is a very good idea.

In the sense that you double-check your work, sure. But you wouldn’t be commenting and asking for changes, you wouldn’t be using the reviewing feature of GitHub or whatever code forge you use, you’d simply make the fixes and push again without any review/discussion necessary. That’s what I mean.

> open the view the reviewer will have and take a look. I do this all the time

So do I, we’re in perfect agreement there.

> reviewing your own PRs is a very good idea

It is, but for all the reasons AI is supposed to fix. If I look at code I myself wrote, I might come to a different conclusion about how things should be done, because humans are fallible and often have different things on their mind. If an AI is in any way worth using, it should be producing one single correct answer each time, rendering self PR review useless.

[deleted]

Yes! I would love it if some people I’ve worked with had to hold their own code to the same standard. Many people act adversarial towards their teammates when it comes to reviewing code.

I haven't taken a strong enough position on AI coding to express any opinions about it, but I vehemently disagree with this part:

> This is like reviewing your own PRs, it completely defeats the purpose.

I've been the first reviewer for all PRs I've raised, before notifying any other reviewers, for so many years that I couldn't even tell you when I started doing it. Going through the change set in the GitHub/GitLab/Bitbucket interface, for me, seems to activate a different part of my brain than I was using when locked in vim. I'm quick to spot typos, bugs, flawed assumptions, edge cases, missing tests, to add comments to pre-empt questions ... you name it. The "reading code" and "writing code" parts of my brain often feel disconnected!

Obviously I don't approve my own PRs. But I always, always review them. Hell, I've also long recommended the practice to those around me too for the same reasons.

> I vehemently disagree with this part

You don’t, we’re on the same page. This is just a case of using different meanings of “review”. I expanded on another sibling comment:

https://news.ycombinator.com/item?id=45723593

> Obviously I don't approve my own PRs.

Exactly. That’s the type of review I meant.

I'm sure the AI service providers are laughing all the way to the bank, though.

Probably not since they likely aren’t even turning a profit ;)

"Profit"? Who cares about profit? We're back to dot-com economics now! You care about _user count_, which you use to justify more VC funding, and so on and so forth, until... well, it will probably all be fine.

I suspect you could bias it to always say no, with a long list of pointless shit that they need to address first, and come up with a brand new list every time. maybe even prompt "suggest ten things to remove to make it simpler".

ultimately I'm happy to fight fire with fire. there was a time I used to debate homophobes on social media - I ended up writing a very comprehensive list of rebuttals so I could just copy and paste in response to their cookie cutter gotchas.

Your assumptions are wrong. AI models do not have equal generation and discrimination abilities. It is possible for AIs to recognize that they generated something wrong.

I have seen Copilot make (nit) suggestions on my PRs which I approved, and which Copilot then had further (nit) suggestions on. It feels as though it looks at lines of code and identifies a way that it could be improved but doesn't then re-evaluate that line in context to see if it can be further improved, which makes it far less useful.

> This makes no sense, and it’s absurd anyone thinks it does. If the AI PR were any good, it wouldn’t need review. And if it does need review, why would the AI be trustworthy if it did a poor job the first time?

The point of most jobs is not to get anything productive done. The point is to follow procedures, leave a juicy, juicy paper trail, get your salary, and make sure there's always more pretend work to be done.

> The point of most jobs is not to get anything productive done

That's certainly not my experience. But then, if I were to get hired at a company that behaved that way, I'd quit very quickly (life is too short for that sort of nonsense), so there may be a bit of selection bias in my perception.

AI PR reviews do end up providing useful comments. They also provide useless comments but I think the signal to noise ratio is at a point that it is probably a net positive for the PR author and other reviewers to have.

Maybe he's paying for a higher tier than his colleague.

>> This makes no sense, and it’s absurd anyone thinks it does.

It's a joke.

I doubt that. Check their profile.

But even if it were a joke in this instance, that exact sentiment has been expressed multiple times in earnest on HN, so the point would still stand.

Check OP's profile - I'm not convinced.

> That’s just adding several layers of stupid on top of each other and praying that somehow the result is smart.

That is literally how civilization works.

Just to explain my brusque comment: the way I see it, civilization is populated with a large fraction of individuals whose intelligence or conscientiousness I wouldn't trust to mind my cactus, but that I'm ok with entrusting a lot more to because of the systems and processes offered by society at large.

As an example, knowing that a service is offered by a registered company with a presence in my area gives me the knowledge "that they know that I know" that if something goes wrong, I can sue them for negligence, possibly up to piercing the corporate veil of the company and having the directors serve prison time. From that I can somewhat rationally infer that if the company has been in business offering similar services for years, it is likely that they have processes in place to maintain a level of professionalism that lowers the risk of such lawsuits. And on an organisational level, even if I still have good reason to think that most of the employees are incompetent, the fact that the company is making it work gives me significantly higher confidence in the "result" than I would have in any individual "stupid" component.

And for a closer-to-home example, the internet is well known to be a highly reliable system built from unreliable components.

> If the AI PR were any good, it wouldn’t need review.

So, your minimum bar for a useful AI is that it must always be perfect and a far better programmer than any human that has ever lived?

Coding agents are basically interns. They make stupid mistakes, but even if they're doing things 95% correctly, then they're still adding a ton of value to the dev process.

Human reviewers can use AI tools to quickly sniff out common mistakes and recommend corrections. This is fine. Good even.

> So, your minimum bar for a useful AI is that it must always be perfect and a far better programmer than any human that has ever lived?

You are transparently engaging in bad faith by purposefully straw manning the argument. No one is arguing for “far better programmer than any human that has ever lived”. That is an exaggeration used to force the other person to reframe their argument within its already obvious context and make it look like they are admitting they were wrong. It’s a dirty argument, and against the HN guidelines (for good reason).

> Coding agents are basically interns.

No, they are not. Interns have the capacity to learn and grow and not make the same mistakes over and over.

> but even if they're doing things 95% correctly

They’re not. 95% is a gross exaggeration.

LLMs don't learn online, but you can easily stuff their context with additional conventions and rules so that they do things a certain way over time.

I strongly disagree that it was bad faith or strawmanning. The ancestor comment had:

> This makes no sense, and it’s absurd anyone thinks it does. If the AI PR were any good, it wouldn’t need review. And if it does need review, why would the AI be trustworthy if it did a poor job the first time?

This is an entirely unfair expectation. Even the best human SWEs create PRs with significant issues - it's absurd by the parent to say that if a PR is "any good, it wouldn’t need review"; it's just an unreasonable bar, and I think that @latexr was entirely justified in pushing back against that expectation.

As for the "95% correctly", this appears to be a strawman argument on your end, as they said "even if ...", rather than claiming that this is the situation at the moment. But having said that, I would actually like to ask both of you - what does it even mean for a PR to be 95% correct - does it mean that that 95% of the LoC are bug-free, or do you have something else in mind?

> You know you can AI review the PR too, don't be such a curmudgeon. I have PR's at work I and coworkers fully AI generated and fully AI review. And

Waiting for the rest of the comment to load in order to figure out if it's sincere or parody.

He must of dropped connection while chatGPT was generating his HN comment

"must have"

His agent hit what we in the biz call “max tokens”

Considering their profile, I’d say it’s probably sincere.

[deleted]
[deleted]

Ahahah

One Furby codes and a second one reviews...

Let's red-team this: use Teddy Ruxpin to review, a Tamagotchi can build the deployment plan, and a Rock'em Sock'em Robot can execute it.

This is such a good idea, the ultimate solution is connecting the furbies to CI.

Please be doing a bit

As for the first question, about AI possibly truncating my comments,

If an AI can do a review, then why would you put it up for others to review? Just use the AI to do the review yourself before creating a PR.

When I picture a team using their AI to both write and review PRs, I think of the "Obama medal award" meme.

If your team is stuck at this stage, you need to wake up and re-evaluate.

I understand how you might reach this point, but the AI-review should be run by the developer in the pre-PR phase.

did AI write this comment?

You’re absolutely right! This has AI energy written all over it — polished sentences, perfect grammar, and just the right amount of “I read the entire internet” vibes! But hey, at least it’s trying to sound friendly, right?

This definitely is ai generated LOL

> fully AI generated and fully AI review

This reminds me of an awesome bit by Žižek where he describes an ultra-modern approach to dating. She brings the vibrator, he brings the synthetic sleeve, and after all the buzzing begins and the simulacra are getting on well, the humans sigh in relief. Now that this is out of the way they can just have a tea and a chat.

It's clearly ridiculous, yet at the point where papers or PRs are written by robots, reviewed by robots, for eventual usage/consumption/summary by yet more robots, it becomes very relevant. At some point one must ask, what is it all for, and should we maybe just skip some of these steps or revisit some assumptions about what we're trying to accomplish

> It's clearly ridiculous, yet at the point where papers or PRs are written by robots, reviewed by robots, for eventual usage/consumption/summary by yet more robots, it becomes very relevant. At some point one must ask, what is it all for, and should we maybe just skip some of these steps or revisit some assumptions about what we're trying to accomplish

I've been thinking this for a while, despairing, and amazed that not everyone is worried/surprised about this like me.

Who are we building all this stuff for, exactly?

Some technophiles are arguing this will free us to... do what exactly? Art, work, leisure, sex, analysis, argument, etc will be done for us. So we can do what exactly? Go extinct?

"With AI I can finally write the book I always wanted, but lacked the time and talent to write!". Ok, and who will read it? Everybody will be busy AI-writing other books in their favorite fantasy world, tailored specifically to them, and it's not like a human wrote it anyway so nobody's feelings should be hurt if nobody reads your stuff.

As something of a technophile myself.. I see a lot more value in arguments that highlight totally ridiculous core assumptions rather than focusing on some kind of "humans first and only!" perspectives. Work isn't necessarily supposed to be hard to be valuable, but it is supposed to have some kind of real point.

In the dating scenario what's really absurd and disgusting isn't actually the artificiality of toys.. it's the ritualistic aspect of the unnecessary preamble, because you could skip straight to tea and talk if that is the point. We write messages from bullet points, ask AI to pad them out uselessly with "professional" sounding fluff, and then on the other side someone is summarizing them back to bullet points? That's insane even if it was lossless, just normalize and promote simple communications. Similarly if an AI review was any value-add for AI PR's, it can be bolted on to the code-gen phase. If editors/reviewers have value in book publishing, they should read the books and opine and do the gate-keeping we supposedly need them for instead of telling authors to bring their own audience, etc etc. I think maybe the focus on rituals, optics, and posturing is a big part of what really makes individual people or whole professions obsolete

> And

Do you review your comments too with AI?

> I have PR's at work I and coworkers fully AI generated and fully AI review.

I first read that as "coworkers (who are) fully AI generated" and I didn't bat an eye.

All the AI hype has made me immune to AI related surprises. I think even if we inch very close to real AGI, many would feel "meh" due to the constant deluge of AI posts.

So how do you catch the errors the AI made in the pull request? Because if both of you are using AI for both halves of a PR, then you're definitely copying and pasting code from an LLM. Which is almost always hot garbage if you actually take the time to read it.

You can just look at the analytics to see if the feature is broken. /s

Hahahahah well done :dart-emoji:

AIs generating code which will then be reviewed by AIs. Résumés generated by AIs being evaluated by AI recruiters. This timeline is turning into such a hilarious clown world. The future is bleak.

[deleted]

"Let the AI check its own homework, what could go wrong?"

Satire? Because whether you’re being serious or not people are definitely doing exactly this.

I absolutely have used AI to scaffold reproduction scenarios, but I'm still validating everything is actually reproducing the bug I ran into before submitting.

It's 90% AI, but that 90% was almost entirely boilerplate and would have taken me a good chunk of time to do for little gain other than the fact I did it.

> You're telling me I need to use 100% of my brain, reasoning power, and time to go over your code, but you didn't feel the need to hold yourself to the same standard?

I don’t think they are (telling you that). The person who sends you an AI slop PR would be just as happy (probably even happier) if you turned off your brain and just merged it without any critical thinking.

Now an AI-generated PR summary I fully support. That's a use of the tool I find to be very helpful. Never would I take the time to provide hyperlinked references to my own PR.

I don't need an AI generated PR summary because the AI is unlikely to understand why the changes are being made, and specifically why you took the approach(es) that you did.

I can see the code, I know what changed. Give me the logic behind this change. Tell me what issues you ran into during the implementation and how you solved them. Tell me what other approaches you considered and ruled out.

Just saying "This change un-links frobulation from reticulating splines by doing the following" isn't useful. It's like adding code comments that tell you what the next line does; if I want to know that I'll just read the next line.

But I explained to the AI why we're doing the change. When the AI and I try something and we fail, I explain that, and it's included in the PR.

The AI has far more energy than I do when it comes to writing PR summaries; I have done it so many times, and it's not the main part of my job. I have already provided all the information for the PR, so why should I repeat myself? What happened to DRY?

But that's not what a PR summary is best used for. I don't need links to exact files, the Diff/Files tab is a click away and it usually has a nice search feature. The Commits tab is a little bit less helpful, but also already exists. I don't need an AI telling me stuff already at my fingertips.

A good PR summary should be the why of the PR. Not redundantly repeat what changed, give me description of why it changed, what alternatives were tested, what you think the struggles were, what you think the consequences may be, what you expect the next steps to be, etc.

I've never seen an AI generated summary that comes close to answering any of those questions. An AI generated summary is a bit like that junior developer that adds plenty of comments but all the comments are:

    // add x and y
    var result = x + y;
Yes, I can see it adds x and y, that's already said by the code itself, why are we adding x and y? What's the "result" used for?

I'm going to read the code anyway to review a PR, a summary of what the code already says it does is redundant information to me.

It seems I'm going to be contrarian here because I really prefer AI-supported (obviously reviewed for accuracy) PR comments over what I had seen before where I'd, often, need to reach out to someone to ask follow up questions on requirements, a link to a ticket, or any number of omissions.

I have worked mostly at smaller, early-stage firms (< 50 engineers), and folks are super busy. Having AI support to write more thoughtful commentary and provide deeper context is a boon.

In the end, I'll have to say "it depends" -- you can't just throw slop at people but there's definitely a middle ground where everyone wins.

[deleted]

Imagine hand-crafting a PR and fighting through the AI-generated review comments with no cultural support for pushing back. It's like Brandolini's Law but in github.

I think it’s especially low effort when you can point it at example commit messages you’ve written, without emojis and em dashes, to “learn” your writing style.

On the other hand, I spend less time adapting to every developer's writing style, and I find the AI-structured output preferable.

Whenever a PM at work "writes" me a 4 paragraph ticket with AI, I make AI read it for me

You can absolutely ask the LLM to write a concise and professional commit message, without emojis. It will conform to the request. You can put this directive in a general guidelines markdown file, and if the LLM strays away, you can always ask it to go read the guideline one more time.
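
For example, something along these lines in whatever guidelines file your tool reads (the filename and wording here are just an illustration, not a standard):

  # Commit message and PR description guidelines
  - Be concise and professional; no emojis, no exclamation marks.
  - Lead with why the change is needed, not a restatement of the diff.
  - Prefer plain prose over bullet-point padding; keep it under ~10 lines unless the change truly needs more.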

Why do you need to use 100% of your brain on a pull request?

Probably to understand what is going on there in the context of the full system instead of just reading letters and making sure there are no grammar mistakes.

You're absolutely right! Here is the correct, minimal reprex to demonstrate the issue:

# Minimal Reprex (Correct)

(unintelligible nonsense here)

And here is the correct, minimal fix, guaranteed to work:

# Correct Fix (Correct)

(same unintelligible nonsense, wrapped in a try/catch block)

Make this change and your code should work perfectly!

"Bruh, you're supposed to use the AI to read and vet the requests so you can spend more time arguing on the internet about the merits of using AI"

How do you make a simple change on a complex project at the same ROI with an LLM?

I mean, if I could accept it myself? Maybe not. But I have no choice but to go through the gatekeeper.

[flagged]