> It sucks if you bisect and find the change happened in some enormous incohesive commit.

But why are any PRs like this? Each PR should represent an atomic action against the codebase - implementing feature 1234, fixing bug 4567. The project's changelog should only be updated at the end of each PR. The fact that I went down the wrong path three times doesn't need to be documented.

> Each PR should represent an atomic action against the codebase

We can bikeshed about this for days. Not every feature can be implemented in an atomic way.

That's true: some are big and messy, or the change has to be spread across a couple of PRs. But I don't think that the answer to "some PRs are messy" is "let's include all the mess". I don't think the job is made easier by having to dig through a half dozen messy commits to find where the bug is, as opposed to one or two large ones.

> I don't think that the answer to "some PRs are messy" is "let's include all the mess"

Hey, look at us, two like-minded people! I never said "let's include all the mess".

Looking at the other extreme, someone in this thread said they didn't want other people to see the three attempts it took to get it right. Sure, if it's just a mess (or, since this is 2025, AI slop), squash it away. But in some situations you want to keep a history of the failed attempts. Maybe one of them was actually the better solution and you were just short of making it work, or maybe someone in the future will be able to see that method X didn't work and won't have to find that out for themselves.

I can see the intent, but how often do people look through commit history to learn anything besides "when did this break and why"? If you want lessons learned, put them in a wiki or a special branch.

Main should be a clear, concise log of changes. It's already hard enough to parse code, and it's made even harder when you also have to parse versions of it throughout the history. We should try to minimize the cognitive load required to track something being added and then immediately removed, because there's going to be enough of that already in the finished merges.

> If you want lessons learned, put them in a wiki or a special branch.

You already have the information in a commit. Moving it to another place like a wiki or a markdown file is extra work, and it's lossy. If you create branches to archive history, you end up with branches that stick around indefinitely, which I think most would feel is worse.

> Main should be a clear, concise log of changes.

No, that's what a changelog is for.

You can already view a range of commits as one diff in git. You don't need to squash them in the history to do that.
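For instance (a minimal sketch; the branch name and commit hash are hypothetical):

```sh
# Show everything a feature branch changed relative to main, as one diff,
# without rewriting any history:
git diff main...feature/1234

# The same for a branch that was already merged: diff the merge commit
# against its first parent (abc123 stands in for the merge commit's hash):
git diff abc123^ abc123
```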

I am beginning to think that the people who advocate for squashing everything have `git commit` bound to ctrl+s and smash that every couple of minutes with an auto-generated commit message. The characterization that commits are necessarily messy and need to be squashed so as to "minimize the cognitive load" is just not my experience.

Nobody who advocates for squashing even talks about how they reason about squashing the commit messages. It's like it doesn't come into their calculation at all. Why is that? My guess is that they don't write commit messages. And that's a big reason why they think that commits have high "cognitive load".

Some of my commit messages are longer than the code diffs. Other times, the code diffs are substantial and there is a paragraph or three explaining them in the commit message.

Having to squash commits with paragraphs of commit messages always loses resolution and specificity. It removes context and creates more work for me trying to figure out how to squash them in a way where the messages can still be understood once the squash has stripped that context. I don't know why you would do that to yourself.

If you have a totally different workflow where your commits are not deliberate, then maybe squashing every merge as a matter of policy makes sense there. But don't advocate that as a general rule for everyone.

Commits aren't necessarily messy, but they're also not necessarily supposed to be clean. There are clearly two different workflows here.

It seems some people treat every commit like it's its own little tidy PR, while others do not. For me, a commit is a place to save my WIP when I'm context switching, or to create a save point when I know my code works so that I can revert back to it if something goes awry during refactoring; it's a step on the way to completing my task. The PR is the final product to be reviewed; it's where you get the explanation. The commits are imperfect steps along the way.

For others, every commit is the equivalent of a PR. To me that doesn't make a lot of sense - now the PR isn't an (ideal-world) atomic update leading to a single goal, it's a digest of changes, some of which require paragraphs of explanation to understand the reasoning behind them. What happens if you realize that your last commit was the incorrect approach? Are you constantly rebasing? Is that the reader's problem? Sure, that happens with PRs as well, but again, that's the difference in process - raising a PR requires a much higher standard of completion than a commit.

You say "two different work flows here" and I think perhaps a better way of considering this is as having multiple _kinds_ of history.

Most of the time, I don't have a clean path through a piece of work such that I can split it out into beautifully concise commits with perfect commit messages. I have WIP commits, messy changes, bad variable names, mistakes, corrections, corrections to the corrections, things that I expect everyone does. I commit every one of them. This is my "private history" or my scratch work.

After I've gotten to the end and I'm satisfied that I've proven my change does what it's supposed to do (i.e., tests demonstrate the change), I can now think about how I would communicate that change to someone else.

When I'm in this stage, it sometimes leads to renaming things now that I'm focusing on communicating my intention. Then I figure out how to explain the end result in broad strokes, and subdivide where necessary.

From there, I build "public history" (leveraging all the git tricks). This yields pieces that are digestible and briefly illuminated with a commit message. Some pieces are easy to review at a glance; some take some focus; some are large; some are small.

But the key is that the public history is digestible. You can have large, simple changes (e.g., renamings, package changes) that, pulled out as separate commits, can be reviewed by inspection. You can have small changes that take focus to understand, but are easily called out for careful attention in a single commit (and divorced from other chaff).

By having these two sorts of history, I can develop _fearlessly_. I don't care what the history looks like as I'm doing it, because I have the power (through my use of git) to construct a beautiful exposition of development that is coherent.

A PR being messy and cluttered is a choice. History can be _easily_ clarified. Anyone who uses git effectively should be able to take a moment to present their work more like framed artwork, and not a finger-painted mess stuck to the refrigerator.
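For what it's worth, a minimal sketch of what that cleanup step can look like (branch names are hypothetical, and the exact mechanics are a matter of taste):

```sh
# Keep the scratch work around under another name, just in case.
git branch scratch/feature-1234

# Option 1: rewrite the branch into a readable sequence of commits,
# squashing fixups, reordering, and rewording messages as you go.
git rebase -i main

# Option 2: start from a blank slate and carve out commits hunk by hunk.
git reset main          # keep all the changes, unstage everything
git add --patch         # stage one coherent piece
git commit              # write its message; repeat until nothing is left
```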

> The commits are imperfect steps along the way.

The workflow you're describing here is fine for staging or stashing or commits that you don't intend to publish. I'll sometimes commit something unfinished, then make changes and either stash those if it doesn't work out, or commit with --amend (and sometimes --patch) to make a single commit that is clean and coherent. The commits aren't always small, but they're meaningful, and it's easier to do this along the way than at the very end, when it's not so easy to remember all the details from commits you made days ago.
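Concretely, that loop might look something like this (a sketch; the messages are made up):

```sh
# First, possibly unfinished, cut of the change:
git commit -m "Extract retry logic into a backoff helper"

# Keep working; fold only the hunks that belong to this change back into
# the same commit, reviewing the full diff and message while it's fresh:
git commit --amend --patch --verbose

# An experiment that didn't pan out goes to the stash instead:
git stash push -m "experiment: alternative backoff strategy"
```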

> It seems some people treat every commit like it's its own little tidy PR

Pull requests are not always "little". But I'm nitpicking.

It sounds like a big difference between the workflows is that you don't amend or stash anything locally along the way. I guess I find it easier and more valuable to edit and squash changes locally into commits before publishing them, instead of at the very end as a single pull request. For me, it's easy to document my process with good commit messages when everything is fresh in my mind.

The result is that some commits can be pretty big. Sometimes there is no good opportunity to break them down along the way; that is part of the job. But it also means these problems with a messy history shouldn't be there in the first place. The commits I write should be written with the intent of being meaningful for those who might read them, and it's far easier to do that along the way than at the end when it's being merged. So it's difficult to understand some of the complaints people make when they say it's confusing or messy or whatever.

That holds even if requirements change while the pull request is open and more commits are added in response. I've just never had a situation come up where I'm blaming code, looking back at the commit history, and struggling to understand what was going on in a way where it would have been clearer if it had been squashed, because the commits are already clean and, again, it's just easier to do this along the way.

But again, I use `git commit --verbose --amend --patch` liberally to publish commits intentionally. And that's why it feels like a bit of busywork and violence when people advocate for squashing those.

It is not a philosophical debate against auto-squash. It is just that auto-squash automatically deletes potentially useful data while providing zero benefit. Or what is the benefit?

1. The PR message doesn't contain _all_ the intent behind each change, while the commits did cover the intent behind each technical decision. Would you put everything in the PR message? Why? It would just be a misleading, messy comment spanning unrelated architectural components.

2. You branch from a WIP branch (why? sometimes you just do), and now when the original branch is squash-merged you can't rebase, because GitHub/GitLab has messed up the parents.

3. Interactive rebase before merging works fine for the wip-wip-wip style of coding. But honestly, that style happens on bad days, not on days where you are executing a plan.

4. Most importantly: you can still read the PR-level messages if you filter on merge commits (see the sketch below), but if you auto-squash you lose information.
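A minimal sketch of that last point:

```sh
# PR-level view: only the merge commits along main's first-parent line.
git log --merges --first-parent main

# Detail view: the individual commits, with their messages, when you need them.
git log --no-merges main

# After an auto-squash merge, only the single squashed commit remains;
# the detail view above has nothing left to show for that PR.
```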

I can see your point, and sometimes I myself include PoC code as a commented-out block that I clean up in a later PR in case it proves to be useful.

But the fact is, your complete PR commit history gives most people a headache, unless it's multiple important fixes bundled into one PR for convenience's sake, which for me happens very rarely. Important things should be documented in, say, a separate markdown file.

This simply isn’t true unless you have to put everything in one commit?

To be honest, I usually get this with people who have never realized that you can merge dead code (code that is never called). You can basically merge an entire feature this way, with the last PR “turning it on” or adding a feature flag — optionally removing the old code at this point as well.

So maintaining old and new code for X amounts of time? That sounds acceptable in some limited cases, and terrible in many others. If the code is being changed for another reason, or the new feature needs to update code used in many places, etc. It can be much more practical to just have a long-lived branch, merge changes from upstream yourself, and merge when it's ready.

My industry is also fairly strictly regulated and we plainly cannot do that even if we wanted to, but that's admittedly a niche case.

> So maintaining old and new code for X amounts of time?

No more than normal? Generally speaking, the author working on the feature is the only one who’s working on the new code, right? The whole team can see it, but generally isn’t using it.

> If the code is being changed for another reason, or the new feature needs to update code used in many places, etc. It can be much more practical to just have a long-lived branch, merge changes from upstream yourself, and merge when it's ready.

If you have people good at what they do ... maybe. I’ve seen this end very badly due to merge artefacts, so I wouldn’t recommend doing any merges, but rebasing instead. In any case, you can always copy a function into a new function, do_something_v2(). Then, after you remove the v1, drop the _v2 suffix. It isn’t rocket science.

> My industry is also fairly strictly regulated and we plainly cannot do that even if we wanted to, but that's admittedly a niche case.

I can’t think of any regulations in any country (and I know of a lot of them) that dictate how you do code changes. The only thing I can think of is your own company’s policies in relation to those regulations; in which case, you can change your own policies.

Medical industry: code that gets shipped has to be documented, even if it's not used. That doesn't mean we can't ship unused code, it just means it's generally a pretty bad idea to do so. Maybe the feature's requirements change during implementation, or you wanted to do a minor version release but that dead code is for a feature that needs to go into a major version (because of regulations).

> I can’t think of any regulations in any country (and I know of a lot of them) that dictate how you do code changes

https://blog.johner-institute.com/regulatory-affairs/design-...

That document doesn’t say that, as far as I can tell. If you’re using a compiled language, the dead code likely gets removed anyway; it is never shipped.

[deleted]

Our regulatory compliance regime hates it when we run non-main branches in production and specifically requires us to use feature flagging in order to delay rollouts of new code paths to higher-risk markets. YMMV.

> Each [X] should represent an atomic action against the codebase

That's called a commit. Not sure why some insist on replacing commits with a vendor-locked-in alternative with less tooling and calling it progress.

Yes, that would be ideal. But especially in a world where infrastructure is tied so closely to the application, this standard cannot always be met by many teams.

Yeah "should" is often not reality, BUT I'm arguing that not squashing doesn't make things better.