Another one from my personal experience: apply DRY principles (don't repeat yourself) the third time you need something. Or in other words: you're allowed to copy-and-paste the same piece of code in two different places.
Far too often we generalise a piece of logic that we need in one or two places, making things more complicated for ourselves whenever they inevitably start to differ. And chances are very slim we will actually need it more than twice.
Premature generalisation is the most common mistake that separates a junior developer from an experienced one.
The rule of 3 is awful because it focuses on the wrong thing. If two instances of the same logic represent the same concept, they should be shared. If 10 instances of the same logic represent unrelated concepts, they should be duplicated.
The goal is to have code that corresponds to a coherent conceptual model for whatever you are doing, and the resulting codebase should clearly reflect the design of the system. Once I started thinking about code in these terms, I realized that questions like "DRY vs YAGNI" were not meaningful.
Of course, the rule of 3 is saying that you often _can't tell_ what the shared concept between different instances is until you have at least 3 examples.
It's not about copying identical code twice, it's about refactoring similar code into a shared function once you have enough examples to be able to see what the shared core is.
But don’t let the rule of 3 be an excuse for you to not critically assess the abstract concepts that your program is operating upon and within.
I too often see junior engineers (and senior data scientists…) write code procedurally, with giant functions and many, many if statements, presumably because in their brain they’re thinking about “1st I do this if this, 2nd I do that if that, etc”.
3 just seems arbitrary in practice though. In my job we share code when it makes sense and don’t when it doesn’t, and that serves us just fine.
I agree. And I think this also distills down to Rob Pike’s rule 5, or something quite like it. If your design prioritizes modeling the domain’s data, shaping algorithms around that model, it’s usually trivial to determine how likely some “duplication” is operating on shared concepts, versus merely following a similar pattern. It may even help you refine the data model itself when confronted with the question.
Agreed. DRY is a compression algorithm. The rule of 3 is a bad compression algorithm. Good abstraction is not compression at all.
The devil’s in the details, as usual. No rule should be followed to the letter, which is what the top comment was initially complaining about.
Yet again, understanding when to follow a rule of thumb or not is another thing that separates the junior from the senior.
> If two instances of the same logic represent the same concept, they should be shared. If 10 instances of the same logic represent unrelated concepts, they should be duplicated.
Exactly.
I think we should not even generalize it down to a rule of three, because then you're outsourcing your critical thinking to a rule rather than doing the thinking yourself.
Instead, I tend to ask: if I change this code here, will I always also need to change it over there?
Copy-paste is good as long as I'm just repeating patterns. A for loop is a pattern. I use for loops in many places. That doesn't mean I need to somehow abstract out for loops because I'm repeating myself.
But if I have logic that says that button_b.x = button_a.x + button_a.w + padding, then I should make sure that I only write that information down once, so that it stays consistent throughout the program.
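To make that concrete, here is a minimal Python sketch. The `Button` class, the `place_right_of` helper and the `PADDING` value are all made up for illustration and don't come from any real UI toolkit. The point is only that the spacing relationship is written down once and every caller goes through it.

```python
from dataclasses import dataclass

PADDING = 8  # assumed value, purely illustrative


@dataclass
class Button:
    x: int
    w: int


def place_right_of(anchor: Button, width: int, padding: int = PADDING) -> Button:
    """Return a button positioned directly to the right of `anchor`."""
    # The one place where the layout relationship is written down.
    return Button(x=anchor.x + anchor.w + padding, w=width)


button_a = Button(x=10, w=100)
button_b = place_right_of(button_a, width=80)
print(button_b.x)  # 118
```

If the spacing rule ever changes (say, to account for a margin), there is exactly one line to edit.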
The reason for the rule of thumb is that you don't know whether you will need to change this code here when you change it there until you've written several instances of the pattern. Oftentimes different generalizations become appropriate for N=1, N=2, N>=3 && N <= 10, N>=10 && N<=100, and N>=100.
Your example is a pretty good one. In most practical applications, you do not want to be setting button x coordinates manually. You want to use a layout manager, like CSS Flexbox or Jetpack Compose's Row or Java Swing's FlowLayout, which takes in a padding and a direction for a collection of elements and automatically figures out where they should be placed. But if you only have one button, this is overkill. If you only have two buttons, this is overkill. If you have 3 buttons, you should start to realize this is the pattern and reach for the right abstraction. If you get to 10 buttons, you'll realize that you need to arrange them in 2D as well and handle how they grow & shrink as you resize the window, and there's a good chance you need a more powerful abstraction.
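As a rough illustration of what the "3 buttons" abstraction might look like, here is a toy row layout in Python. `Box` and `layout_row` are hypothetical names, and this is not how Flexbox, Compose's Row or FlowLayout actually work internally. It only shows the shape of the idea: take a padding and a collection, and compute the positions.

```python
from dataclasses import dataclass


@dataclass
class Box:
    w: int
    x: int = 0


def layout_row(boxes: list[Box], start_x: int = 0, padding: int = 8) -> list[Box]:
    """Assign x positions left to right, separated by `padding`."""
    x = start_x
    for box in boxes:
        box.x = x
        x += box.w + padding
    return boxes


row = layout_row([Box(w=100), Box(w=80), Box(w=60)], start_x=10)
print([b.x for b in row])  # [10, 118, 206]
```

Note that this sketch already runs out of steam at the "10 buttons" stage: it has no notion of wrapping, a second dimension, or resizing.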
> Instead, I tend to ask: if I change this code here, will I always also need to change it over there?
IMO, this is the exact (and arguably only) question to ask.
Critical thinkers understand that rules aren't written for critical thinkers; that they are written for beginners who don't yet have the necessary experience to be able to think critically.
IMO, the right way to think about DRY is to consider why a given piece of code would ever change.
If you have two copies of some piece of code, and you can reasonably say that if you ever want to update one copy then you will almost certainly want to update the other copy as well, then it's probably a good idea to try to merge them and keep that logic in some centralized place.
On the other hand, if you have three copies of the same piece of code, but they kind of just "happen to" be identical and it's completely plausible that any one of the copies will be modified in the future for reasons which won't affect the other copies, maybe keeping them separate is a good idea.
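A small Python sketch of the two cases, with invented function names and an invented tax rate. The first pair is identical today only by coincidence, so it stays duplicated. The second pair shares a reason to change, so the rule lives in one place.

```python
# Coincidental duplication: identical today, but each rule belongs to a
# different concept and may change for unrelated reasons. Kept separate.
def is_valid_username(s: str) -> bool:
    return 3 <= len(s) <= 20 and s.isalnum()


def is_valid_coupon_code(s: str) -> bool:
    return 3 <= len(s) <= 20 and s.isalnum()


# Shared reason to change: both totals must always apply the same tax rule,
# so that rule is written down once.
TAX_PERCENT = 20  # assumed rate, purely illustrative


def with_tax(net_cents: int) -> int:
    # Integer arithmetic keeps the result exact.
    return net_cents * (100 + TAX_PERCENT) // 100


def invoice_total(net_cents: int) -> int:
    return with_tax(net_cents)


def receipt_total(net_cents: int) -> int:
    return with_tax(net_cents)


print(invoice_total(1000), receipt_total(1000))  # 1200 1200
```

If marketing later decides coupon codes may contain dashes, usernames are unaffected. If the tax rate changes, invoices and receipts cannot drift apart.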
And of course, it's sometimes worth it to keep two or more different copies which do share the same "reason to change". This is especially clear when you have the copies in different repositories, where making the code "DRY" would mean introducing dependencies between repositories which has its own costs.
I really like Casey Muratori's "[Semantic] Compression-oriented programming" - which is the philosophical backing of "WET" (Write Everything Twice) counterpart to DRY.
https://caseymuratori.com/blog_0015
It’s not how many times, it’s what you do about it. DRY doesn’t mean you have to make abstractions for everything. It means you don’t repeat yourself. That is, if two pieces of code are identical, chances are one of them shouldn’t exist. There are a lot of simple ways you might be able to address that, starting from the most obvious one, which is to just literally delete one of them. Abstraction should be about the last tool you reach for, but for most people it’s unfortunately the first.
This is so true. I have been burned by this more times than I can count. You see two functions that look similar, you extract a shared utility, and then six months later one of them needs a slightly different behavior and now you are fighting your own abstraction instead of just changing one line in a copy. The rule of three is a good default. Let the pattern prove itself before you try to generalize it.
I had a situation where we needed to implement a protocol. The spec was fairly decent, but the public implementations of the other end were slightly non-compliant, which necessitated special casing. Plus multiple versions etc.
An expensive consultant suggested creating a pristine implementation, then writing a rule layer that would modify things as needed, and deploying the whole thing as a pile of lambda functions.
I copy-pasted the protocol consumer file per producer and made all the necessary changes, with proper documentation and mocks. Got it working quickly, and we could add new ones without affecting the existing ones.
If I'd tried to keep it DRY, I think it would have been a leaky mess.
The instances should be based on the context. For example we had a few different API providers for the same thing, and someone refactored the separate classes into a single one that treats all of the APIs the same.
Well, turns out that 3 of the APIs changed the way they return the data, so instead of separating the logic, someone kept adding a bunch of if statements into a single function in order to avoid repeating the code in multiple places. It was a nightmare to maintain and I ended up completely refactoring it, and even though some of the code was repeated, it was much easier to maintain and to adapt to the API changes.
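Roughly what the "separate logic per provider" shape can look like, as a Python sketch. The providers, payload formats and function names are all invented for illustration and are not the actual code from that project. Each provider gets its own small parser, and a lookup table replaces the growing if chain.

```python
# Hypothetical payload shapes for two hypothetical providers.
def parse_provider_a(payload: dict) -> float:
    # Provider A returns the price as a top-level string or number.
    return float(payload["price"])


def parse_provider_b(payload: dict) -> float:
    # Provider B nests the price and reports it in cents.
    return payload["data"]["price_cents"] / 100


# One entry per provider; adding a provider does not touch the others.
PARSERS = {
    "a": parse_provider_a,
    "b": parse_provider_b,
}


def get_price(provider: str, payload: dict) -> float:
    return PARSERS[provider](payload)


print(get_price("a", {"price": "12.50"}))               # 12.5
print(get_price("b", {"data": {"price_cents": 1250}}))  # 12.5
```

When provider B changes its response format, only `parse_provider_b` changes, and there is no shared function whose branches have to be re-tested for everyone else.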
I think this is a reasonable rule of thumb, but there are also times that the code you are about to write a second time is extremely portable and can easily be made reusable (say less than 5 minutes of extra time to make the abstraction). In these cases I think it's worth it to go ahead and do it.
Having identical logic in multiple places (even only 2) is a big contributor to technical debt, since when you search for a bug, find it, and fix it /once/, you tend to think of the job as done. Staying DRY avoids the "there is still a bug and I already fixed that" confusion.
The D stands for "dependency", the R stands for "regret" and I'm not sure what the Y stands for yet.
Yelling... it stands for yelling...
Mostly at the massive switch statements and 1000 lines of flow control logic that, in the worst cases, end up embedded someplace where they really don't belong.
You say that, but I've created plenty of production bugs because two different implementations diverge. Easier to avoid such bugs if we just share the implementation.
I've also seen a lot of production bugs because two things that appeared to be a copy/paste were actually conceptually different. Making them common made the whole thing much more complex, because the common code had to handle cases that diverged even though they started from the same place.
DRY follows WET (Write Everything Twice).
Agreed, I think even Carmack advocates this rule
My rule of thumb is “when I have to make changes to this later, how annoying is it going to be to make the same change in multiple places?”
Sometimes four or five doesn’t seem too bad, sometimes two is too many
“Once, twice, automate/abstract” is a good general rule but you have to understand that the thing you’re counting isn’t appearances in the source code, it’s repetitions of the same logic in the same context. It’s gotta mean the same, not just look the same.
More critical in my mind is investigating the "inevitably start to differ" option.
If two pieces of code use the same functionality by coincidence but could possibly evolve differently then don't refactor. Don't even refactor if this happens three, four, or five times. Because even if the code may be identical today the features are not actually identical.
But if you have two uses of code that are actually semantically identical and will assuredly evolve together, then go ahead and refactor to remove duplication.
Depends on length and complexity, imho. If it's more than a line or two of procedure? Or involves anything counterintuitive? DRY at 2.
Extract a method or object if it's something that feels conceptually like a "thing", even if it has only one use. Most tools to DRY your code also help by providing a bit of encapsulation, which does a great job of tidying things up and forcing you to think about "should I be letting this out-of-domain stuff leak in here?"
As always, xkcd:
https://xkcd.com/1205/
https://xkcd.com/974/
Ehh, people who are really excited about DRY write unreadable convoluted code, where the bulk of the code is abstractions invented to avoid rewriting a small amount of code and unless you're very familiar with the codebase reasoning about what it actually does is a mystery because related pieces of functionality are very far away from each other.
DRY is not to avoid writing code (of any amount). DRY is a maintainability feature. "Unless you're very familiar with the code" you probably won't remember that you have to make this change in two places instead of one. DRY makes life easier for future you, and anyone else unfortunate to encounter (y)our mess.
You are confusing DRY done as intended vs what DRY looks like in the real world to many people.
Making maintainable code is a good goal.
DRY is one step removed from that goal and people use it to make very unmaintainable code because they confuse any repeated code with unmaintainability. (or their theory that some day we might want to repeat this code so we might as well pre-DRY it)
The result is often a horrendous complex mess. Imagine a cookbook with a cookie recipe that resided on 47 different pages (40 of which were pointers on where to find other pointers on where to find other pointers on where to find a step) in attempts to never write the same step twice in the whole book or your planned sequels in a 20 volume set.
It's almost like there's a "reasonable person" type of standard that's impossible to nail down in a general rule...
If you can describe a rule in one sentence it'll probably lead to as much trouble as it fixes.
The problem is zealots. Zealotry doesn't work for indeterminate things that require judgement like "code quality" or "maintainability", but a simple rule like "don't repeat yourself" is easy for a zealot. They take a rule and shut down any argument with "because the rule!"
If you're arguing about code quality and maintainability without one sentence rules then you actually have to make arguments. If the rule is your argument there's no discussion only dogma.
As a result? Easy to distill rules spread fast, breed zealots, and result in bad code.