The biggest issue I have with premature optimization is stuff that really doesn't matter.
For example, in Java I usually use ConcurrentHashMap, even in contexts that a regular HashMap might be ok. My reasoning for this is simple: I might want to use it in a multithreaded context eventually and the performance differences really aren't that much for most things; uncontested locks in Java are nearly free.
I've gotten pull requests rejected because regular HashMaps are "faster", and then the comments on the PR ends up with people bickering about when to use it.
In that case, does it actually matter? Even if HashMap is technically "faster", it's not much faster, and maybe instead we should focus on the thing that's likely to actually make a noticeable difference like the forty extra separate blocking calls to PostgreSQL or web requests?
So that's the premature optimization that I think is evil. I think it's perfectly fine at the algorithm level to optimize early.
Ironically to your point, I think adding a ConcurrentHashMap because it might be multithreaded eventually IS premature optimisation.
The work can be done in future to migrate to using ConcurrentHashMap when the feature to add multithreading support is added. There's no sense to add groundwork for unplanned, unimplemented features - that is premature optimisation in a nutshell.
Typing ten extra characters is an investment that pays off vs the overhead of doing any piece of work in the future. There's no real downside here, it doesn't make the code more complex, it gives less surprising results.
I think you are conflating YAGNI and premature optimisation and neither apply in this case.
I would agree if I were adding a dependency to Gradle or something. In this case it’s just something that conforms to the same interface as the other Map and is built into every Java for the last 20+ years. The max amount of effort I am wasting here is the ten extra characters I used to type “Concurrent”.
My point was that even if it is not optimal, there’s really no utility in bickering about it because it doesn’t really change anything. The savings from changing it to regular hashmap will, generously, be on the order of nanoseconds in most cases, and so people getting their panties in a bunch over it seems like a waste of time.
The point of the premature saying is to avoid additional effort in the name of unnecessary optimization, not to avoid optimization itself. Using a thing that's already right there because it might be better in some cases and is no more effort than the alternatives is not a problem.
I can fully understand "bickering" about someone sprinkling their favourite data type over a codebase which consistently used a standard data type before. The argument that it might be multithreaded in the futurure does not hold if the rest of the code base clearly was not writen with that in mind. That could even be counter productive, should someone get the misguided idea that it was ready for it.
Make a (very) good argument, and suggest a realtistic path to change the whole codebase, but don't create inconsistency just because it is "better". It is not.
I don’t expose it at the boundaries, just within functions. With everything outside of the function I take a Map interface in and/or return the Map out.
It makes no difference to the outside code.
You are communicating with future readers of the code. The presence of ConcurrentHashMap will lead future engineers into believing the code is threadsafe. This isn't true, and believing it is dangerous.
No, they'll believe that that specific map is thread safe.
That's really not much better. No function is an island, as wise man almost once said.
It’s actually a lot better. That’s literally the whole point of interfaces and polymorphism: to make it so the outside does not care about the implementation.
I think you are in the wrong. But my reason is that when you support concurrency, every access and modification must be checked more carefully. By using a concurrent Map you are making me review the code as if it must support concurrency which is much harder. So I say don’t signal you want to support concurrency if you don’t need it.
1) You don't need ConcurrentHashMap to make code thread-safe. That's the most extreme version that means that you need a thread-safe iterator too.
2) Locks are cheap
3) I seriously doubt that the difference between a Map and a ConcurrentHashMap is measurable in your app
Which means that both, the comments on your PRs are irrelevant and you are still going too far in your thread-safety. So you are both wrong.
What you are right about is to focus on network calls.
Locks are cheap performance-wise (or at least they can be) but they’re easy to screw up and they can be difficult to performance test.
ConcurrentHashMap has the advantage of hiding the locking from me and more importantly has the advantage of being correct, and it can still use the same Map interface so if it’s eventually used downstream somewhere stuff like `compute` will work and it will be thread safe without and work with mutexes.
The argument I am making is that it is literally no extra work to use the ConcurrentHashMap, and in my benchmarks with JMH, it doesn’t perform significantly worse in a single-threaded context. It seems silly for anyone to try and save a nanosecond to use a regular HashMap in most cases.
As soon as you have a couple of those ConcurrentHashMaps in scope with relationships between their elements, your concurrency story is screwed unless you've really thought everything out. I'd encourage using just plain HashMap to signal that there's no thread safety so you know to rethink everything if/when it comes up.