If every computer built in the last decade gets 1% faster, and all we have to pay for that is a bit of one-off engineering effort and a doubling of the storage requirement of the Ubuntu mirrors, that seems like a huge win.
If you aren't convinced by your Ubuntu being 1% faster, consider how many servers, VMs and containers run Ubuntu. Millions of servers using a fraction of a percent less energy multiplies out to a lot of energy.
I don't have a clear opinion, but you have to factor in all the issues that can be due to different versions of software. Think of unexposed bugs in the whole stack (that can include compiler bugs, but also software bugs related to numerical computation or just uninitialized memory). There are enough heisenbugs without worrying that half the servers run slightly different software.
It's not for nothing that some time ago "write once, run everywhere" was a selling proposition (not that it actually worked in all cases, but it definitely worked better than the alternatives).
That comes out to roughly 1.5 hours saved per week, if you're running full tilt. Seems like an easy win.
If I recompile a program to utilize my CPU better (use AVX or whatever), and my program then takes 1 second to execute instead of 2, it likely did not use half the _energy_.
Obviously not. But scale it out to a fleet of 1000 servers running your program continuously, and you can now shut down 10 for the exact same workload.
Sure, but we're talking about compiled packages being distributed by a package manager.
Yes, but my point is: if I download the AVX version instead of the SSE version of a package and that makes my 1000 servers 10% _quicker_, that is not the same as being 10% more _efficient_.
Because typically these modern extensions make the CPU do things faster by eating more power. There may be savings from having fewer servers etc., but savings in _speed_ are not the same as savings in _power_ (and sometimes they even work the opposite way).
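To make that concrete with made-up wattages (energy = power x time):

    SSE build: 100 W x 2.0 s = 200 J per run
    AVX build: 160 W x 1.0 s = 160 J per run

That's a 2x speedup but only a 20% energy saving, and if the wider vector units pushed the package to, say, 210 W, the faster build would actually burn more energy per run. The wattages here are invented, purely to illustrate the point.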
how much energy would we save if every website request weren't loaded down with 20MB of ads and analytics :(
You need 100 servers. Now you only need to buy 99. Multiply that by a million, and the economies of scale really matter.
1% is less than the difference between negotiating with a hangover or not.
What a strange comparison.
If you're negotiating deals worth billions of dollars, or even just millions, I'd strongly suggest not doing so with a hangover.
> If you're negotiating deals worth billions of dollars, or even just millions, I'd strongly suggest not doing so with a hangover.
...have you met salespeople? Buying lap dances is a legitimate business expense for them. You'd be surprised how much personal rapport matters and facts don't.
In all fairness, I only know about 8- and 9-figure deals; maybe at 10 and 11 figures salespeople grow ethics...
I strongly suspect ethics are inversely proportional to the size of the deal.
That's more an indictment of sales culture than a critique of computational efficiency.
Well sure, because you want the person trying to buy something from you for a million dollars to have a hangover.
Sounds like someone never read Sun Tzu.
(Not really, I just know somewhere out there is a LinkedInLunatic who has a Business Philosophy based on being hungover.)
Appear drunk when you are sober, and sober when you are drunk
- Sun Zoo
A lot of improvements are very incremental. In aggregate, they often compound and are very significant.
If you would only accept 10x improvements, I would argue progress would be very small.
It's rarely going to be worth it for an individual user, but it's very useful if you can get it to a lot of users at once. See https://www.folklore.org/Saving_Lives.html
"Well, let's say you can shave 10 seconds off of the boot time. Multiply that by five million users and thats 50 million seconds, every single day. Over a year, that's probably dozens of lifetimes. So if you make it boot ten seconds faster, you've saved a dozen lives. That's really worth it, don't you think?"
I put a lot of effort into chasing wins of that magnitude. Over a huge userbase, something like that has a big positive ROI. These days it also affects important things like heat and battery life.
The other part of this is that the wins add up. Maybe I manage to find 1% every couple of years. Some of my coworkers do too. Now you're starting to make a major difference.
They did say some packages were more. I bet some are 5%, maybe 10 or 15. Maybe more.
Well, one example could be llama.cpp. It's critical for it to use every single extension the CPU has, to move more bits at a time. When I installed it I had to compile it.
This might make it more practical to start offering OS packages for things like llama.cpp.
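As a rough sketch of the kind of code this is about (a toy kernel, not llama.cpp itself), a plain loop like this gets auto-vectorized very differently depending on the baseline the package was built for, assuming a GCC or Clang new enough to understand the x86-64-v3 level:

    /* saxpy.c - toy kernel; build it twice and compare the generated code:
     *   gcc -O3 -march=x86-64    -c saxpy.c   (baseline: SSE2 only)
     *   gcc -O3 -march=x86-64-v3 -c saxpy.c   (compiler may use AVX/AVX2 + FMA)
     */
    #include <stddef.h>

    void saxpy(float a, const float *x, float *restrict y, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }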
I guess people that don't have newer hardware aren't trying to install those packages. But maybe the idea is that packages should not break on certain hardware.
Blender might be another one like that which really needs the extensions for many things. But maybe you do want to allow it to be used on somewhat old hardware anyway, because it still has uses that are valid on those machines.
It's very non-uniform: 99% see no change, but 1% see 1.5-2x better performance.
I'm wondering if 'somewhat numerical in nature' relates to lapack/blas and similar libraries that are actually dependencies of a wide range of desktop applications?
blas and lapack generally do manual multi-versioning by detecting CPU features at runtime. This is more useful one level up the stack, in things like compression/decompression, ODE solvers, image manipulation and so on, which still work with big arrays of data but don't have a small number of kernels (or as much dev time), so they typically rely on compilers for auto-vectorization.
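For anyone curious, a minimal sketch of that multi-versioning pattern: GCC (and recent Clang) can even generate the runtime dispatch for you with target_clones, which is roughly what the hand-rolled BLAS-style dispatchers achieve with CPU-feature checks and function pointers.

    /* The compiler emits a default, an AVX2, and an AVX-512 clone of this
     * function and picks one at load time based on the CPU it runs on. */
    #include <stddef.h>

    __attribute__((target_clones("default", "avx2", "avx512f")))
    void scale(double *v, double s, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            v[i] *= s;
    }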
I read it as, across the board a 1% performance improvement. Not that only 1% of packages get a significant improvement.
In a complicated system, a 1% overall benefit might well be because of a 10% improvement in just 10% of the system (or more in a smaller contributor).
The announcement is pretty clear on this:
> Previous benchmarks (...) show that most packages show a slight (around 1%) performance improvement and some packages, mostly those that are somewhat numerical in nature, improve more than that.
> Are there any use cases where that 1% is worth any hassle whatsoever?
I don't think this is a valid argument to make. If you were doing the optimization work then you could argue tradeoffs. You are not, Canonical is.
Your decision is which image you want to use, and Canonical is giving you a choice. Do you care about which architecture variant you use? If you do, you can now pick the one that works best for you. Do you want to win an easy 1% performance gain? Now you have that choice.
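If you're not sure whether your machines can even use the v3 variant, a quick check is something like this (it needs a fairly recent GCC; I believe GCC 12 added the x86-64-v* names to __builtin_cpu_supports):

    /* Report which x86-64 microarchitecture levels this CPU satisfies. */
    #include <stdio.h>

    int main(void)
    {
        printf("x86-64-v2: %s\n", __builtin_cpu_supports("x86-64-v2") ? "yes" : "no");
        printf("x86-64-v3: %s\n", __builtin_cpu_supports("x86-64-v3") ? "yes" : "no");
        printf("x86-64-v4: %s\n", __builtin_cpu_supports("x86-64-v4") ? "yes" : "no");
        return 0;
    }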
You'll need context to answer your question, but yes, there are cases.
Let's say you have a process that takes 100hrs to run and costs $1k/hr. You save an hour and $1k every time you run the process. You're going to save quite a bit. You don't just save the time to run the process, you save literal time and everything that that costs (customers, engineering time, support time, etc).
Let's say you have a process that takes 100ns and similarly costs $1k/hr. You now run in 99ns. Running the process 36 million times is going to be insignificant. In this setting even a 50% optimization probably isn't worthwhile (unless you're a high frequency trader or something)
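Putting rough numbers on that second case (assuming the $1k/hr is all attributable to this process):

    1 ns saved x 36,000,000 runs = 0.036 s
    $1000/hr ≈ $0.28/s, so that's about one cent
    even a 50% cut (50 ns x 36M = 1.8 s) is only about fifty cents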
This is where the saying "premature optimization is the root of all evil" comes from! The "premature" part is often disregarded and the rest of the context goes with it. Here's more context to Knuth's quote[0].
> There is no doubt that the holy grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
> Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.
Knuth said: "Get a fucking profiler and make sure that you're optimizing the right thing". He did NOT say "don't optimize".
So yes, there are plenty of times where that optimization will be worthwhile. The percentages don't mean anything without the context. Your job as a programmer is to determine that context. And not just in the scope of your program, but in the scope of the environment you expect a user to be running on. (i.e. their computer probably isn't entirely dedicated to your program)
[0] https://dl.acm.org/doi/10.1145/356635.356640 (alt) https://sci-hub.se/10.1145/356635.356640
Anything at scale. 1% across FAANG is huge.
If those kinds of optimizations are on the table, why would they not already be compiling and optimizing from source?
I'm not a hyperscaler; I run a thousand machines. If I get the optimization just by changing the base image I use to build those machines - in an already automated process - then it's basically free. Well, unless it triggers some new bug that wasn't there before.
Arguably the same across consumers too. It's just harder to measure than in centralized datacenters.
nah, performance benefits are mostly wasted on consumers, because consumer hardware is very infrequently CPU-constrained. in a datacentre, a 1% improvement could actually mean you provision 99 CPUs instead of 100. but on your home computer, a 1% CPU improvement means that your network request completes 0.0001% faster, or your file access happens 0.000001% faster, and then your CPU goes back to being idle.
an unobservable benefit is not a benefit.
Isn't Facebook still using PHP?
They forked PHP into Hack. They've diverged pretty far by this point (especially with data structures), but it maintains some of PHPs quirks and request-oriented runtime. It's jitted by HHVM. Both Hack and HHVM are open-source, but I'm not aware of any major users outside Meta.
Compiled PHP. I'm pretty sure they ran the numbers.
> some packages, mostly those that are somewhat numerical in nature, improve more than that
Perhaps if you're doing CPU-bound math you might see an improvement?
Any hyperscaler will take that 1% in a heartbeat.
Very few people are in the situation where this would matter.
Standard advice: You are not Google.
I'm surprised and disappointed that 1% is the best they could come up with; with numbers that small I would expect experimental noise to be much larger than the improvement. If you tell me you've managed a 1% improvement, you have to do a lot to convince me you haven't actually made things 5% worse.
No, but a lot of people are buying a lot of compute from Google, Amazon and Microsoft.
At scale marginal differences do matter and compound.