They're big, expensive chips with a focus on power efficiency. AMD's and Intel's big, expensive chips tend to be optimized for higher power ranges, so they don't compete well on efficiency, while their more power-efficient chips tend to be optimized for size/cost.
If you're willing to spend a bunch of die area (which directly translates into cost) you can get good numbers on the other two legs of the Power-Performance-Area triangle. The issue is that the market position of Apple's competitors is such that it doesn't make as much sense for them to make such big and expensive chips (particularly CPU cores) in a mobile-friendly power envelope.
Per core, Apple’s performance cores are no bigger than AMD’s Zen cores. So it’s a myth that they’re fast and efficient only because they are big.
What makes Apple Silicon chips big is the fast GPU bolted onto them. If you include the die of a discrete GPU alongside an x86 chip, the total would be the same size as or bigger than an M-series chip.
You can look at Intel’s Lunar Lake as an example: it’s physically bigger than an M4 but slower in CPU, GPU, and NPU performance, with far worse efficiency.
Another comparison is AMD Strix Halo. Despite being ~1.5x bigger than the M4 Pro, it has worse efficiency, ST performance, and GPU performance. It does have slightly more MT.
Is it not true that the instruction decoder is always active on x86, and is quite complex?
The equivalent decoder for AArch64 is vastly simpler.
That is one obvious architectural drawback for power efficiency: a legacy instruction set with variable-length instructions, two FPUs (x87 and SSE), 16-bit compatibility with segmented memory, and hundreds of otherwise-unused opcodes.
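To see the decode asymmetry concretely, here's a toy Python sketch of instruction-boundary finding, not real decoder logic; `length_of` is a hypothetical helper standing in for x86's length-determination problem:

```python
# Toy sketch of finding instruction boundaries; illustrative only.

def aarch64_boundaries(code: bytes) -> list[int]:
    # Every A64 instruction is exactly 4 bytes, so all start offsets
    # are known up front and a wide decoder can work on them in parallel.
    return list(range(0, len(code), 4))

def x86_boundaries(code: bytes, length_of) -> list[int]:
    # x86 instructions are 1-15 bytes; the start of instruction N+1 is
    # only known after working out the length of instruction N (prefixes,
    # opcode, ModRM, SIB, displacement, immediate). A naive decoder is
    # therefore serial. Real cores speculate on lengths and cache
    # predecode results / micro-ops to hide this cost.
    offsets, pos = [], 0
    while pos < len(code):
        offsets.append(pos)
        pos += length_of(code, pos)  # hypothetical length-decoding helper
    return offsets
```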
How much legacy must Apple implement? Non-kernel AArch32 and Thumb2?
Edit: think about it... R4000 was the first 64-bit MIPS in 1991. AMD64 was introduced in 2000.
AArch64 emerged in 2011, and in taking their time, the designers avoided the mistakes made by others.
There's no AArch32 or Thumb support (A32/T32) on M-series chips. AArch64 (technically A64) is the only supported instruction set. Fun fact: this makes it impossible to run Mario Kart 8 via virtualization on Macs without software translation, since it's A32.
How much that does for efficiency I can't say, but I imagine it helps, especially given just how damn easy it is to decode.
It actually doesn't make much difference: https://chipsandcheese.com/i/138977378/decoder-differences-a...
I had not realized that Apple did not implement any of the 32-bit ARM environment, but that cuts the legs out from under this argument in the article:
"In Anandtech’s interview, Jim Keller noted that both x86 and ARM both added features over time as software demands evolved. Both got cleaned up a bit when they went 64-bit, but remain old instruction sets that have seen years of iteration."
I still say that x86 must run two FPUs all the time, and that has to cost some power (AMD must run three - it also has 3dNow).
Intel really couldn't resist adding instructions with each new chip (MMX, PAE for 32-bit, and many more on the long list that I don't recognize), which are now mostly baggage.
> I still say that x86 must run two FPUs all the time, and that has to cost some power (AMD must run three - it also has 3dNow).
Legacy floating-point and SIMD instructions exposed by the ISA (and extensions to it) don't have any bearing on how the hardware works internally; on modern x86 cores, x87 and SSE instructions are decoded into micro-ops that execute on the same shared floating-point units, so there aren't separate FPUs burning power.
Additionally, AMD processors haven't supported 3DNow! in over a decade -- K10 was the last processor family to support it.
Oh wow, I need to dig way deeper into this but wonderful resource - thanks!
> Despite being ~1.5x bigger than the M4 Pro
Where are you getting M4 die sizes from?
It would hardly be surprising, given the Max+ 395 has more cores, and on average better ones, fabbed on 5nm versus the M4's 3nm. Die size is mostly GPU, though.
Looking at some benchmarks:
> slightly more MT.
AMD's multicore Passmark score is more than 40% higher.
https://www.cpubenchmark.net/compare/6345vs6403/Apple-M4-Pro...
> worse efficiency
The AMD is on an older fab process and does not have P/E cores. What are you measuring?
> worse ST performance
The P/E design choice gives different trade-offs; e.g., AMD has much higher average single-core perf.
> worse GPU performance
The AMD GPU:
- 14.8 TFLOPS vs. the M4 Pro's 9.2 TFLOPS
- 19% higher 3DMark
- 34% higher Geekbench 6 OpenCL
Although a much crappier Blender score. I wonder what that's about.
https://nanoreview.net/en/gpu-compare/radeon-8060s-vs-apple-...
The GPUs themselves are roughly equal. However, Strix Halo is still a bigger SoC.
> TFLOPs are not the same between architectures.
Shouldn't they be the same if we are speaking about the same precision? For example, [0] shows the M4 Max at 17 TFLOPS FP32 vs. the MAX+ 395 at 29.7 TFLOPS FP32 - not sure what exact operation was measured, but at least it should be the same operation. Hard to make definitive statements without access to both machines.
[0] https://www.cpu-monkey.com/en/compare_cpu-apple_m4_max_16_cp...
Apple doesn't even disclose TFLOPS for the M4 Max, so no clue where that website got its numbers from.
TFLOPS also aren't quoted the same way between generations. For example, Nvidia often quotes sparsity TFLOPS, which doubles the dense TFLOPS previously reported. I think AMD probably does the same for consumer GPUs.
Another example is the Radeon RX Vega 64, which had 12.7 TFLOPS FP32. Yet the Radeon RX 5700 XT, with just 9.8 TFLOPS FP32, absolutely destroyed it in gaming.
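For what it's worth, the headline figures are usually just theoretical peaks: ALU count x 2 (counting an FMA as two ops) x boost clock. A quick sketch using the publicly quoted shader counts and boost clocks for those two cards, with the sparsity factor at the end showing how quoted numbers can silently double:

```python
# Theoretical peak FP32 throughput as vendors usually quote it.

def peak_fp32_tflops(alus: int, boost_ghz: float) -> float:
    # One fused multiply-add per ALU per clock counts as 2 FLOPs.
    return alus * 2 * boost_ghz / 1000

print(peak_fp32_tflops(4096, 1.546))  # Vega 64:    ~12.7 TFLOPS
print(peak_fp32_tflops(2560, 1.905))  # RX 5700 XT: ~9.8 TFLOPS

# Marketing numbers can also silently change definition: a "sparsity"
# figure (as Nvidia quotes for tensor ops) is simply the dense peak x2.
dense_tflops = 9.8
sparsity_tflops = dense_tflops * 2
```

Which is exactly why a lower-TFLOPS card on a newer architecture can still win in games: the peak says nothing about how often the ALUs are actually fed.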
What a waste of time.
"directionally correct"... so you don't know and made up some numbers? Great.
AMD doesn't "endorse benchmarks", especially not fucking Geekbench for multi-core. No one could, because it's famously nonsense at higher core counts. AMD's decade-old beef with Sysmark was about pro-Intel bias.
Welcome to the world of chip discussions. I've never taken apart an M4 Pro computer and measured the die myself, and it appears no one on the internet has either. However, we can infer a lot of it based on previously known facts. In this case, we know the M1 Pro's die size is around 250mm2.
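For illustration, the inference looks something like this back-of-the-envelope; every scaling factor below is an assumption I'm labeling as such, not a measurement:

```python
# Back-of-the-envelope die-size inference. The M1 Pro figure comes from
# public teardowns; the scaling factors are illustrative assumptions.

m1_pro_die_mm2 = 250
logic_density_gain = 1.4   # assumed N5 -> N3 logic density improvement
logic_fraction = 0.6       # assumed share of the die that is logic
sram_io_fraction = 0.4     # SRAM and IO barely shrink on newer nodes

# Hypothetical estimate if the M4 Pro's layout were otherwise similar:
estimate_mm2 = m1_pro_die_mm2 * (logic_fraction / logic_density_gain
                                 + sram_io_fraction)
print(f"~{estimate_mm2:.0f} mm2")  # ~207 mm2 under these assumptions
```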
Geekbench is the main benchmark AMD tends to use: https://videocardz.com/newz/amd-ryzen-5-7600x-has-already-be... The reason is that Geekbench correlates highly with SPEC, which is the industry standard.
Their "main benchmark"? Stop making things up. It's no more than tragic fanboy addled fraud at this point.
That three-year-old press release refers to SINGLE-CORE Geekbench, not the defective multicore version that doesn't scale with core counts. Given AMD's main USP is core counts, it would be an... unusual choice.
AMD marketing uses every other product under the sun too (no doubt whatever gives the better-looking numbers)... including Passmark; e.g., it's on this Strix Halo page:
https://www.amd.com/en/products/processors/ai-pc-portfolio-l...
So I guess that means Passmark is "endorsed" by AMD too eh? Neat.
The industry has moved past Passmark because it does not correlate with actual real-world performance.
The standard is SPEC, which correlates well with Geekbench.
https://medium.com/silicon-reimagined/performance-delivered-...
Every time there is a discussion on Apple Silicon, some uninformed person always brings up Passmark, which is completely outdated.
Enough. You don't know what you are talking about.
What's with posting 5-year-old Medium articles about a different version of Geekbench? Geekbench 5 had different multicore scaling, so if you want to argue that version was so great, then you're also arguing against Geekbench 6, because the two don't even match.
https://www.servethehome.com/a-reminder-that-geekbench-6-is-...
"AMD Ryzen Threadripper 3995WX, a huge 64 core/ 128 thread part, was performing at only 3-4x the rate of an Intel D-1718T quad-core part, even despite the fact it had 16x the core count and lots of other features."
"With the transition from Geekbench 5 to Geekbench 6, the focus of the Primate Labs team shifted to smaller CPUs"
GB6 measures MT the way most consumer applications actually use MT, whereas GB5 was embarrassingly parallel. GB6 reflects real-world usage better.
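The scaling gap falls straight out of Amdahl's law. A toy calculation, where the parallel fractions are illustrative assumptions (not anything Primate Labs publishes):

```python
# Amdahl's law: speedup(n) = 1 / ((1 - p) + p / n),
# where p is the fraction of the workload that parallelizes.

def speedup(p: float, n_cores: int) -> float:
    return 1 / ((1 - p) + p / n_cores)

print(speedup(1.00, 64))  # 64.0 -- embarrassingly parallel, GB5-style
print(speedup(0.75, 64))  # ~3.8 -- a 25% serial share caps the gain
print(speedup(0.75, 4))   # ~2.3 -- a quad core already gets most of it
```

With a 25% serial share the ceiling is 1/(1 - 0.75) = 4x no matter how many cores you add, which loosely matches the 3-4x gap ServeTheHome saw between the 64-core and quad-core parts.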
Your source is an article based on someone finding a Geekbench result for a just-released CPU, and you're somehow trying to say it's from AMD itself and an endorsed benchmark, huh.
Those are AMD's marketing slides.