Maybe, but due to the physics of signal integrity, socketed RAM will always be slower than RAM integrated onto the same PCB as whatever processing element is using it, so by the time CAMM / LPCAMM catches up, some newer integrated RAM solution will be faster yet.
This is a matter of physics. It can't be "fixed." Signal integrity is why classic GPU cards have GiBs of integrated RAM chips: GPUs with non-upgradeable RAM that people have been happily buying for years now.
Today, the RAM requirements of GPUs and their applications have become so large that the extra low-cost, slow, socketed RAM is now a false economy. Naturally, therefore, it's being eliminated as PCs evolve into big GPUs with one flavor or another of traditional ISA processing elements attached.
It’s possible that Apple really did soldered RAM a disservice by making it a key profit-increasing option, exploiting buyers’ inability to buy RAM elsewhere or upgrade later, and in the process making soldered RAM seem like a scam even though it does have fundamental advantages, as you point out.
Going from 64 GB to 128 GB of soldered RAM on the Framework Desktop costs €470, which doesn’t seem that much more expensive than fast socketed RAM. Going from 64 GB to 128 GB on a Mac Studio costs €1000.
Ask yourself this: what is the correct markup for delivering this nearly four years before everyone else? Because that's what Apple did, and why customers have been eagerly paying the cost.
Let us all know when you've computed that answer. I'll be interested, because I have no idea how to go about it.
I had 128 GB of RAM in my desktop from nearly a decade ago. I'm not sure what exactly Apple invented here.
Yeah, it's not really about jamming more DIMMs into more sockets.
Of course it isn't... the point stands... Apple didn't actually invent anything in that regard.
Is the problem truly down to physics or is it down to the stovepiped and conservative attitudes of PC part manufacturers and their trade groups like JEDEC? (Not that consumers don't play a role here too).
The only essential part of sockets vs solder is the metal-metal contacts. The size of the modules and the distance from the CPU/GPU are all adjustable parameters if the will exists to change them.
> Is the problem truly down to physics
Yes. The "conservative attitudes" of JEDEC et al. are a consequence of physics, and of the capabilities of every party involved in dealing with it: the RAM chip fabricators, the PCB manufacturers, and ultimately you, the consumer, and the price you're willing to pay for motherboards, power supplies, memory controllers, and the yield costs incurred building all of this stuff so that you can sort by price, mail-order some likely untested combination of affordable components, and stick them together with a fair chance that it will all "work" within the power consumption envelope, thermal envelope, and failure rate you're likely to tolerate. Every iteration of the standards is another attempt to strike the right balance up and down that chain, and at the root of it all is the physics of signal integrity, power consumption, thermals, and component reliability.
As I said, consumers play a part here too. But I don't see the causal line from the physics to the stagnation, stovepiping, artificial market segmentation, and cartelization we see in the computer component industries.
Soldering RAM has always been around and it has its benefits. I'm not convinced of its necessity, however. We're only now getting a new memory socket form factor, but the need was already emerging a decade ago.
> The only essential part of sockets vs solder is the metal-metal contacts.
Yeah... And that’s a pretty damn big difference. A connector is always going to result in worse signal integrity than a high-quality solder joint in the real world.
Is that really the long pole in the tent, though?
No doubt the most tightly integrated package can outperform a looser collection of components. But if we could shorten the distances, tighten the tolerances, and have the IC companies work on improving the whole landscape instead of just narrow, disjointed pieces slowly one at a time, then would the unsoldered connections still cause a massive performance loss or just a minor one?
Yes. Signal integrity is so finicky at the frequencies DRAM operates at that it starts to matter whether the plated holes that complete the circuit are drilled all the way through the board or stopped halfway: the signal leaks into the unused stub of the hole and reflects back into the trace, causing interference. Adding a connector between RAM and CPU is like extending that long pole in the middle of the tent by inserting a stack of elephants into something already shaped like the crankshaft from a wrecked car.
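To put a rough number on the via-stub point, here's a minimal sketch (the stub length and dielectric constant are assumed example values, not measurements from any real board) of the quarter-wave null frequency a leftover stub creates:

```python
# Rough sketch: an unused via stub acts roughly like a quarter-wave resonator,
# notching out signal energy near its resonant frequency. Stub length and
# dielectric constant below are assumed example values, not measurements.

C = 3e8  # speed of light in vacuum, m/s

def stub_null_frequency_hz(stub_length_m: float, dk: float) -> float:
    """Approximate quarter-wave resonant (null) frequency of a via stub."""
    return C / (4 * stub_length_m * dk ** 0.5)

# Example: a 1.5 mm leftover stub in FR-4-like material (Dk ~ 4)
f_null = stub_null_frequency_hz(1.5e-3, 4.0)
print(f"Stub null around {f_null / 1e9:.0f} GHz")  # ~25 GHz

# DDR5-6400 clocks at 3.2 GHz, but its signal edges carry energy at much
# higher harmonics, and longer stubs (thicker boards) push the null lower,
# which is why back-drilling the stubs away exists in the first place.
```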
Besides, no one strictly needs mid-life upgradable RAM. You just want to be able to upgrade RAM after purchase because it's cheaper upfront, and because it leaves less room for supply-side price gouging. Those aren't technical reasons: nothing stops you from optioning 2 TB of RAM at purchase and being done for 10 years.
In the past, at least, RAM upgrades weren't just about filling the slots you couldn't afford to fill on day one. RAM modules also got denser and faster over time. This meant that, after waiting a couple of years, you could add more and better RAM to your system than was even physically possible to install upfront.
Part of the reason I have doubts about the physical necessity here is that PCI Express (x16) is roughly keeping up with GDDR in terms of bandwidth. Of course the two aren't completely apples-to-apples, but it at least proves that a high-bandwidth unsoldered interface is possible. I will admit, though, that what I can find indicates signal integrity is the biggest issue each new generation of PCIe has to overcome.
It's possible that the best solution for discrete PC components will be to move what we today call RAM onto the CPU package (which is also very likely to become a CPU+GPU package) and then keep PCIe x16 around to provide another tier of fast but upgradeable storage.
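As a back-of-envelope check on that "roughly keeping up" claim (a sketch only; the GDDR6 figures of 16 Gb/s per pin, 32-bit devices, and a 384-bit card bus are illustrative assumptions rather than a specific product's spec):

```python
# Back-of-envelope, one-direction peak rates. PCIe Gen 3+ uses 128b/130b line
# coding; the GDDR6 numbers (16 Gb/s per pin, 32-bit-wide devices, a 384-bit
# card bus) are illustrative assumptions, not a specific product's spec.

def pcie_gb_per_s(gt_per_s: float, lanes: int) -> float:
    """Peak payload bandwidth (GB/s, one direction) of a PCIe link."""
    return gt_per_s * lanes * (128 / 130) / 8

def gddr_gb_per_s(gb_per_s_per_pin: float, bus_width_bits: int) -> float:
    """Peak bandwidth (GB/s) of a GDDR interface."""
    return gb_per_s_per_pin * bus_width_bits / 8

print(f"PCIe 5.0 x16:       {pcie_gb_per_s(32, 16):.0f} GB/s")   # ~63
print(f"one GDDR6 device:   {gddr_gb_per_s(16, 32):.0f} GB/s")   # 64
print(f"384-bit GDDR6 card: {gddr_gb_per_s(16, 384):.0f} GB/s")  # 768
```

So the slot keeps up with a single GDDR device or a narrow bus, but a full-width GPU memory bus is still an order of magnitude ahead of it.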
I am personally dealing with PCIe signal integrity issues at work right now, so I can say yes, it’s incredibly finicky once you start going outside of the simple “slot below CPU” normal situation. And I only care about Gen 3 speeds right now.
But in general yes, PCIe vs RAM bandwidth is like comparing apples to watermelons. One’s bigger than the other and they’re both fruits, but they’re not the same thing.
People generally don’t talk about random-access PCIe latency because it rarely matters. You’re looking at a best-case 3x latency penalty for PCIe vs RAM, and usually more like an order of magnitude or more. PCIe is really designed for maximum throughput, not minimum latency. If you make the same tradeoffs with RAM you can start tipping the scale the other way - but people really care about random-access latency in RAM (almost like it’s in the name), so that generally doesn’t happen outside of specific scenarios. 500ns 16000MT/s RAM won’t sell (and would be a massive pain - you’d probably need to 1.5x the bus width to achieve that, which means more pins on the CPU, which means larger packages, which means more motherboard real estate taken and more trace length/signal integrity concerns, and you’d need to somehow convince everyone to use your new, larger DIMM...).
You can also add more memory channels to effectively double/quadruple/sextuple memory bandwidth, but again, package constraints + signal integrity increase costs substantially. My Threadripper Pro system does ~340GB/s and ~65ns latency (real world) with 8 memory channels - but the die is huge, the CPUs are expensive as hell, and the motherboards are also expensive as hell. And for the first ~9 months after release the motherboards all struggled heavily with various RAM configurations.
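A rough illustration of why channel count and latency are different axes (the DDR5 speeds below are assumptions for illustration, not the exact workstation configuration above):

```python
# Streaming bandwidth scales with channel count and transfer rate, while a
# chain of dependent random accesses is bounded by latency alone. The DDR5
# speeds here are assumptions for illustration, not the exact system above.

def peak_stream_gb_per_s(channels: int, mt_per_s: int, channel_bits: int = 64) -> float:
    """Peak theoretical bandwidth: channels x channel width x transfer rate."""
    return channels * (channel_bits / 8) * mt_per_s / 1000

def dependent_random_gb_per_s(latency_ns: float, line_bytes: int = 64) -> float:
    """Throughput when each cache-line fetch waits on the previous one."""
    return line_bytes / latency_ns  # bytes per nanosecond == GB/s

print(f"2ch DDR5-6400 streaming:  {peak_stream_gb_per_s(2, 6400):.0f} GB/s")   # ~102
print(f"8ch DDR5-5200 streaming:  {peak_stream_gb_per_s(8, 5200):.0f} GB/s")   # ~333
print(f"serial 64 B reads @ 65ns: {dependent_random_gb_per_s(65):.1f} GB/s")   # ~1.0
```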
> The only essential part of sockets vs solder is the metal-metal contacts
And at GHz speeds that matters more than you may think.
Perhaps it's time to introduce L4 Cache and a new Slot CPU design where RAM/L4 is incorporated into the CPU package? The original Slot CPUs that Intel and AMD released in the late 90s were to address similar issues with L2 cache.
How much higher bandwidth, percentage wise, can one expect from integrated DRAM vs socketed DRAM? 10%?
Intel's Arrow Lake platform launched in fall 2024 is the first to support CUDIMMs (clock redriver on each memory module) and as a result is the first desktop CPU to officially support 6400MT/s without overclocking (albeit only reaching that speed for single-rank modules with only one module per channel). Apple's M1 Pro and M1 Max processors launched in fall 2021 used 6400MT/s LPDDR5.
Intel's Lunar Lake low-power laptop processors launched in fall 2024 use on-package LPDDR5x running at 8533MT/s, as do Apple's M4 Pro and M4 Max.
So at the moment, soldered DRAM offers 33% more bandwidth for the same bus width, and is the only way to get more than a 128-bit bus width in anything smaller than a desktop workstation.
Smartphones are already moving beyond 9600MT/s for their RAM, in part because they typically use only a 64-bit bus width. GPUs are at 30000MT/s with GDDR7 memory.
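For the arithmetic behind the 33% figure and the wider-bus point (peak theoretical numbers; the bus widths beyond 128-bit are examples of what on-package memory enables, roughly matching what Apple quotes for its M4 Pro/Max parts):

```python
# Peak theoretical bandwidth = bus width x transfer rate. Treat the wider
# configurations as illustrative examples rather than exact product specs.

def peak_gb_per_s(bus_width_bits: int, mt_per_s: int) -> float:
    return (bus_width_bits / 8) * mt_per_s / 1000

print(f"128-bit DDR5-6400 (CUDIMM desktop): {peak_gb_per_s(128, 6400):.0f} GB/s")  # ~102
print(f"128-bit LPDDR5x-8533 (on-package):  {peak_gb_per_s(128, 8533):.0f} GB/s")  # ~137 (+33%)
print(f"256-bit LPDDR5x-8533:               {peak_gb_per_s(256, 8533):.0f} GB/s")  # ~273
print(f"512-bit LPDDR5x-8533:               {peak_gb_per_s(512, 8533):.0f} GB/s")  # ~546
```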