Remember when 16 GB was an enterprise-level database mainframe?
Well, GPUs also have stupid amounts of compute on them. I have to imagine that there is some kind of database format that's useful with GPU compute attached.
Since the data is already in VRAM, the GPU can sort, join, or otherwise manipulate data as needed.
GPU-accelerated databases have a long history. I founded HeavyAI (previously MapD/OmniSci) in 2013, but there are or have been many other startups in this space, such as Voltron Data, Kinetica, Sqream, etc. And now you have major players like IBM, Starburst, and Microsoft (which just announced Fabric SQL on GPU today) working on their own GPU-accelerated systems. GPUs have a huge advantage over CPUs in compute, memory bandwidth, and interconnect bandwidth, as long as you can keep them fed with data.
I believe within 2-3 years databases and data warehouses on GPU will be common. The widespread use of agents to query data will be a part of this, as there will be a need to run far more queries at lower latency than needed for the ETL and BI workloads of the past.
And smart NICs are moving significant amounts of compute directly onto the network interface, though I haven't seen anyone combining a GPU and a 100GbE NIC into a single part yet.
Where do a few more steps of evolution take us? A wide path between a few heavy devices, with the CPU off to the side just orchestrating the data flow?
Insightful take, looking into these
oh god please don't create more demand for GPUs
Can we somehow make them work with 1 TB PCIes so we can churn through way more data?
Have you heard of the "Radeon Pro SSG"?
It must have failed, because I never heard of an update to it. But AMD definitely made a GPU with four NVMe SSDs attached directly to it.
You can use GPUDirect Storage to move data directly between the GPU and PCIe storage devices. It's nice, but it typically isn't as performant as one would like compared to onboard memory.
https://docs.nvidia.com/gpudirect-storage/
https://github.com/microsoft/DirectStorage/tree/main
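To put rough numbers on that gap, here is a back-of-envelope sketch of how long one full scan of a dataset takes at different link speeds. The bandwidth figures are ballpark assumptions (roughly 32 GB/s for a PCIe 4.0 x16 link, 64 GB/s for PCIe 5.0 x16, and on the order of 2 TB/s for on-package HBM), not measurements, and the 1 TB dataset size is just an illustrative value.

```python
# Rough, assumed peak bandwidths in GB/s (theoretical ballpark, not measured).
ASSUMED_BANDWIDTH_GB_PER_S = {
    "PCIe 4.0 x16": 32.0,
    "PCIe 5.0 x16": 64.0,
    "on-package HBM": 2000.0,
}


def scan_seconds(dataset_gb: float, bandwidth_gb_per_s: float) -> float:
    """Time in seconds to stream the whole dataset once at the given bandwidth."""
    return dataset_gb / bandwidth_gb_per_s


if __name__ == "__main__":
    dataset_gb = 1000.0  # illustrative 1 TB dataset (assumption)
    for link, bandwidth in ASSUMED_BANDWIDTH_GB_PER_S.items():
        seconds = scan_seconds(dataset_gb, bandwidth)
        print(f"{link:>15}: {seconds:7.2f} s per full scan")
```

Under these assumptions, a full scan over the bus takes tens of times longer than one served from onboard memory, and real NVMe drives usually deliver less than the link's peak.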
Linux has P2P-DMA for this, though the drivers, devices, and bus topology all need to support it.
https://docs.kernel.org/driver-api/pci/p2pdma.html
I think GP means 1 TB/s of PCIe bandwidth, rather than 1 TB of PCIe NVMe drives.
Possibly LSM compaction.
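For anyone unfamiliar: compaction in an LSM tree is essentially a merge of sorted runs that keeps only the newest value per key, which is the sort-and-merge style of work that tends to parallelize well. Below is a minimal CPU-side sketch in Python, not a GPU implementation. It assumes each run is a key-sorted list of (key, value) pairs with unique keys, that runs are passed newest first, and that this is a full compaction so deleted keys can be dropped. The `compact` name and the `TOMBSTONE` convention are made up for illustration.

```python
import heapq

TOMBSTONE = None  # assumed marker meaning "this key was deleted"


def compact(runs):
    """Merge key-sorted runs (newest run first) into one sorted run.

    For duplicate keys the entry from the newest run wins. Keys whose
    winning value is TOMBSTONE are dropped, which is only safe in a
    full compaction where no older run remains.
    """
    # Tag each entry with its run's age so equal keys order newest first.
    tagged = (
        [(key, age, value) for key, value in run]
        for age, run in enumerate(runs)
    )
    merged = heapq.merge(*tagged, key=lambda entry: (entry[0], entry[1]))

    out = []
    last_key = object()  # sentinel that equals no real key
    for key, _age, value in merged:
        if key == last_key:
            continue  # older version of a key that was already handled
        last_key = key
        if value is not TOMBSTONE:
            out.append((key, value))
    return out


if __name__ == "__main__":
    newest = [("a", 3), ("c", TOMBSTONE)]
    oldest = [("a", 1), ("b", 2), ("c", 9)]
    print(compact([newest, oldest]))
```

A GPU version would replace the sequential heap merge with a parallel merge or sort over the same (key, age) ordering, but the deduplication rule stays the same.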