Hacker News

Could we all get bigger FPGAs and load the model onto them using the same technique?

You could [1], but it is not very cheap -- the 32GB development board with the FPGA used in the article used to cost about $16K.

[1] https://arxiv.org/abs/2401.03868

sowbug 6 hours ago

FPGAs aren't very power-efficient. You could do it, but the numbers wouldn't add up for anything but prototyping.

fercircularbuf 16 hours ago

I thought about this exact question yesterday. If it isn't feasible, I'm curious to know why not. It would allow one to upgrade to the next model without fabricating all-new hardware.

wmf 16 hours ago

FPGAs have really low density so that would be ridiculously inefficient, probably requiring ~100 FPGAs to load the model. You'd be better off with Groq.

menaerus 16 hours ago

Not sure what you're on, but I think what you said is incorrect. You can use a high-density, HBM-enabled FPGA with (LP)DDR5 and a sufficient number of logic elements to implement inference. The reason we don't see it in action is most likely that such FPGAs are insanely expensive and not as readily available off the shelf as GPUs are.

wmf 6 hours ago

Yeah, FPGA+HBM works but it has no advantage over GPU+HBM. If you want to store weights in FPGA LUTs/SRAM for insane speed you're going to need a lot of FPGAs because each one has very little capacity.
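
For anyone who wants to sanity-check the capacity argument, here is a rough back-of-the-envelope sketch. It only counts how many chips you need so that their combined on-chip memory holds all the weights. The function name `fpgas_needed` is made up for illustration, and the inputs (8B parameters, 4-bit weights, 40 MB of usable on-chip memory per FPGA) are assumed round numbers, not figures from the article or from any specific part. Plug in real datasheet values to get a meaningful answer.

```python
def fpgas_needed(n_params: int, bits_per_weight: int, on_chip_bytes_per_fpga: int) -> int:
    """Minimum number of FPGAs whose combined on-chip memory holds all weights.

    Ignores activations, KV cache, routing overhead, and logic used for compute,
    so the real count would be higher.
    """
    model_bits = n_params * bits_per_weight
    model_bytes = (model_bits + 7) // 8  # round up to whole bytes
    # Integer ceiling division: chips needed to cover model_bytes.
    return (model_bytes + on_chip_bytes_per_fpga - 1) // on_chip_bytes_per_fpga


if __name__ == "__main__":
    # Illustrative assumptions only: 8e9 parameters, 4 bits each,
    # 40 MB (40 * 10**6 bytes) of usable on-chip memory per FPGA.
    print(fpgas_needed(8_000_000_000, 4, 40 * 10**6))
```

Under those assumed inputs the weights come to 4 GB, so the estimate scales linearly: halving the per-chip memory or doubling the bit width doubles the chip count.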