I've been dreaming of an LLM in a box over USB, bought off AliExpress, for a year and change now.

The LLM in a box is something you can buy today, but it 1. doesn’t serve over USB by default, 2. costs $100k for hardware (not counting electricity) at 100 tps, and 3. can’t be bought from AliExpress.

Better to put that $100k in T-bills and just buy tokens, even at API prices.
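
A rough back-of-the-envelope for that trade-off, as a sketch only: the $100k and 100 tps figures are the ones quoted above, while the T-bill yield and the API price per million tokens are placeholder assumptions I picked for illustration, so plug in your own numbers.

```python
# All prices and rates marked "assumed" are illustrative placeholders, not quotes.
HARDWARE_COST_USD = 100_000      # from the comment above
BOX_TOKENS_PER_SEC = 100         # from the comment above
TBILL_YIELD = 0.04               # assumed annual yield
API_USD_PER_MILLION_TOKENS = 10  # assumed blended API price

SECONDS_PER_YEAR = 365 * 24 * 3600


def tokens_from_interest(principal, annual_yield, usd_per_million):
    """Tokens per year that the interest alone pays for at API prices."""
    interest = principal * annual_yield
    return interest / usd_per_million * 1_000_000


def box_tokens_per_year(tokens_per_sec, utilization=1.0):
    """Tokens per year the box generates at a given utilization (0..1)."""
    return tokens_per_sec * SECONDS_PER_YEAR * utilization


api_tokens = tokens_from_interest(
    HARDWARE_COST_USD, TBILL_YIELD, API_USD_PER_MILLION_TOKENS
)
box_tokens = box_tokens_per_year(BOX_TOKENS_PER_SEC)

# Fraction of the year the box must run flat out to match what the interest buys.
break_even_utilization = api_tokens / box_tokens

print(f"Interest buys   {api_tokens:,.0f} tokens/year")
print(f"Box at 24/7     {box_tokens:,.0f} tokens/year")
print(f"Break-even load {break_even_utilization:.1%}")
```

This ignores electricity and the fact that the T-bill principal is still there at the end of the year while the hardware depreciates, both of which tilt things further toward buying tokens unless the box is kept busy.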

I understand your point (and definitely want the same), but I do have an almost-AliExpress-LLM-in-a-box: it's a Thunderbolt eGPU dock (that I got from AliE, and it is USB-C...) with an RTX 4060 Ti with 16 GB of VRAM (bought locally for gaming before the price boom).

It's been awesome for embeddings and document OCR!

3D printing a case for it is on my todo list.