The relevant constraint when running on a phone is power, not RAM footprint. Running the tiny E2B/E4B models makes sense; this is essentially what they're designed for.

Depends on the phone. I have trouble fitting models into memory on my iPhone 13 before iOS kills the app. I imagine newer phones with more RAM don't have this issue, especially with some new flagship phones having 16+ GB of memory.
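For a rough sense of why RAM bites: weights alone for a 4B-parameter model at 4-bit quantization come to about 2 GB before you count KV cache, activations, and runtime buffers. A quick back-of-the-envelope sketch (the 1.2× overhead factor is my guess, not a measured number):

```python
def model_footprint_gb(params_billions: float, bits_per_weight: int,
                       overhead_factor: float = 1.2) -> float:
    """Estimate resident memory in GiB for a quantized model.

    overhead_factor is an assumed fudge for KV cache, activations,
    and runtime buffers; real overhead varies with context length.
    """
    bytes_per_weight = bits_per_weight / 8
    weights_gib = params_billions * 1e9 * bytes_per_weight / (1024 ** 3)
    return weights_gib * overhead_factor

# A 4B-parameter model at 4-bit quantization:
print(round(model_footprint_gb(4, 4), 2))  # → 2.24
```

On a 4 GB phone, where iOS also caps how much a single app may allocate before it gets jetsammed, ~2.2 GB resident is already uncomfortable; bump to 8-bit weights or a bigger model and it simply won't fit.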

It absolutely is RAM…

So much so that this is what pushed Apple to increase their base RAM sizes.

Between the GPU, the NPU, and big.LITTLE CPU cores, many phones have no fewer than four different power profiles at which they can run inference. It's about as solved as it will get without an architectural overhaul.