I had discounted Edge Gallery because it didn't support system prompts, but now it does, so I will give it another go. I believe the implementation does use MTP, since I got an update to Gemma-4-E4B on iOS indicating as much, and on macOS it's very speedy.

However, on my 18GB RAM MacBook Pro, selecting Gemma-4-12B-it results in this error:

> The model "Gemma-4-12B-it' requires more memory (RAM) than is available on your device.

So yeah, my questions about the 16GB marketing copy are fair.

Interesting; they may have fluffed up somewhere then.

(Though perhaps it'll squeeze in with a small context window? Not sure I understand that aspect yet.)
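For what it's worth, here is a rough back-of-envelope sketch of why context length matters for memory. It assumes a plain transformer where every layer caches keys and values for every token. The layer count, KV head count, head dimension, and 4-bit weight quantization below are made-up placeholders, not the real Gemma-4-12B configuration, and the function names are hypothetical.

```python
GIB = 2**30  # bytes per GiB


def weights_gib(params_billion, bits_per_weight):
    """Approximate size of the model weights alone, in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / GIB


def kv_cache_gib(layers, kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Approximate KV cache size in GiB for a full-attention transformer.

    The leading 2 counts one key tensor plus one value tensor per layer.
    bytes_per_value=2 assumes 16-bit cache entries.
    """
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_value / GIB


if __name__ == "__main__":
    # ASSUMED shape, for illustration only (not the real model config).
    layers, kv_heads, head_dim = 48, 8, 128
    w = weights_gib(params_billion=12, bits_per_weight=4)
    for ctx in (4096, 32768):
        kv = kv_cache_gib(layers, kv_heads, head_dim, ctx)
        print(f"context={ctx}: weights ~{w:.2f} GiB, KV cache ~{kv:.2f} GiB")
```

The point is only that the cache grows linearly with context length while the weights stay fixed, so a shorter context could plausibly make a tight fit workable. Real numbers will differ: architectures with sliding-window attention layers cache less, and the app and OS need their own share of RAM.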

It does seem to use MTP, yes, and it is quite quick. The underlying LiteRT stuff can seemingly do MTP with Gemma 4, and presumably MTP is a big part of the practicality picture here.

The system prompt thing was a surprise when I poked around.