If Chrome has the #optimization-guide-on-device-model and #prompt-api-for-gemini-nano flags enabled, either because it's part of some Origin Trial / Early Stable Release or something, then web pages have access to the new Prompt API, which allows any webpage to initiate the (one-time) download of the ~2.7 GiB CPU or ~4.0 GiB GPU model by calling LanguageModel.create().

https://developer.chrome.com/docs/ai/prompt-api
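Per the linked docs, the flow looks roughly like this. A sketch only, assuming the documented `LanguageModel` shape; exact availability values and behaviour vary by channel and build:

```javascript
// Feature-detect first: LanguageModel only exists where the flags/API ship.
async function getOnDeviceSession() {
  if (typeof LanguageModel === 'undefined') return null; // API not exposed
  // One of 'unavailable' | 'downloadable' | 'downloading' | 'available'
  const availability = await LanguageModel.availability();
  if (availability === 'unavailable') return null;
  // create() is what kicks off the one-time multi-GiB download if needed.
  return LanguageModel.create({
    monitor(m) {
      m.addEventListener('downloadprogress', (e) => {
        console.log(`model download: ${Math.round(e.loaded * 100)}%`);
      });
    },
  });
}
```

In a browser without the API, the function simply returns null, which is also a reasonable feature-detection pattern for real pages.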

When Chrome 148 releases tomorrow, this will be the default behaviour on desktop.

Before downloading, it should check for 22 GiB of free disk space on the volume holding your Chrome data dir, and at least twice the model size free in your tmp dir.

First the tabs came for the RAM and I did not protest, for I had plenty. Then they came for the chip and I did not protest, for it was dark silicon anyway. Then they came for the HDD.

And then they made the RAM and SSD so expensive :)

I am curious whether it reuses the LLM across all tabs; it's hard to imagine most machines could spin up more than one or two copies of any 4 GB model unless it's a fairly powerful system.

I think it obviously will, what would be the benefit to spinning up more than one copy?

It should only need to load one copy of the weights, but each tab/site will need a separate context and KV cache.
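Under that assumption, a rough sketch of what that looks like at the API level, using the session shape from the Chrome Prompt API docs (`create()` and `clone()` are documented there; the weight-sharing is a browser-internal detail, not something the API exposes):

```javascript
// Sketch: the browser loads the multi-GiB weights once, but each
// create()/clone() call yields a session with an independent context.
async function sessionsForSites(siteCount) {
  if (typeof LanguageModel === 'undefined') return []; // API not exposed
  const base = await LanguageModel.create();
  const sessions = [base];
  while (sessions.length < siteCount) {
    // clone() copies the initial state but keeps histories separate,
    // so each site fills its own context window / KV cache.
    sessions.push(await base.clone());
  }
  return sessions;
}
```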

Okay, but the browser is basically the computer for most people.

Told ya.

The more severe problem is that Google installs model weight files on a per-user basis, meaning Chrome occupies 4 more GB of space for every OS user on your device.

The company I work at has several environments and hundreds of VDI users in each environment. Chrome is the default browser in all of them. By my rough napkin math, this one small change by Google will eat up at least 15 terabytes of new disk space in total. (I sure hope we are using deduplication at the physical storage layer...)
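The napkin math checks out under plausible assumptions; the figures below are illustrative stand-ins, not the poster's actual numbers:

```javascript
// Illustrative assumption: ~10 environments x ~400 VDI users each,
// with the ~4 GiB GPU model installed per user profile.
const environments = 10;
const usersPerEnv = 400;
const modelGiB = 4;
const totalGiB = environments * usersPerEnv * modelGiB; // 16000 GiB
console.log(`${(totalGiB / 1024).toFixed(1)} TiB`); // ~15.6 TiB
```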

It's fine. Network and disk space are free, right?

Compared to human labor it is.

Only because those who can save on the labor are not paying for the increased resource use in the first place.

Shouldn't the filesystem be set to encrypt everything before it hits the physical storage layer?

Thankfully deduplication is a thing ;)

I certainly hope you don't automatically update.

Does your place review every line of every update patch note? Do you think you would catch this implication?

For every profile.

Does each Playwright (or similar automation system) count as a different user, and does it keep the model around?

If yes, it's an interesting API to call when an AI crawler hits your website.

4 GB, $0.10 (whatever the HD price): that is the equivalent of a high-school-level intelligent brain that can perform many cognitive tasks (and in the future even PhD-level intelligence), for free?

Oh, the horror!!!

Wait, let me pay my HVAC guy $500 he deserved because he came all the way from his home to replace a fuse

It doesn't make sense to apply wholesale prices for mass storage. People are running Chrome on specific devices that they already own. Storage is not fungible in this way.

If you’re pissed you had to pay your HVAC guy to drive to your house and do something you think is trivial, why didn’t you do it yourself?

As the saying goes, gp didn't pay $500 to have the fuse replaced, he paid $500 for the training and experience that was required to know that the fuse had to be replaced.

> 4 GB, $0.10 (whatever the HD price): that is the equivalent of a high-school-level intelligent brain that can perform many cognitive tasks for free?

How is this better than my current solution of an actual human with master's-degree-level intelligence performing all my cognitive tasks for free? I mean, I'm the first to admit I'm extremely lazy, and even I'm over here like "really??"

> Wait, let me pay my HVAC guy $500 he deserved because he came all the way from his home to replace a fuse

Right, because it's totally something an LLM can do, right?

Here is your google brain on your device, whether you want it or not.

I don’t think you understand what “free” means

Tell that to Apple, I'm sure they will allow me to pay $0.025/GB for additional storage on my Macbook /s

It's annoyingly impossible to add more disk space to laptops. I think mine is soldered.

Apple laptops maybe. In many others it's just a normal M.2 NVMe module behind a screwed on bottom case plate.

Also note the Mozilla standards position on this API: https://github.com/mozilla/standards-positions/issues/1213#i...

Or this summary on its status:

> Mozilla: Opposed

> WebKit: Opposed

> Microsoft: Several concerns

> W3C TAG: Several concerns

> Developers: Mostly negative

From https://mastodon.social/@jaffathecake/116527007495775507

You can already trigger a 2 GB model download with the Summarizer API[0], which is already shipped in Chrome.

    Summarizer.create()
[0]: https://developer.chrome.com/docs/ai/summarizer-api#model-do...

I think this is a distinct model from the Prompt API, since the other shipped AI APIs use fine tuned models.

Both of them say they use Gemini Nano.

[deleted]

So now we're up to 6 GB

Per user

The problem is that some of us are still on connections that charge per GB in rural areas. Here in Montana it's very common to pay about $0.25 per GB regardless of how much you use, so this is a $1 additional cost per desktop device. Places like public school districts have hundreds of computers and this will be somewhat significant for them.

I was thinking a similar thing. Many of our customers have special-purpose computers that rarely see fixed-line internet, but need a modern browser (many chose Chrome on their own; we never recommended it).

They're going to get blasted with cellular data charges when they fire up their computer in the field.

Google's updater service also currently ignores the Windows 11 metered-connection hint. It will gladly download that model over your cell connection even if you have a data cap.

This is infuriating behavior.

Silicon Valley must wake up and understand the entire world does not live like them.

It is a small model, so what utility can I / Google expect from it? What is the on-board model used for?

It's not a very good small model to be honest.

That said, you might be surprised to learn that some of the models from 3b-9b could probably replace 80% of the things nonvibe coders use chatgpt for.

It's a good idea to run small models locally if your computer can host them, for privacy and cost-saving reasons. But how can you trust Google to auto-install one on your machine in 2026? I just couldn't do it.

Sure, local models are good, and yes, there's no way we can trust Google.

We can be positive the entire motivation of Chrome is user behavior surveillance. There's not a nano-chance in all the multiverses that the Chrome model is doing anything privately. They've gone to extraordinary lengths to accomplish this. It's not for free.

It is entirely about user surveillance as well as pushing their product on to their users because they have the install base. Google Chrome has become Microsoft IE6 in hostile user behavior.

If Google were focused on surveillance, why haven't they been collecting keystroke data (like grammarly) for years?

You either die a hero or live long enough to see yourself become a villain.

What did we expect when they dropped "don't be evil" from their company values?

A claim about as useful then as it is now. They never wanted to be anything but, once Sergey left. The Schmidt era had them publicly declare one thing while doing something else entirely behind the curtain.

They were corporate evil from day 1. The rest was just PR slogans, and playing the good guy as long as you don't need to squeeze profits.

Isn’t it really “pushing a feature to their products”?

Not when you are appropriating 2GB or more of space for that feature.

I don't trust them either, but the same Google makes Gemma 4 available to run as locally and privately as you want, and those models are pretty amazing for their size.

Both can be true: they give a nice local model so you find it useful AND the chrome harness captures every token in and out for exfiltration.

LLMs are costing Google a ton of money in compute and storage right now. If they can farm any of that off to the users, it makes economical sense.

But yes, there is a 100% chance that logs will get sent back to Google too.

> farm

Ooh, this is interesting. There's nothing stopping them from sending jobs down to local machines. That's some 3 billion nodes. We went through this with coin mining and spam botting.

Nothing stopping it except your ire if it's discovered.

> But how can you trust Google to autoinstall one on your machine

Why are AI models something I'd be uniquely unable to trust Google to install, compared all the other code included in Chrome updates? Is your point just that you shouldn't trust Chrome in general?

Yes, I would not trust Google or Chrome. They have a history of class-action lawsuits for doing shady things to users. Enabling them to condense data on your machine and transmit it however they want, should they choose to, is suspect to me.

Google is probably still sucking up the contents of your LLM requests even with the model running locally.

Yeah, so unclear why yet again everyone is so quickly running for the pitchforks & torches. The model doesn't do anything; it's just a sandbox.

I'm really tired of such overinflated, shrill ridiculousness against Google. Yes, there are very real tensions with this company, and their ads business is scary as heck.

But folks don't seem capable of processing duality; they don't seem to be able to do much but ad-hominem until they pass out. It's really so exhausting having such empty energy charging in every single time, and it keeps obstructing any ability to think straight or assess.

I was waiting for Google to pull a local LLM onto Chrome/Android devices. It opens up some revenue streams that weren't easily possible before: for example the often memed "I was talking about cigars with my wife one single time and now all I see are adsense ads for cigars" gets much easier with a local model doing speech to text and topic classification.

> Yeah, so unclear why yet again everyone is so quickly running for the pitchforks & torches.

Cause everyone loves a good bonfire and a fresh hot roast.

> The model doesn't do anything, it's just a sandbox.

Doesn't that make it worse? They forced everyone to download 4GB of crap for nothing. They could have done one of two things:

(1) bundle the model with the application so you can tell ahead of time you're signing up for 4GB of bandwidth usage or

(2) make downloading the model some kind of opt-in thing.

Either of those would have worked. Just because you can easily tolerate 4GB of unplanned bandwidth usage doesn't mean everyone who can't is wrong.

The point is that what you're "sick of" isn't actually authentic human thought, but in reality you're responding to a recent european-driven propaganda campaign with the goal of deriding anything and everything related to US tech.

All that matters is some MBA product manager at Google was celebrated for shipping this. Hooray!

Everyone who implemented or approved this should be prosecuted under the Computer Fraud and Abuse Act (18 U.S.C. § 1030). If I was on a jury, I wouldn't hesitate to send them to prison where they belong.

What is the principle you’re using here?

A fair and impartial jury is a fundamental part of freedom. I genuinely cannot believe that we have been reduced to wanting to destroy the jury system to punish companies we don’t agree with. At this point, this is less activism and more weaponized disrespect for fundamental freedoms.

[dead]

> That said, you might be surprised to learn that some of the models from 3b-9b could probably replace 80% of the things nonvibe coders use chatgpt for.

Really? I'm a total amateur when it comes to doing anything with local models, but I tried a few in this range using Ollama, and they didn't seem to know much about anything, and I couldn't figure out how to get them to search the web or run other tools, so that was where the experiment ended.

A small local model that can use bash would be a bit of a game-changer for me.

The latest small models are now reliable enough at simple tools like web search, I think. It's just that, AFAIK, none of the user-friendly harnesses like Ollama or LM Studio have a real one-click setup flow for this. You'll need to download models and do a fair bit of tool configuration.

Gemini CLI can use bash and run on the Gemma local model.

Local models are improving quickly so if you keep an eye open you’ll find something soon enough. But from experience, I’ll warn you that local models can lose the plot very quickly. Their little self arguments when they get stuck usually come down to:

- It failed? This must be a mistake, I’ll try it again. It failed? This must be a mistake, I’ll try it again because then I will complete the task (repeat about every six seconds until you rescue it).

- You know, the best way to deal with a permissions problem is to erase the entire system. That’ll definitely solve those pesky permissions and I’ll complete the task.

Which is why I uninstalled Chrome a (short...) while ago and my life went on unbothered.

I am amused when people fret about not using Chrome. I get it but… I have literally NEVER used Chrome. Perhaps I just don’t know what I am missing but the web seems to work just fine for me without it?

Touché…

Half of the reason to use local AI is to circumvent the censorship that Google, OpenAI and so on have. I don't want this Google crap on my computer.

It's based on Gemma 3n, and it's not the best.

I find it works fine for simple classification, translation, interpretation of images & audio. It can write longer prose, but it's pretty bad.

It can also produce output constrained to a JSON schema or a regexp, for anything you might want to do with structured data.
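Structured output goes through the Prompt API's responseConstraint option, per the Chrome docs. A hedged sketch; the schema fields here are made up for illustration:

```javascript
// Sketch: constrain the model's output to a JSON schema via
// responseConstraint (option name per the Chrome Prompt API docs).
async function extractEvent(text) {
  if (typeof LanguageModel === 'undefined') return null; // API not exposed
  const session = await LanguageModel.create();
  const schema = {
    type: 'object',
    properties: {
      title: { type: 'string' },
      date: { type: 'string' }, // e.g. '2026-03-14'
    },
    required: ['title', 'date'],
  };
  const result = await session.prompt(
    `Extract the event title and date from: ${text}`,
    { responseConstraint: schema },
  );
  return JSON.parse(result);
}
```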

I wonder why they’re using Gemma 3 and not Gemma 4?

Google has been trialling the Prompt API in Chrome for over a year, so since before Gemma 4 existed. But they are indicating they'll move to Gemma 4: https://groups.google.com/a/chromium.org/g/blink-dev/c/iR6R7...

So that the big news in non-tech news sites will be the update. Thus ensuring that this is received in a positive light.

It'll probably update to that without telling you at some point.

I find models of this size (I haven't tested this one specifically) to be very good at simple data extraction from user input. Think of things like parsing the date and time of an event from a description, or parsing a human-typed description of a repeating-event rule.

This is considered a large model. I think you might be surprised how many "small" models Chrome has already pulled down onto your disk.

But to answer your question, one of the services that uses a small model is PermissionsAIv4:

""" Use the Permission Predictions Service and the AIv4 model to surface permission notification requests using a quieter UI when the likelihood of the user granting the permission is predicted to be low. Requires `Make Searches and Browsing Better` to be enabled. – Mac, Windows, Linux, ChromeOS, Android """

[dead]

I ran a fairly large production test of this, and on _every_ measure except privacy it was worse than a free-tier, server-hosted LLM.

Not happy about that as I would like to see more local models but that's the current state of things.

https://sendcheckit.com/blog/ai-powered-subject-line-alterna...

> on _every_ measure except for privacy it was worse than a free tier server hosted LLM

Would you be able to compare this to other local models in its class and above that would fit consumer-grade hardware?

> It is a small model, so what utility can I / Google expect from it?

Precedent for shipping models alongside consumer software.

Potentially without consent if it truly is a silent install.

Something to do with serving more ads. My guess is they will use this to “better target” or to drain more information from you for their ads.

Those two (and more) exist in chrome://flags in Chrome 147. I'm disabling them now, with the expectation that will prevent the new default.

One option I'm leaving as default is "Use LiteRT-LM runtime for on-device model service inference." Any comment on that?

I'm on Chrome 147 too and disabled:

"optimization-guide-on-device-model"

- Enables optimization guide on device

"prompt-api-for-gemini-nano"

- Prompt API for Gemini Nano

- Prompt API for Gemini Nano with Multimodal Input

and deleted weights.bin and the 2025.x folder in "OptGuideOnDeviceModel"

Will report if Chrome 148 downloads the model again.

If you touch those files into existence, chown them to root, and chmod them to 0, it shouldn't ever be able to overwrite them, right?

You want to use chattr +i (make the empty file immutable)

I'm on my phone now so I can't check if something has changed, but what you want to protect from change is the directory, not the files. A file can be deleted and created again if the process can write the directory.

Yeah, should work. Will try read-only on Windows too.

Now I can't see it anymore, but shouldn't the model be under chrome://on-device-internals/ -> model-status?

Maybe you can uninstall there too.

Maybe I was on the wrong side of the early release, but I've deleted this model many times in the last year. I've had it for at least 12 months.

Thanks; I went to flags in Vivaldi and, just in case, disabled all flags containing "gemini" and the first five results for "model".

Those flags will exist already, but will default to enabled in 148.

That other flag is for using a different, open-source inference engine instead of the (from what I can tell) closed-source one that's used by default.

[dead]

So my understanding is that the download happens only when sites call the Prompt API, right?

Because my Chrome stable has been updated to v148 now, and I don't see any AI models in my user profile folder. My profile size is only 328 MB, with the Code Cache subfolder occupying the most space (135 MB).

In my understanding, yes. I wrote a blog post about some of the internals here: https://news.ycombinator.com/item?id=48028662

Searching about:flags for model comes up with a whole bunch:

#omnibox-ml-url-scoring-model

#omnibox-on-device-tail-suggestions

#optimization-guide-on-device-model

#text-safety-classifier

#prompt-api-for-gemini-nano

#writer-api-for-gemini-nano

#rewriter-api-for-gemini-nano

#proofreader-api-for-gemini-nano

#summarizer-api-for-gemini-nano

#on-device-model-litert-lm-backend

Then around gemini but not caught by the search for models: #skills (maybe? I think this is implied by "gemini in chrome"?)

edit: I don't see a carte blanche AI-disabling option. As much as I dislike Mozilla's growing obsession with AI, at least they give me a top-level option to disable all AI stuff. I only keep Chrome around for occasional testing reasons.

I wrote a more detailed blog post here:

https://news.ycombinator.com/item?id=48028662

Next step: Invoke the Prompt API from within online ads and run a "p2p" AI inference provider which forwards incoming LLM queries to website visitors. :-)

This sounds perfectly reasonable. No objection from me.

Do you know if Chromium also has these flags enabled?

Depends on where you get it. By default the flags will be enabled, but some packagers may choose to disable them. I haven't seen a major distro release chromium 148 yet.

Weirdly though, chromium won't be able to actually use the model even though it can download it, because the inference engine is a closed-source blob.

https://adsm.dev/posts/prompt-api/#which-browsers-support-th...

I believe webpages that use the API must request access to the Prompt API from the user via a system permissions dialogue, according to the docs from a few months ago.

It can only be called after the user has interacted with the page, but there's no dialogue from the browser.

https://developer.chrome.com/docs/ai/get-started#user-activa...
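Per that user-activation requirement, the call has to be gated behind a gesture. A minimal sketch; the handler name and button are hypothetical:

```javascript
// Chrome requires a user gesture before create() may trigger the model
// download, so wire it to an explicit click rather than running on load.
function wireUpPromptButton(button) {
  button.addEventListener('click', async () => {
    if (typeof LanguageModel === 'undefined') return; // API not exposed
    const session = await LanguageModel.create();
    console.log(await session.prompt('Say hello.'));
  });
}
```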

[dead]