Hosted models only. (My philosophy is that these things need to be fast if I'm talking to them, and they also need to be as close to never wrong as I can get, which means big cloud-hosted models even if they're expensive. To me, if it's wrong even once, or if I'm sitting there waiting for it to reply, the value prop is already gone.)
I think the per-token costs I calculated on Opus 4.5/4.6 came to something like $0.30/day for my text automations and $0.60/day for the few things I do that load up the browser. In general, anything browser-based eats a lot more tokens (expected). Where you can get a bit of sticker shock is having it load a lot of large web pages in one long conversation; that can run several dollars. In the grand scheme of things several dollars is not a lot, but from a "should I just go to the website myself" standpoint it certainly tips the scale. I'm usually more interested in doing things once to "teach" it what to do (e.g. how to check a price) and then having it run that as a dialed-in cron job.
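If you want to sanity-check your own numbers, here is a rough sketch of the arithmetic in Python. Everything in it is an assumption for illustration: the `Job` fields, the token counts, and especially the per-million-token prices are placeholders, not real Opus pricing, so plug in whatever your provider actually charges.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """One recurring automation (hypothetical shape, for illustration only)."""
    name: str
    input_tokens: int   # tokens sent per run (prompt, page content, history)
    output_tokens: int  # tokens generated per run
    runs_per_day: int


def daily_cost(job: Job, usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Estimated USD per day for a job, given prices per million tokens."""
    per_run = (
        job.input_tokens * usd_per_m_input
        + job.output_tokens * usd_per_m_output
    ) / 1_000_000
    return per_run * job.runs_per_day


if __name__ == "__main__":
    # Placeholder prices: replace with your provider's real rates.
    PRICE_IN, PRICE_OUT = 1.0, 2.0

    jobs = [
        # Made-up token counts; browser jobs carry far more input tokens
        # because whole pages get loaded into the conversation.
        Job("text automation", input_tokens=10_000, output_tokens=1_000, runs_per_day=4),
        Job("browser price check", input_tokens=200_000, output_tokens=1_000, runs_per_day=4),
    ]
    for job in jobs:
        print(f"{job.name}: ${daily_cost(job, PRICE_IN, PRICE_OUT):.2f}/day")
```

The useful part is seeing how quickly the input side dominates once pages get loaded into context, which is why I'd rather dial a job in once and then run it short and repeatable from cron.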
Hope this helps