Any resources you can share for these experimental builds? This is something I was looking into setting up at some point. I'd love to take a look at examples in the wild to gauge if it's worth my time / money.

As an aside, if we ever reach a point where it's possible to run an OSS 20B model at reasonable inference speed on a MacBook Pro type of form factor, then the future is definitely here!

In reference to this post I saw a few weeks ago:

https://lemmy.zip/post/50193734

(Lemmy is a Reddit-style forum)

The author mainly demos their "custom tools" and doesn't elaborate further, but IMO it's still an impressive showcase for an offline setup.

I think the big hint is "Open WebUI", which supports native function calling.
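
For context on what "native function calling" looks like in practice: Open WebUI tools are basically Python classes whose methods the model can call. A toy sketch, with names and paths made up by me rather than taken from the post:

```python
# Toy sketch of an Open WebUI-style tool (names/paths are hypothetical).
# Open WebUI picks up the methods on a class called Tools and exposes them to
# the model as callable functions, using the type hints and docstrings as the schema.
import os


class Tools:
    def __init__(self):
        # Hypothetical folder of plain-text notes to search offline.
        self.notes_dir = os.path.expanduser("~/notes")

    def search_notes(self, query: str) -> str:
        """Return lines from local notes that contain the query string."""
        hits = []
        for name in sorted(os.listdir(self.notes_dir)):
            path = os.path.join(self.notes_dir, name)
            if not os.path.isfile(path):
                continue
            with open(path, encoding="utf-8", errors="ignore") as fh:
                hits += [f"{name}: {line.strip()}" for line in fh if query.lower() in line.lower()]
        return "\n".join(hits[:20]) or "No matches."
```

When the model decides it needs the tool, the server runs the method and feeds the return value back into the chat, so "custom tools" in that demo are likely just scripts like this.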

Some more searching and I found this: https://pypi.org/project/llm-tools-kiwix/
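
Judging by the name, that's a plugin for Simon Willison's `llm` library, which added a tool-calling API in recent versions. I haven't run the Kiwix plugin myself, so here's only a rough sketch of the general mechanism, with a toy function standing in for whatever tools the plugin actually registers, and the stock OpenAI model ID as a placeholder (you'd swap in a local model, e.g. via an Ollama plugin, for the fully offline setup):

```python
# Rough sketch of tool calling with the llm Python library (as of ~0.26).
# The toy function below is a stand-in for whatever llm-tools-kiwix registers;
# the model ID is a placeholder and not offline.
import llm


def wiki_lookup(title: str) -> str:
    """Toy stand-in for an offline Kiwix/ZIM article lookup."""
    return f"(pretend this is the locally stored article text for '{title}')"


model = llm.get_model("gpt-4.1-mini")  # placeholder; swap for a local model
response = model.chain(
    "Summarise what the offline wiki says about unified memory.",
    tools=[wiki_lookup],
)
print(response.text())
```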

It's possible the future is now... assuming you have an M-series Mac with enough RAM. My sense is that you need roughly 1 GB of RAM for every 1B parameters (which works out to about 8-bit weights), so 32 GB should in theory handle a 20B model. Macs also get a performance boost over comparable hardware because unified memory lets the GPU address the whole RAM pool.
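
To put rough numbers on that rule of thumb (my own napkin math, nothing from the post):

```python
# Back-of-the-envelope version of the "1 GB per 1B parameters" rule. It roughly
# corresponds to 8-bit weights (1 byte per parameter); fp16 doubles it and 4-bit
# quantization halves it. Ignores KV-cache/context overhead, which is usually a
# few extra GB rather than tens.

def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GB."""
    return params_billion * 1e9 * (bits_per_param / 8) / 1e9

for bits in (16, 8, 4):
    print(f"20B model @ {bits}-bit: ~{weight_memory_gb(20, bits):.0f} GB of weights")
# -> ~40 GB, ~20 GB, ~10 GB, so a 32 GB machine fits an 8-bit 20B model with headroom.
```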

Spitballing aside, I'm in the same boat: saving my money and waiting for the right time. If it isn't viable already, it's damn close.