> that ends up looking kind of like a crawling or scraping/search operation
Sure, but what I'm talking about is that the current SOTA models are terrible even for specialized small use cases like what you describe, so you can't just throw a local model at that task and get useful sessions out of it that you can use for fine-tuning. If you want distilled data or similar, you (obviously) need to use a better model, but currently there is none that provides the privacy guarantees I need, as described earlier.
All of those things come once you have something suitable for the individual pieces, but I'm trying to say that none of the current local models come close to solving the individual pieces, so all that other stuff is just distraction before you have that in place.
Understood. I guess I'm saying "soon", but definitely agreed it's not "now" yet. I will say though, with 96GB, in a couple months you're going to be able to hold tons of Gemma 4 LoRA "specialists" in memory at the same time, and I really think it will feel like a whole new world once these are all getting trained and shared and adapted en masse. And also, you could set up personal traces now if you want. Nobody can make you, but in its laziest form it can be literally just taking screenshots of your screen periodically as you work, and that'll have applications soon.
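To make the "laziest form" concrete, here's a minimal sketch of that kind of periodic screenshot collector. It assumes the third-party `mss` package (`pip install mss`); the interval, output directory, and filename scheme are all arbitrary choices, not anything standard.

```python
# Minimal "personal trace" collector: periodically capture the screen and
# save timestamped PNGs for later use as training/context data.
# Assumes the third-party `mss` package is installed.
import time
from datetime import datetime, timezone
from pathlib import Path


def trace_path(out_dir: Path, now: datetime) -> Path:
    """Timestamped filename so traces sort chronologically on disk."""
    return out_dir / now.strftime("trace_%Y%m%d_%H%M%S.png")


def collect_traces(out_dir: Path, interval_s: float = 60.0) -> None:
    """Capture the full virtual screen every `interval_s` seconds, forever."""
    import mss  # imported lazily so trace_path stays usable headless

    out_dir.mkdir(parents=True, exist_ok=True)
    with mss.mss() as screen:
        while True:
            shot = screen.grab(screen.monitors[0])  # bounding box of all monitors
            mss.tools.to_png(
                shot.rgb, shot.size,
                output=str(trace_path(out_dir, datetime.now(timezone.utc))),
            )
            time.sleep(interval_s)
```

Run `collect_traces(Path("~/traces").expanduser())` in the background; a real setup would add rotation/cleanup and probably skip capture while idle, but the point is how little is needed to start accumulating traces.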
> And also, you could set up personal traces now if you want. Nobody can make you, but in its laziest form it can be literally just
But again, you're missing my point :) I cannot, since the models I could generate useful traces from are run by platforms I'm not willing to hand over very private data to, and the local models I could use don't produce useful traces.
And I'm not holding out hope for agent orchestration; people haven't even figured out how to reliably get high-quality results from a single agent yet, even less so from a fleet of them. Better to realistically temper your expectations a bit :)