Oh man you just gave me an idea to use something like qwen 3.5 to categorize a lot of emails. You can keep the context small, do it per email and just churn through a lot of crap.
The 0.8B can do this pretty well.
Actually, pg's original "A Plan for Spam" explains how to do this with a Bayesian classifier.
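For anyone curious what that looks like in practice, here is a minimal sketch in pure Python. To be clear, this is a plain multinomial naive Bayes with add-one smoothing, not the exact scoring scheme from the essay, and the `NaiveBayes` class, the `tokenize` helper, and the four training e-mails are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase, then keep runs of letters, digits, apostrophes and dollar signs.
    return re.findall(r"[a-z0-9'$]+", text.lower())


class NaiveBayes:
    """Hypothetical toy classifier: multinomial naive Bayes, add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token -> count
        self.doc_counts = Counter()              # label -> number of training docs
        self.vocab = set()                       # every token seen in training

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def scores(self, text):
        # Log-probability score per label: log prior + sum of log likelihoods.
        total_docs = sum(self.doc_counts.values())
        result = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # tokens never seen in training carry no evidence
                score += math.log((counts[tok] + 1) / denom)
            result[label] = score
        return result

    def classify(self, text):
        s = self.scores(text)
        return max(s, key=s.get)


# Made-up training data, only to show the mechanics.
clf = NaiveBayes()
clf.train("win a free prize now, click here", "spam")
clf.train("cheap pills, limited offer, click now", "spam")
clf.train("meeting moved to 3pm, agenda attached", "work")
clf.train("please review the pull request before the meeting", "work")

print(clf.classify("free offer, click now"))
print(clf.classify("agenda for the review meeting"))
```

With a few hundred real labeled messages per category, something this small is deterministic, runs in microseconds per e-mail, and costs no tokens at all.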
I was just chatting with a co-worker who wanted to run an LLM locally to classify a bunch of text. He was worried about spending too many tokens though.
I asked him why he didn't just have the LLM build him a classifier based on a Python ML library instead.
The LLMs are great but you can also build supporting tools so that:
- you use fewer tokens
- it's deterministic
- you as the human can also use the tools
- it's faster b/c the LLM isn't "shamboozling" every time you need to do the same task.
I use Haiku to classify my mail - it's way overkill, but it also doesn't require training, unlike a classifier. I receive many dozens of e-mails a day, and it's burned on average ~$3 worth of tokens per month. I'll probably switch that to a cheaper model soon, but it's cheap enough that the "payoff" from spending the time optimizing it is long.
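To put rough numbers on that payoff, here is a back-of-the-envelope sketch. Only the ~$3/month figure comes from the comment above; the 60 e-mails a day (a stand-in for "many dozens"), the 2 hours of optimization work, and the $50/hour rate are all assumptions picked for illustration, and the function names are hypothetical.

```python
def cost_per_email(monthly_cost_usd, emails_per_day, days_per_month=30):
    # Average token spend attributed to a single e-mail.
    return monthly_cost_usd / (emails_per_day * days_per_month)


def break_even_months(hours_spent, hourly_rate_usd, monthly_savings_usd):
    # Months until the time invested is repaid by the monthly savings.
    return (hours_spent * hourly_rate_usd) / monthly_savings_usd


# $3/month is from the comment; 60 e-mails/day is an assumed value.
per_email = cost_per_email(3.0, 60)

# Best case: the optimization removes the whole $3/month bill.
# 2 hours at $50/hour are assumed values.
months = break_even_months(2, 50, 3.0)

print(f"${per_email:.4f} per e-mail, break-even after {months:.1f} months")
```

Under those assumptions it works out to a fraction of a cent per e-mail and close to three years to break even, even if the replacement were free to run.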
I've been learning to apply these lately and it has been pretty eye opening. Combined with Fourier analysis (for example) you can do what seems kind of like magic, in my opinion. But it has been possible since long before LLMs showed up.
Totally different categories and different use cases, but the more I learn about LLMs the more I discover there's a powerful, deterministic, well-established statistical model or two to do the same thing.
Really, LLMs are kind of like convenient, wildly inefficient proxies for useful processes. But I'm not convinced they should often end up as permanent fixtures of logical pipelines. Unless you're making a chat bot, I guess.
> Really, LLMs are kind of like convenient, wildly inefficient proxies for useful processes. But I'm not convinced they should often end up as permanent fixtures of logical pipelines. Unless you're making a chat bot, I guess.
I think I agree with this. It's made me realise LLMs are great for prototyping processes in the same way that 3D printers are great at prototyping physical things. They make it quick and easy to get something close enough to see the unforeseen problems a proper solution might have.
3D printing is a great analog because there are so many critical considerations that are often missed or can't be accounted for in the prototype, but it's alright because it's a prototype. The strain testing, durability, manufacturing at scale; none of that is properly addressed. Those might involve some serious, expensive challenges, too. But it's alright because you've got something in your hand that informs you whether or not those challenges are worth contending with. I really love this about LLMs and 3D printing.
You can use the 4B for that; it's quite good.