I'd been working with language models for several years before LLMs were a solution to this kind of problem. These are some ideas "off the top of my head" about how you can do classification in various ways. There are really a lot of ways to tackle it now, and a lot of trade-offs you can learn about by experimenting with them.

There are even more options, especially if you go further back toward more traditional methods: static word vectors like GloVe or fastText (or more modern equivalents like WordLlama or Model2Vec), and sklearn-style classifiers too. Those can be really small and fast, but you usually give up some accuracy in exchange.
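To give a feel for how small the traditional end of the spectrum can be, here's a rough sketch of a bag-of-words Naive Bayes classifier in plain Python. This is my own stand-in for the "sklearn-style" approach, not any particular library's implementation: the function names (`train_nb`, `predict_nb`) and the four-sentence dataset are made up for illustration, and in practice you'd more likely reach for something like scikit-learn's `MultinomialNB` on real data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Deliberately naive: lowercase and split on whitespace.
    return text.lower().split()


def train_nb(examples):
    """Count documents per label and words per label.

    `examples` is a list of (text, label) pairs.
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {"label_counts": label_counts, "word_counts": word_counts, "vocab": vocab}


def predict_nb(model, text):
    """Return the label with the highest log-probability under the model."""
    total_docs = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["label_counts"].items():
        # Log prior for this label.
        score = math.log(n_docs / total_docs)
        total_words = sum(model["word_counts"][label].values())
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # ignore words never seen in training
            count = model["word_counts"][label][tok]
            # Add-one (Laplace) smoothing so unseen words don't zero out a label.
            score += math.log((count + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, only here to make the sketch runnable.
TOY_DATA = [
    ("great movie loved it", "pos"),
    ("wonderful acting great story", "pos"),
    ("terrible movie hated it", "neg"),
    ("boring story awful acting", "neg"),
]

model = train_nb(TOY_DATA)
print(predict_nb(model, "great story"))
print(predict_nb(model, "awful boring movie"))
```

The whole "model" is just a few word-count tables, which is why this family is so small and fast. It's also why it tends to lose on accuracy: it has no notion of word order or of words it didn't see in training.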