You'd probably get much further by fine-tuning a small BERT-style encoder as a classifier for this. IMO, even something as simple as training a linear classifier on the CLS token embeddings from a frozen encoder might work.
Yeah, I've tried a bi-encoder, a cross-encoder, and some small LLMs so far. I think I'll try BERT soon too.
Age-old machine learning wisdom: start with the simplest model, then try more complex ones later.