I find myself fixing the spec of my software often, and that makes lots of existing code obsolete. Creating working code is getting cheaper with AI, but creating great specs which make the software easy and intuitive to use seems to be more difficult for AI.
Why? Because you have to actually use the product to discover what is wrong, or sub-optimal, with it.
Yeah exactly, I literally don’t know how to change my spec until I’ve gathered more data.
I was building a transaction classifier recently and initially thought it would be a trivial “solved” problem: throw transactions into a tiny local LLM and let it classify. But that approach was too slow and not accurate enough. I didn’t know that until I tried it, though, and then I needed to change the spec.
You'd probably get much further by fine-tuning a small BERT-style encoder as the classifier. IMO, even something as simple as training a linear classifier on the CLS token embeddings from a frozen encoder might work.
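For anyone wondering what the frozen-encoder version looks like, here is a minimal sketch. Big assumption: `embed` below is a hypothetical stand-in (hashed character trigrams), not BERT. In practice you would swap it for the CLS vector from your frozen encoder. The example transactions, `CATEGORIES`, `DIM`, and the training settings are all made up for illustration. Only the linear layer gets trained.

```python
import hashlib
import math

DIM = 64  # size of the stand-in embedding; a real encoder would fix this


def embed(text):
    """Hypothetical stand-in for a frozen encoder's CLS embedding.

    Hashes character trigrams into a fixed-size, L2-normalised vector.
    """
    vec = [0.0] * DIM
    padded = f"  {text.lower()}  "
    for i in range(len(padded) - 2):
        digest = hashlib.md5(padded[i:i + 3].encode("utf-8")).hexdigest()
        vec[int(digest, 16) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def scores_for(weights, bias, x):
    """One linear score per class: w . x + b."""
    return [sum(wj * xj for wj, xj in zip(w, x)) + b for w, b in zip(weights, bias)]


def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_linear(features, labels, n_classes, epochs=200, lr=0.5):
    """Softmax regression trained with plain SGD; the embeddings never change."""
    weights = [[0.0] * DIM for _ in range(n_classes)]
    bias = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(features, labels):
            probs = softmax(scores_for(weights, bias, x))
            for c in range(n_classes):
                grad = probs[c] - (1.0 if c == y else 0.0)
                bias[c] -= lr * grad
                for j in range(DIM):
                    weights[c][j] -= lr * grad * x[j]
    return weights, bias


def predict(weights, bias, x):
    scores = scores_for(weights, bias, x)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical toy data; labels index into CATEGORIES.
CATEGORIES = ["transport", "groceries", "subscriptions"]
texts = [
    "UBER TRIP 1234", "LYFT RIDE",
    "WHOLE FOODS MARKET", "TRADER JOES #552",
    "NETFLIX.COM", "SPOTIFY USA",
]
labels = [0, 0, 1, 1, 2, 2]

features = [embed(t) for t in texts]  # computed once, since the encoder is frozen
weights, bias = train_linear(features, labels, len(CATEGORIES))
predictions = [predict(weights, bias, x) for x in features]
```

The nice part of this setup is that embeddings are computed once and cached, so trying different classifiers on top is cheap. Six training rows obviously say nothing about real accuracy; you would want a held-out set for that.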
Yeah, I’ve tried a bi-encoder, a cross-encoder and some small LLMs so far. I think I’ll do BERT soon too.
Age-old machine learning wisdom: start with the simplest model, then try complex ones later.
So wait ... you're not even going to train it on what you want, just "throw into"? Did you actually put in the work on a very clear and accurate prompt, with a full manual on what to do?
Throwing a tiny little LLM at it helped me assess that it was far too slow for me to reasonably use at the scale I needed. So it didn’t really matter how accurate the prompt was. I was mostly just pointing out that I didn’t know if it would be too slow without trying it. In retrospect I could maybe have done some simple math, but trying it out was easy enough.
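The "simple math" would have been roughly the sketch below. All the numbers are hypothetical placeholders, not my actual volume or latency, and `estimated_hours` is just a name I made up. It also assumes perfectly parallel workers, which is optimistic.

```python
def estimated_hours(n_transactions, seconds_per_item, parallel_workers=1):
    """Rough wall-clock hours to classify everything.

    Assumes each item takes the same time and workers scale perfectly.
    """
    if n_transactions < 0 or seconds_per_item < 0 or parallel_workers < 1:
        raise ValueError("counts and latency must be non-negative, workers >= 1")
    return n_transactions * seconds_per_item / parallel_workers / 3600


# Hypothetical inputs: one million transactions at half a second each.
single_worker = estimated_hours(1_000_000, 0.5)
four_workers = estimated_hours(1_000_000, 0.5, parallel_workers=4)
print(f"1 worker: {single_worker:.1f} h, 4 workers: {four_workers:.1f} h")
```

You still have to measure the per-item latency once to plug it in, so a quick trial run and the estimate end up going together anyway.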
Not every lottery winner has a detailed strategy.