Thanks man, this is incredible work. I really appreciate the detail you went into.
I've been chewing on whether there was a miracle that could make embeddings 10x faster for my search app, which uses minilmv3. Sounds like there is :) I never would have dreamed. I'll definitely be trying potion-base in my library for Flutter x ONNX.
EDIT: I was thanking you for the thorough benchmarking, then it dawned on me you were on the team that built the model - fantastic work, I can't wait to try this. And you already have ONNX!
EDIT2: Craziest demo I've seen in a while. I'm seeing 23x faster, after 10 minutes of work.
Thanks so much for the kind words, that's awesome to hear! If you have any ideas or requests, don't hesitate to reach out!