Hey Simon,
Elliott here from Cohere.
We benchmarked against Nomic's models on our suite of datasets spanning text-only, image-only, and mixed modalities. Without publishing additional benchmarks, I'm confident in saying that our model is more performant.
At Cohere, we have not deprecated any of our embedding models since we started (I know because I've been there that long), and if we were ever to do so, I would take into account the concern of ensuring our users still have a way of accessing those models.
One aspect that isn't factored in here is efficiency. Yes, there are strong open-weight models, but if you're punching in the 7B+ weight class, your serving requirements are vastly different from a throughput-efficiency perspective (and so is your query-inference speed).
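As a rough illustration of the scale (my back-of-envelope numbers, not official Cohere figures): just holding the weights of a 7B-parameter model in fp16 takes on the order of 14 GB, before activations or batching overhead.

    # Back-of-envelope weight memory for a 7B-parameter model.
    # Assumption (mine): fp16 weights, i.e. 2 bytes per parameter.
    params = 7e9
    bytes_per_param = 2
    weights_gb = params * bytes_per_param / 1e9
    # ~14 GB of weights alone, which already implies a datacenter-class GPU per replica.
    print(f"~{weights_gb:.0f} GB of weights before activations or batching overhead")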
All food for thought. That being said, if Nomic Embed Vision 1.5 is better than Embed-v4.0 for your use case, I'm happy to hop on a call to discuss where the differential may be.
I don't doubt the new Cohere model is better - but one of the features I value most from an embedding model is having an escape hatch, so I can continue to calculate vectors using that same model far into the future if something happens to the hosting provider.
This matters for embedding models because I'm presumably building up a database of many millions of vectors for later similarity comparisons - so I need to know I'll be able to embed an arbitrary string in the future in order for that investment to still make sense.
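To make that concrete, here's a minimal sketch (my own illustration; embed() is a hypothetical stand-in for whichever model produced the stored vectors): every future query has to pass through that exact same model, or the index is effectively orphaned.

    import numpy as np

    def embed(text: str) -> np.ndarray:
        # Hypothetical stand-in for the model that produced the index.
        # If that model disappears, the stored vectors can no longer be queried.
        raise NotImplementedError

    # Pretend these are vectors computed months ago (normally loaded from a vector DB).
    rng = np.random.default_rng(0)
    stored_vectors = rng.standard_normal((1_000, 256))  # shape: (n_docs, dim)

    def search(query: str, top_k: int = 5) -> np.ndarray:
        q = embed(query)  # this call is the dependency the "escape hatch" protects
        # Cosine similarity between the fresh query vector and the old index.
        sims = stored_vectors @ q / (
            np.linalg.norm(stored_vectors, axis=1) * np.linalg.norm(q)
        )
        return np.argsort(-sims)[:top_k]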
Size doesn't matter much to me; I don't even need to be able to run the model myself. It's more about having an insurance policy for my own peace of mind.
(Even a covenant that says "in the event that Cohere goes out of business this model will be made available under license X" would address this itch for me.)
I'll start off by saying I'm not one of our founders and REALLY wouldn't want to be publicly held accountable for policies or commitments until I've been able to get internal alignment on the things I say.
That being said, since I do manage our Search and Retrieval offering, if we were to deprecate any of our embedding models (which is generally the risk with closed-source models), I would make sure that there is an "escape hatch" for users.
Heard on what your concerns are though :)
To someone building a long-term dataset, I’m not sure what assurances would help. Certainly a personal assurance doesn’t (though you’re kind to offer), and even a corporate statement doesn’t (new owners or C-suite could walk that back anytime). It might take a formal third-party “model escrow” arrangement to be really convincing.
Hey All,
Thanks for engaging! Apologies for the delay, but HN seems to have throttled my account from posting, so I'm answering as fast as I can (or as fast as they will let me).
You're right in the sense that I could wake up tomorrow and Cohere could lay me off, fire me, or I could quit! All of these are possibilities, but the reason I don't want to publicly commit, particularly on our policy of open-sourcing our models if our business is a going concern or if we deprecate our models, is as follows:
1) Cohere is not a going concern. 2) I haven't thought about deprecating any of our embedding models, for the reason that simonw stated!
I wouldn't say I'm a Cohere employee playing PR - I'm responsible for all the search and embedding models and products at Cohere, and I care deeply about how our users perceive, understand, and use our models/products. I'm actually really excited that there is so much engagement this time around (a far cry from 2021, when I started).
For reference in terms of policies: for our SaaS API, I wrote our model deprecation policy (https://docs.cohere.com/docs/deprecations), and we have only deprecated our Rerank-v2.0 models, largely because they were stateless.
Again - happy for all the engagement. Heard on the things we can improve on!
This is a little off-topic and nitpicky, so I waited a day to avoid cluttering comments while the thread was on the front page.
I believe the term "going concern" means exactly the opposite of what you were trying to say here. Generally, comments about pedantry aren't helpful or interesting. This case was amusing to me in the context of assuring people Cohere is likely to stay around by boldly stating Cohere is at risk of being insolvent or ceasing operations ("Cohere is not a going concern"). Beyond that, I think it's pretty interesting how understandable it is to look at the term without knowing its meaning and assume the presence of the word "concern" must mean people are concerned about it going [bankrupt?].
I'm sure given the context nobody got the wrong impression. If anything, it makes me wonder if the term could, at least in informal contexts, reach a point of semantic inversion.
> I'm not one of our founders and REALLY wouldn't want to be publicly held accountable for policies or commitments
I don't mean to phrase this in a hostile way, but then what is even the point of posting? Your word means nothing. You are not in a position to promise anything. You could wake up one morning and find yourself laid off with all your accounts terminated.
And the fact that a Cohere employee is playing PR trying to deflect this issue gives me less faith, not more.
I can claim that my car is able to fly. That does not mean pressing the gas pedal makes it generate lift.
What a disingenuous comparison. The contention here is organizational politics, not physics.
Hey Elliott,
Andriy, co-founder at Nomic here! Congrats on Embed v4 - the more embeddings the merrier!
Embed v1.5 is a 1.5-year-old model!
You should check out our latest comparable open-weights, multimodal embedding model that's designed for text, PDFs, and images! I can't directly say anything about performance relative to Embed v4, as you guys didn't publish evals on the Vidore-V2 open benchmark!
https://www.nomic.ai/blog/posts/nomic-embed-multimodal
Hey Andriy!
We actually did run benchmarks internally against your models since they are open weights - however, the license on the 3B multimodal model (https://huggingface.co/nomic-ai/nomic-embed-multimodal-3b/bl...) doesn't permit us to include the results in the marketing of products/services. Rest assured, we know how our model stacks up against yours :)
In any case, we didn't publish evals on Vidore-V2 specifically, but we did benchmark on it internally.