Still, it is technically correct. The model produces a next-token likelihood distribution, and then you apply a sampling strategy to produce a sequence of tokens.
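The split is easy to see in code. Here's a minimal sketch (purely illustrative; `fake_model` is a made-up stand-in, not any real LLM API): the model only ever emits a next-token distribution, and the sampling strategy (plain temperature sampling here) is what turns those distributions into a token sequence.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB_SIZE = 5

def fake_model(context):
    """Stand-in for an LLM forward pass: returns a next-token distribution."""
    logits = rng.normal(size=VOCAB_SIZE)   # pretend these depend on `context`
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def sample_sequence(steps=10, temperature=0.8):
    """Apply a sampling strategy (temperature sampling) on top of the model."""
    tokens = []
    for _ in range(steps):
        probs = fake_model(tokens)             # model: distribution only
        scaled = probs ** (1.0 / temperature)  # strategy: sharpen or flatten it
        scaled /= scaled.sum()
        tokens.append(int(rng.choice(VOCAB_SIZE, p=scaled)))
    return tokens

print(sample_sequence())
```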
Depends on your definition of the model. Most people would be pretty upset with the usual LLM providers if they drastically changed the sampling strategy for the worse and claimed to not have changed the model at all.
Tailoring the message to the audience is really a fundamental principle of good communication.
Scientists and academics demand an entirely different level of rigor compared to customers of LLM providers.
Sure, but they went slightly overboard with that headline, and they knew it. Oh well, their paper is getting a lot of eyes and discussion, so it's a success.
I feel like, if the feedback on your paper is "this is overdone / they claim more than they prove / it's kinda hype-ish," you're going to get fewer references in future papers.
That would seem to run counter to the "impact" goal of research.
Fair enough, that might be more my personal opinion than sound advice for successful research. Also, I understand that you have a very limited window to get your research noticed in this area; who knows whether it will still be relevant two years down the line.
LLM providers are in the stone age with sampling today, and it's on purpose: better sampling algorithms make the diversity of synthetically generated data too high, which leaves your model especially vulnerable to distillation attacks.
This is why we use top_p/top_k on the big 3 closed-source models despite min_p and far better LLM sampling algorithms existing since 2023 (or, in TFS's case, since 2019).
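For anyone following along, here's a rough sketch of the difference (illustrative only; the `p` and `min_p` values are arbitrary, and this is not how any particular provider implements them): top_p truncates by cumulative probability mass, while min_p truncates relative to the most likely token.

```python
import numpy as np

def top_p_filter(probs, p=0.9):
    """Top-p (nucleus): keep the smallest set of tokens whose cumulative mass >= p."""
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    keep = order[: np.searchsorted(cumulative, p) + 1]
    mask = np.zeros_like(probs, dtype=bool)
    mask[keep] = True
    filtered = np.where(mask, probs, 0.0)
    return filtered / filtered.sum()

def min_p_filter(probs, min_p=0.2):
    """min_p: keep tokens whose probability is at least min_p * max(probs)."""
    filtered = np.where(probs >= min_p * probs.max(), probs, 0.0)
    return filtered / filtered.sum()

# A peaked toy distribution: min_p prunes the tail relative to the top token,
# while top_p keeps adding tail tokens until the cumulative threshold is met.
probs = np.array([0.60, 0.15, 0.10, 0.08, 0.04, 0.03])
print("top_p :", top_p_filter(probs, p=0.9))    # keeps the top 4 tokens
print("min_p :", min_p_filter(probs, min_p=0.2))  # keeps only the top 2
```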
So, scientists came up with a very specific term for a very specific function, its meaning got twisted by commercial actors and the general public, and it's now the scientists' fault if they keep using it in the original, very specific sense?
I agree it is technically correct, but I still think it is the research-paper equivalent of clickbait (and considering enough people misunderstood this for them to issue a semi-retraction, that seems reasonable).
I disagree. Within the research community (which is the target of the paper), that title means something very precise and not at all clickbaity. It's irrelevant that the rest of the Internet has an inaccurate notion of "model" and other very specific terms.
In a field with as much public visibility as this one, it is naive to think only of the academic target audience, especially when choosing a title like this. As a researcher you are responsible for communicating your findings both to other experts and to outsiders, and that includes choosing appropriate titles. (Though I think we fundamentally disagree about the role of researchers here.) It's like writing a title that says "drinking only 200 ml of water a day leads to weight loss", which is technically true but misleading.