An experiment is not a proof.

If this is the level of one of the contributors to the OpenTSLM paper (which you very obviously are), no wonder due diligence wasn't done properly.

It’s less about proof and more about demonstrating a new capability that TSLMs enable. To be fair, the paper did test standard LLMs, which consistently underperformed. @iLoveOncall, can you point to examples where out-of-the-box models achieved good results on multiple time series? Also, what kind of time-series data did you analyze with Claude 3.5? What exactly did you predict, and how did you assess reasoning capabilities?
