Hacker News

tinyhouse 3 days ago

This is great, as we need more solutions that help people train and fine-tune models. There is already a lot of open source in this space, and there are companies behind some of the popular packages, like Unsloth, but the more the merrier, especially given that Thinking Machines has the expertise and resources to build something that lasts.

NitpickLawyer 3 days ago

Yeah, I'm really curious about how they train stacked multi-tenant LoRAs at the same time. If this gets commoditised enough, it could be interesting to try "end of the day fine-tunes on daily conversations" and see where that leads. Or targeted RL on "missed / rejected tasks" for an agent, once you have enough samples for a run, and so on.