Interesting post. Did the difference in wall clock training time take the reduction in cold start time into account? Seems like that could be a significant factor for small jobs and negligible for large ones.
Higher abstraction than Tinker, more flexible than OpenAI RFT. I like the integration with production inference, so I can switch between training and inference for continuous learning.
Will the rate limits go higher? How about other models? Qwen 2.5 is nice, but Qwen 3 is nicer.