Since they hide their thinking traces, it really doesn't make much sense. One of the degradations they said they fixed in a recent blog post was that if you left Claude Code idle for too long, the session would be rehydrated without the thinking traces in the context, and that degraded performance. If losing the traces hurts that much, direct forms of distillation from the visible output alone wouldn't be expected to get results as good as the ones they are getting.

However, they could have used it as a judge or in some similar role during training.