They might be converging somewhat. The ultimate limiting factor is training data. Eventually I think they will converge and then the competition will be on memory and compute efficiency, with the best being the smallest maximally capable model.
They might be converging somewhat. The ultimate limiting factor is training data. Eventually I think they will converge and then the competition will be on memory and compute efficiency, with the best being the smallest maximally capable model.