Technically, modern LLMs are handicapped on translation tasks compared to the original transformer architecture. The original encoder-decoder transformer got to see future context as well as past tokens.
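Rough sketch of what I mean, just comparing the attention masks (variable names are mine, not from any library): a decoder-only LLM uses a causal mask, while the original transformer's encoder attends over the whole source sentence, future tokens included.

    import numpy as np

    seq_len = 5

    # Decoder-only LLM: causal mask, position i can only attend to positions <= i.
    causal_mask = np.tril(np.ones((seq_len, seq_len), dtype=int))

    # Original transformer encoder: bidirectional mask, every source token can
    # attend to every other source token, including "future" ones.
    bidirectional_mask = np.ones((seq_len, seq_len), dtype=int)

    print(causal_mask)
    print(bidirectional_mask)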
Okay, but I'm not really concerned with the current state of the art of any specific technology; I'm concerned with what the state of the art will be in 20 years.