This is how language models have worked since their inception, and has been steadily improved since about 2018.
See embedding models.
> they removed the tokenizer altogether
This is an active research topic, no real solution in sight yet.
This is how language models have worked since their inception, and has been steadily improved since about 2018.
See embedding models.
> they removed the tokenizer altogether
This is an active research topic, no real solution in sight yet.