Could it be that Anthropic is using the Chinese characters trick to consume fewer tokens behind the scenes?
It used a Chinese character instead of the word "true"
Aren’t Unicode characters generally treated as 2 tokens to avoid a huge vocabulary?
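For what it's worth, here is a rough way to sanity-check the "2 tokens" guess. It only counts UTF-8 bytes, which is the worst case for a byte-level BPE tokenizer (one token per byte when no merges apply). That the model in question uses byte-level BPE is an assumption, and the character 真 is just an illustrative stand-in, since the thread doesn't say which character was actually produced. Real token counts depend on the tokenizer's vocabulary and can be lower than the byte count.

```python
def utf8_len(text: str) -> int:
    """Return the number of bytes in the UTF-8 encoding of text."""
    return len(text.encode("utf-8"))


# Hypothetical comparison: the English word vs. a single Chinese character.
# 真 is an illustrative choice, not necessarily the character from the thread.
word = "true"
char = "真"

# In a byte-level BPE tokenizer (an assumption here), the byte count is an
# upper bound on the token count; learned merges can reduce it.
print(f"{word!r}: {utf8_len(word)} bytes")
print(f"{char!r}: {utf8_len(char)} bytes")
```

So a single CJK character is at most 3 byte-level tokens under that assumption, while a common English word like "true" is often a single token after merges, which makes the substitution unlikely to be a guaranteed saving.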