You won't gain much performance from a tokenizer/vocabulary specific to one coding language; everything else benefits from a larger model size. You can get distilled models that will outperform or compete with your single-domain coding model.