You can target the minimum instruction set and it'll run everywhere, albeit very slowly. Or perhaps you ship a fat binary to get reasonable performance in most cases.
This isn't easy, but it can be done (and it is being done on x86, despite the constantly evolving variations of AVX).
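
To make that concrete, here's a rough sketch of one way to do the runtime-dispatch flavour of this on x86 with GCC or Clang. It is only an illustration, not how any particular project does it: the function names (`sum_baseline`, `sum_avx2`, `pick_sum`, `sum_u32`) are made up for the example, and it assumes the compiler supports `__attribute__((target(...)))` and `__builtin_cpu_supports`. On any other compiler or architecture it simply falls back to the baseline path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Baseline variant: built for the minimum instruction set, runs everywhere. */
static uint64_t sum_baseline(const uint32_t *v, size_t n) {
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) total += v[i];
    return total;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
/* Same loop, but the compiler is allowed to emit AVX2 code for this
 * function only. It must never be called on a CPU without AVX2. */
__attribute__((target("avx2")))
static uint64_t sum_avx2(const uint32_t *v, size_t n) {
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) total += v[i];
    return total;
}
#define HAVE_AVX2_VARIANT 1
#endif

typedef uint64_t (*sum_fn)(const uint32_t *, size_t);

/* Pick the best variant the running CPU supports (hypothetical helper). */
static sum_fn pick_sum(void) {
#ifdef HAVE_AVX2_VARIANT
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return sum_avx2;
#endif
    return sum_baseline;
}

/* Public entry point: resolves the variant on first use, then reuses it.
 * The lazy initialisation here is not synchronised; a real program would
 * resolve it once at startup or guard it properly. */
uint64_t sum_u32(const uint32_t *v, size_t n) {
    static sum_fn impl = NULL;
    assert(v != NULL || n == 0);
    if (!impl) impl = pick_sum();
    return impl(v, n);
}
```

Callers just use `sum_u32(data, count)` and never see which variant ran. The "not easy" part is everything this sketch skips: every hot function needs a variant per instruction-set level you care about, each one has to be tested on hardware that actually has it, and the list of levels keeps growing.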