Sure, but I have to support a range of target CPUs in the consumer desktop market, and the older CPUs are the ones that need optimizations the most. That means NEON on ARM64 and AVX2 or SSE2 through SSE4 on x64. Time spent on higher vector instruction sets benefits a smaller fraction of the user base, one that already has better performance, and that's especially problematic if the algorithm has to be reworked to take best advantage of the higher extensions.
AVX-512 is also in bad shape market-wise, despite its amazing feature set and how long it's been since its initial release. The Steam Hardware Survey, which skews toward the higher end of the market, shows only 18% of the user base having AVX-512 support. And even that number comes in spite of Intel's best efforts to reverse progress by shipping all new consumer CPUs with AVX-512 support disabled.