I really wish Zig would add SIMD intrinsics. There are many SIMD algorithms you could implement, but you have to switch back to C for them, because they depend on operations beyond what LLVM provides for vectors.
Issue reports for missing SIMD functionality are welcome. Obviously we're not going to copy the C intrinsics directly, since Zig SIMD is portable, but everything should be expressible.
It doesn't really have to do with what operations LLVM provides for vectors. LLVM supports all the SIMD intrinsics of clang, and LLVM is one of many backends of zig.
Like what?
You can also call LLVM intrinsics directly if this doesn't work.
Just a couple of days ago, I wanted to implement specialized StreamVByte decoding in Zig, but @shuffle() needs the mask to be compile-time known, while _mm_shuffle_epi8() works just fine with a dynamic mask. I also remember that some time ago I couldn't find an alternative to _mm_alignr_epi8().
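For anyone following along, here is a rough scalar sketch in C of what `_mm_shuffle_epi8` (`pshufb`) does with a runtime mask. The name `model_pshufb` is made up for illustration and is not part of any library; it only models the per-byte semantics and is not how you would write the decoder.

```c
#include <stdint.h>

/* Scalar model of the 128-bit pshufb / _mm_shuffle_epi8 semantics.
   model_pshufb is a hypothetical helper name, not a real intrinsic.
   out must not alias src or mask. */
static void model_pshufb(const uint8_t src[16], const uint8_t mask[16],
                         uint8_t out[16]) {
    for (int i = 0; i < 16; i++) {
        if (mask[i] & 0x80) {
            /* High bit of the mask byte set: the output byte is zeroed. */
            out[i] = 0;
        } else {
            /* Otherwise only the low 4 bits are used as a source index. */
            out[i] = src[mask[i] & 0x0F];
        }
    }
}
```

The mask here is ordinary runtime data, which is what StreamVByte relies on: the shuffle mask is typically looked up in a table indexed by the control byte, so it cannot be a compile-time constant.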
`_mm_alignr_epi8` is a compile-time known shuffle that gets optimized well by LLVM [1].
If you need the exact behavior of `pshufb` you can use asm or the llvm intrinsic [2]. iirc, I once got the compiler to emit a `pshufb` for a runtime shuffle... that always guaranteed indices in the 0..15 range?
Ironically, I also wanted to try zig by doing a StreamVByte implementation, but got derailed by the lack of SSE/AVX intrinsics support.
[1] https://github.com/aqrit/sse2zig/blob/444ed8d129625ab5deec34...
[2] https://github.com/aqrit/sse2zig/blob/444ed8d129625ab5deec34...
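To illustrate why `_mm_alignr_epi8` counts as a compile-time-known shuffle, here is a scalar sketch in C of its 128-bit semantics. The name `model_alignr` is hypothetical, and this is a model of the instruction's behavior, not the code from the linked repository.

```c
#include <stdint.h>

/* Scalar model of _mm_alignr_epi8(a, b, n) for 128-bit vectors.
   model_alignr is a hypothetical helper name, not a real intrinsic.
   Conceptually: concatenate a (high half) and b (low half) into 32 bytes,
   shift right by n bytes, keep the low 16 bytes. out must not alias a or b. */
static void model_alignr(const uint8_t a[16], const uint8_t b[16],
                         unsigned n, uint8_t out[16]) {
    for (unsigned i = 0; i < 16; i++) {
        unsigned idx = i + n;          /* index into the 32-byte concatenation */
        if (idx < 16) {
            out[i] = b[idx];           /* still inside the low half */
        } else if (idx < 32) {
            out[i] = a[idx - 16];      /* spilled over into the high half */
        } else {
            out[i] = 0;                /* shifted past both inputs */
        }
    }
}
```

In the real intrinsic, `n` is an immediate, so every `i + n` above is a constant. That makes the whole operation a fixed index list over two input vectors, which is the kind of thing a shuffle with a compile-time mask can express.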
Oh, that's actually quite neat. It did not occur to me that you can use @shuffle with a compile-time mask and have it optimized to a specialized instruction.