Thanks for the tip! Your comment prompted me to refactor the quote handling - replaced the bit-by-bit state machine loop with prefix XOR, and switched to adjacent bit masking for double-quote detection. Seeing a nice performance improvement in benchmarks. Go's simd/archsimd doesn't have CLMUL yet, but the XOR cascade works well. Appreciate your feedback!
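
For anyone following along, here's a rough sketch of the idea in plain Go. It is illustrative only, not the exact code from the refactor: `quoteBits`, `prefixXOR`, and `adjacentQuotes` are names made up for this example, the scalar loop in `quoteBits` stands in for the SIMD compare step, and the 64-byte chunk size is an assumption.

```go
package main

import "fmt"

// quoteBits sets bit i when chunk[i] is a double quote.
// Only the first 64 bytes are looked at. (Hypothetical helper; a SIMD
// version would get this mask from a byte compare instead of a loop.)
func quoteBits(chunk []byte) uint64 {
	var m uint64
	for i, c := range chunk {
		if i >= 64 {
			break
		}
		if c == '"' {
			m |= uint64(1) << i
		}
	}
	return m
}

// prefixXOR returns a mask whose bit i is the XOR of bits 0..i of x.
// Applied to a quote mask, the set bits run from each opening quote
// (inclusive) to its closing quote (exclusive). This shift-and-XOR
// cascade is the stand-in for a carry-less multiply by all ones.
func prefixXOR(x uint64) uint64 {
	x ^= x << 1
	x ^= x << 2
	x ^= x << 4
	x ^= x << 8
	x ^= x << 16
	x ^= x << 32
	return x
}

// adjacentQuotes marks the first quote of every pair of neighbouring
// quotes, i.e. candidates for an escaped "" sequence.
func adjacentQuotes(q uint64) uint64 {
	return q & (q >> 1)
}

func main() {
	chunk := []byte(`a,"b,c",d`)
	q := quoteBits(chunk)
	inside := prefixXOR(q)
	fmt.Printf("quotes: %09b\n", q)
	fmt.Printf("inside: %09b\n", inside)

	escaped := []byte(`"a""b"`)
	fmt.Printf("pairs:  %06b\n", adjacentQuotes(quoteBits(escaped)))
}
```

The bits print least-significant on the right, so byte 0 of the chunk is the rightmost digit. This sketch also ignores the quote state carried over from the previous chunk, which a real parser would need to XOR into the result.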