Is this a technical impossibility, or has it just not been done yet? Could a library support generating intrinsics for a large set of architectures?

The full scope of what SIMD is used for is much larger than parallelizing evaluation of numeric types and algorithms.

For example, it is used for parallel evaluation of complex constraints on unrelated types simultaneously while they are packed into a single vector. Think of a WHERE clause on an arbitrary SQL schema evaluated fully in parallel in a handful of clock cycles. SIMD turns out to be brilliant for this, but it looks nothing like auto-vectorization.

None of the SIMD libraries like Google Highway cover this case.

I don't quite get how something like Highway doesn't cover this while raw intrinsics do.

Can you explain the usecase more concretely?

Almost literally what I stated. Consider a row in a Postgres table or similar. Convert the entire WHERE clause across all columns in that table into a very short sequence of SIMD instructions against the same memory. All of the columns, regardless of type, are evaluated simultaneously using SIMD. For many complex constraints you can match rows in single-digit clock cycles, even across many unrelated types. This is much faster than using secondary indexes in many cases.

It isn’t hypothetical; I’ve shipped systems that worked this way. You can match search patterns across a random dozen columns in a schema of hundreds of columns at essentially full memory bandwidth.

Google Highway gets mentioned in the article.

There is Google’s Highway, which provides an abstraction layer. It is used by NumPy.