Highly recommend https://www.vldb.org/pvldb/vol16/p2132-afroozeh.pdf for a comparable algorithm. It generalizes to arbitrary input and output bit widths.
Good work; I wish I'd seen it earlier, as it overlaps with a lot of my recent work. I'm actually planning to release new SOTAs on zigzag/delta/delta-of-delta/xor-with-previous coding next week. Some areas the work doesn't give enough attention to (IMO): register pressure, kernel fusion, and cache locality (with respect to multi-pass approaches). They also fudge a lot of the numbers and comparisons, e.g. pitching themselves against Parquet (storage-optimised) when Arrow (compute-optimised) is the most comparable tech and the obvious target to beat. They definitely improve on the current best work, but only by a modest margin.
I'm also skeptical of the "unified" paradigm: performance improvements are often realised by stepping away from generalisation and exploiting the specifics of a given problem; under a unified paradigm there's definite room for improvement vs Arrow, but that's very unlikely to bring you all the way to theoretically optimal performance.
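For anyone unfamiliar with the codings mentioned above, here's a rough scalar sketch in Python of zigzag, delta, delta-of-delta and xor-with-previous, plus a toy version of what "kernel fusion" means (delta and zigzag in one loop instead of two passes over the data). This is not the paper's code or anyone's optimised kernel; all function names are made up for illustration, a starting "previous" value of 0 is an assumption, and zigzag assumes values fit in a signed 64-bit integer.

```python
from typing import List

MASK64 = (1 << 64) - 1  # assumption: values fit in signed 64-bit


def zigzag_encode(n: int) -> int:
    """Map signed to unsigned so small magnitudes stay small: 0,-1,1,-2 -> 0,1,2,3."""
    return ((n << 1) ^ (n >> 63)) & MASK64


def zigzag_decode(z: int) -> int:
    """Inverse of zigzag_encode."""
    return (z >> 1) ^ -(z & 1)


def delta_encode(values: List[int]) -> List[int]:
    """Store each value as the difference from its predecessor (first vs 0)."""
    out, prev = [], 0
    for v in values:
        out.append(v - prev)
        prev = v
    return out


def delta_decode(deltas: List[int]) -> List[int]:
    """Running sum undoes delta_encode."""
    out, acc = [], 0
    for d in deltas:
        acc += d
        out.append(acc)
    return out


def delta_of_delta_encode(values: List[int]) -> List[int]:
    """Delta applied twice; useful when the step size itself is nearly constant."""
    return delta_encode(delta_encode(values))


def delta_of_delta_decode(dods: List[int]) -> List[int]:
    return delta_decode(delta_decode(dods))


def xor_prev_encode(values: List[int]) -> List[int]:
    """XOR each non-negative value (e.g. a float's bit pattern) with its predecessor."""
    out, prev = [], 0
    for v in values:
        out.append(v ^ prev)
        prev = v
    return out


def xor_prev_decode(xors: List[int]) -> List[int]:
    out, prev = [], 0
    for x in xors:
        prev ^= x
        out.append(prev)
    return out


def fused_delta_zigzag(values: List[int]) -> List[int]:
    """Delta then zigzag in a single loop: no intermediate array between the steps."""
    out, prev = [], 0
    for v in values:
        out.append(zigzag_encode(v - prev))
        prev = v
    return out


if __name__ == "__main__":
    ts = [100, 102, 101, 105, 105]
    print(delta_encode(ts))
    print(fused_delta_zigzag(ts))
    print(delta_of_delta_encode(ts))
```

The fused version produces the same output as running `delta_encode` and then `zigzag_encode` over the result; the difference is only that the data is traversed once, which is where the register-pressure and cache-locality trade-offs start to matter in a real SIMD implementation.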