`HASH` looks like a slow function ... does something like `rand() + rowid & 0` or `((rand() * p53 + rowid) % p53) / p53` work?
`HASH` looks like a slow function ... does something like `rand() + rowid & 0` or `((rand() * p53 + rowid) % p53) / p53` work?
Generally, table scans dominate the cost of sampling, so evaluating a "slow" function once per row doesn't matter. What does matter is whether you can push filtering expressions down into the scans to eliminate I/O and decoding work early. Some systems have trouble pushing down RAND efficiency, which can make alternatives like the deterministic function I shared advantageous.