parquet is optimized for storage and compresses well (=> smaller files)
feather is optimized for fast reading
Given the cost of storage is getting cheaper, wouldn't most firms want to use feather for analytic performance? But everyone uses parquet.
There's definitely an "everyone uses it because everyone uses it" effect.
Feather might be a better fit for some use cases, but parquet has fantastic support and is still a pretty good choice for the things that feather does.
Unless they're really focused on eking out every bit of read performance, people often opt for the well-supported path instead.
You can still gain a lot of performance by doing less I/O.
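To make the "less I/O" point concrete, here is a rough sketch using only the standard library. It is not Parquet or Feather, just a toy column-oriented layout I made up for illustration (the `write_columns` and `read_column` helpers and the sample data are hypothetical). It shows that when each column is stored contiguously, reading one column touches only that column's bytes.

```python
import io
import struct


def write_columns(buf, columns):
    """Write each column as one contiguous block of little-endian float64 values.

    Returns a toy index mapping column name -> (byte offset, value count).
    """
    index = {}
    for name, values in columns.items():
        index[name] = (buf.tell(), len(values))
        buf.write(struct.pack(f"<{len(values)}d", *values))
    return index


def read_column(buf, index, name):
    """Seek straight to one column's block and read only its bytes."""
    offset, count = index[name]
    buf.seek(offset)
    raw = buf.read(count * 8)  # 8 bytes per float64
    return list(struct.unpack(f"<{count}d", raw)), len(raw)


# Hypothetical sample data: three columns, four rows.
columns = {
    "price": [1.5, 2.5, 3.5, 4.5],
    "qty": [10.0, 20.0, 30.0, 40.0],
    "tax": [0.1, 0.2, 0.3, 0.4],
}

buf = io.BytesIO()
index = write_columns(buf, columns)
total_bytes = buf.tell()

qty, bytes_read = read_column(buf, index, "qty")
print(f"read {bytes_read} of {total_bytes} bytes")
```

With real files the same idea is usually expressed as column projection, for example passing `columns=[...]` to `pyarrow.parquet.read_table`, so the reader can skip the columns you didn't ask for.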
What people have done in the face of cheaper storage is store more data.
Cheaper storage never really reached the cloud providers' pricing, and for self-hosting it has recently gotten even more expensive due to AI bs.
And now there's Lance! https://lance.org/