Hacker News

rennokki 13 hours ago

> Uses jq for TB json files

> Hadoop: bro

> Spark: bro

> hive: bro

> data team: bro

eevmanu 7 hours ago

This reminded me of the following article:

<https://adamdrake.com/command-line-tools-can-be-235x-faster-...>

  Command-line Tools can be 235x Faster than your Hadoop Cluster (2014)

  Conclusion: Hopefully this has illustrated some points about using and abusing tools like Hadoop for data processing tasks that can better be accomplished on a single machine with simple shell commands and tools.

f311a 10 hours ago

jq is very convenient, even when your files are larger than 100 GB. I often need to extract one field from huge JSON Lines files, so I just pipe them through jq to get the results. It's slower, but implementing proper data processing would take more time.
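
For illustration only, here is a rough Python sketch of the same "pull one field out of a huge JSON Lines file" workflow. It is not what the commenter runs (they use jq); it just shows the streaming idea of reading one record at a time so memory use stays flat regardless of file size. The function name `extract_field`, the script name, and the field name `user_id` in the usage comment are all hypothetical.

```python
import json
import sys
from typing import Any, Iterable, Iterator


def extract_field(lines: Iterable[str], field: str) -> Iterator[Any]:
    """Yield the value of `field` from each JSON Lines record.

    Records are parsed one line at a time, so the whole file is never
    held in memory. Blank lines and records lacking the field are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if isinstance(record, dict) and field in record:
            yield record[field]


if __name__ == "__main__":
    # Hypothetical usage: python extract.py user_id < huge.jsonl
    field_name = sys.argv[1]
    for value in extract_field(sys.stdin, field_name):
        print(value)
```

One difference worth noting: a jq filter such as `.user_id` prints `null` for records that lack the field, whereas this sketch silently skips them. That choice is an assumption made for the example, not something stated in the comment.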

anonymoushn 10 hours ago

Are those tools known for their fast JSON parsers?
