
Not really. You said SparkSQL doesn't support gz, which is incorrect, and that was the thrust of my comment. The anecdote about Parquet is orthogonal to gz support.

Pedantic sidebar: HDFS isn't a file format, it's a distributed file system layered over a traditional on-disk filesystem. For example, you might have JSON logs, in a gzip-compressed file, tracked in HDFS, stored on disk in an ext4-formatted filesystem.
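To make the layering concrete, here is a rough sketch using only the Python standard library. It covers just the two innermost layers (JSON records inside a gzip container) on whatever local filesystem you happen to have; HDFS and ext4 sit below that and don't change the bytes. The file name, the helper functions, and the sample records are all made up for illustration.

```python
import gzip
import json
import os
import tempfile


def write_gz_json_lines(path, records):
    """Write one JSON object per line into a gzip-compressed file."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_gz_json_lines(path):
    """Decompress the file and parse each non-empty line as JSON."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical sample log records.
records = [
    {"level": "info", "msg": "started"},
    {"level": "error", "msg": "disk full"},
]

# Hypothetical file name; the directory is a throwaway temp dir.
path = os.path.join(tempfile.mkdtemp(), "app.log.json.gz")
write_gz_json_lines(path, records)
roundtrip = read_gz_json_lines(path)

# The first two bytes on disk are the gzip magic number, not JSON text.
with open(path, "rb") as raw:
    magic = raw.read(2)
```

The point is that the data format (JSON) and the compression container (gzip) are separate concerns, and the filesystem underneath is a third one.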



