
Arrow is really the future here



Isn’t Apache Arrow an in-memory format that the various DataFrame libraries can standardise on to interact with each other, for example over inter-process communication (IPC)?

My understanding is that your raw data on disk is still in a format such as Parquet, but when you load that Parquet into your application it’s held in memory as Arrow for processing?


Arrow also has its own on-disk format called Feather - https://arrow.apache.org/docs/python/feather.html



