Drill to Detail Ep.44 'Pandas, Apache Arrow and In-Memory Analytics' With Special Guest Wes McKinney

December 7, 2017 Mark Rittman

Mark is joined in this episode of Drill to Detail by Wes McKinney to talk about the origins of the Python Pandas open-source package for data analysis, his subsequent work as a contributor to the Kudu (incubating) and Parquet projects within the Apache Software Foundation, and Arrow, an in-memory data structure specification for engineers building data systems and the de facto standard for columnar in-memory processing and interchange.

Python Data Analysis Library
"Ibis on Impala: Python at Scale for Data Science"
Drill To Detail Ep.3 'Apache Kudu And Cloudera's Analytic Platform' With Special Guest Mike Percy
Apache Arrow homepage
"Apache Arrow and the '10 Things I Hate About pandas'"
"Apache Arrow vs. Parquet and ORC: Do we really need a third Apache project for columnar data representation?"
"Some comments to Daniel Abadi's blog about Apache Arrow"
Wes McKinney homepage
