Building Rich, High Performance Tools for Practical Data Analysis

Wes McKinney (DataPad Inc.)
Data Science, Sutton Center / Sutton South (NY Hilton)
Average rating: 2.40 (5 ratings)

This will be a somewhat advanced, technical talk connecting computer science concepts such as data structure design and algorithms with the details of building intuitive, high-performance, and flexible tools for data analysis. It draws on lessons learned and experience gained building pandas, a widely used, battle-tested data analysis toolkit for Python. I will give a number of short code demonstrations to illustrate the various points.

Important topics include missing data handling, simple and hierarchical indexing, efficient serialization, pivoting and reshaping, grouped data aggregation and transformation, time series-specific computations, and merge and join algorithms. I will also discuss structuring data for visualization and for output to other tools, such as the JavaScript visualization toolkit D3.js.
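
As a rough illustration of several of these topics, here is a minimal pandas sketch. It is not taken from the talk: the sales table, the manager lookup, and all column names are hypothetical, and filling a gap with the group mean is just one of many possible missing-data strategies.

```python
import numpy as np
import pandas as pd

# Hypothetical sales data; one revenue value is missing (NaN).
sales = pd.DataFrame({
    "region": ["east", "east", "west", "west"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [10.0, np.nan, 7.0, 9.0],
})

# Missing data handling + grouped transformation:
# fill each gap with the mean of its own region.
sales["revenue"] = sales.groupby("region")["revenue"].transform(
    lambda s: s.fillna(s.mean())
)

# Hierarchical indexing: (region, quarter) becomes a two-level MultiIndex.
indexed = sales.set_index(["region", "quarter"])

# Reshaping: move the inner index level ("quarter") into columns.
wide = indexed["revenue"].unstack("quarter")

# Grouped aggregation: total revenue per region.
totals = sales.groupby("region")["revenue"].sum()

# Merge/join: attach a hypothetical manager lookup table.
managers = pd.DataFrame({"region": ["east", "west"], "manager": ["Ann", "Bo"]})
merged = sales.merge(managers, on="region", how="left")

# Output for a JavaScript toolkit such as D3.js: a JSON array of row objects.
records_json = merged.to_json(orient="records")
```

The `orient="records"` layout is chosen here because an array of row objects is a shape that JavaScript charting code can typically iterate over directly; other orientations may suit other consumers better.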

Wes McKinney

DataPad Inc.

Building analytics libraries and research tools for quantitative finance and other fields. Actively involved in data analysis and statistics applications in the scientific Python community. Author of the pandas library and contributor to statsmodels. Author of the upcoming “Python for Data Analysis” from O’Reilly Media. CEO of Lambda Foundry, Inc.
