Agile Data Wrangling and Web-based Visualizations

Chang She (DataPad)
Design Ballroom CD

Data does not speak for itself. While successful data science depends on data and modeling, visualization is also essential for communicating the patterns in the data and the context around them. Although many solutions are available for creating rich interactive graphs and charts, they are generally too far removed from the analysis process itself. This creates inefficiencies at the boundary between analysis and visualization and makes rapid iteration on research more difficult. Minimizing these inefficiencies is the focus of this talk. By combining agile data analysis tools with web-based visualization libraries, we can optimize the data scientist’s toolchain for both exploratory research and the presentation of results.

A proper solution needs to take into account all of the components of data analysis that happen between raw data and the final visualization.

  • Data processing – raw data must be loaded and cleaned, and then the model computations must be performed. I will use the pandas library in Python to give an overview of common data operations and research best practices.
  • Data reshaping – the output data, along with supplemental data, must be grouped, merged, and reshaped so that it can be used for visualization. Fast and flexible data wrangling operations are essential for this step, as are easy methods for converting the data into the right format, such as JSON.
  • Data visualization – an efficient visualization toolkit must include reusable components with configurable parameters and default settings that make sense. It must enable the data scientist to create commonly used graphs and charts without too much effort spent on plumbing and scaffolding.
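
The three steps above can be sketched end to end in a few lines of pandas. This is only an illustrative sketch, not the material presented in the talk: the sample data, the column names, and the bar_chart_spec helper are all hypothetical, and the chart spec is a made-up dictionary layout rather than the format of any particular visualization library.

```python
import json

import pandas as pd

# Hypothetical raw data: untidy region labels and one missing sales value.
raw = pd.DataFrame({
    "date": ["2013-01-01", "2013-01-01", "2013-01-02", "2013-01-02"],
    "region": [" east", "West ", "east", "west"],
    "sales": [100.0, None, 150.0, 200.0],
})

# Data processing: parse dates, normalize labels, fill the missing value.
clean = raw.assign(
    date=pd.to_datetime(raw["date"]),
    region=raw["region"].str.strip().str.lower(),
    sales=raw["sales"].fillna(0.0),
)

# Hypothetical supplemental data to merge in.
regions = pd.DataFrame({"region": ["east", "west"], "manager": ["Ann", "Bob"]})

# Data reshaping: merge, group, and pivot to one column per manager.
merged = clean.merge(regions, on="region", how="left")
totals = merged.groupby(["date", "manager"], as_index=False)["sales"].sum()
wide = totals.pivot(index="date", columns="manager", values="sales").reset_index()

# Convert to JSON records that a web front end could consume.
wide["date"] = wide["date"].dt.strftime("%Y-%m-%d")
records = json.loads(wide.to_json(orient="records"))


def bar_chart_spec(records, x, y, title="", width=640, height=400):
    """Hypothetical reusable chart component with default settings.

    Returns a plain dictionary that can be serialized to JSON; the keys
    are an assumed layout, not a real library's schema.
    """
    return {
        "type": "bar",
        "title": title,
        "width": width,
        "height": height,
        "x": [row[x] for row in records],
        "y": [row[y] for row in records],
    }


# Data visualization: build a chart spec and serialize it for the browser.
spec = bar_chart_spec(records, x="date", y="Ann", title="Sales by day")
payload = json.dumps(spec)
```

Keeping the reshaping in pandas and handing the front end a plain JSON payload is one way to narrow the gap between analysis and presentation: changing the grouping or the pivot is a one-line edit, and the chart component does not need to change.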


Chang She is a cofounder of Lambda Foundry. From 2011 to 2012, he served as Assistant Vice President at Barclays Capital, researching quantitative FX strategies and building research infrastructure. From 2006 to 2011, he worked at AQR Capital Management in global equities research and algorithm execution. He graduated from MIT with an M.Eng. in Computer Science and S.B. degrees in Computer Science and Political Science.

