As massive data acquisition and storage become increasingly affordable, a wide variety of enterprises are employing statisticians to engage in sophisticated data analysis. In this talk we highlight the emerging practice of Magnetic, Agile, Deep (MAD) data analysis as a radical departure from traditional Enterprise Data Warehouses and Business Intelligence. We present architecture, techniques, and field experience in providing MAD analytics for businesses and researchers confronted with ever-expanding data sets.
We describe design methodologies that support the agile working style of analysts in these settings. We present data-parallel algorithms for statistical techniques, with a focus on density methods. Finally, we reflect on system features that enable agile design and flexible algorithm development, with lessons for both SQL and MapReduce programming styles.
Brian is a business analyst with a Master's in Biomathematics from the David Geffen School of Medicine at UCLA and a Master's in Pure Mathematics from the University of California, Los Angeles. Brian has more than 17 years of analytic experience and has worked at Yahoo, MySpace, and UCLA. He turns massive peta-scale data sets, from multi-terabyte databases, into strategic decisions. His forte is mapping complicated mathematical models into actionable business solutions. He is a frequent speaker on machine learning topics at the Institute for Business Forecasters, UCLA, UC Berkeley, and SAS, and is the co-author of several relevant white papers, including the MAD Skills paper.
Joseph M. Hellerstein is a Professor of Computer Science at the University of California, Berkeley, whose work focuses on data-centric systems and the way they drive computing. He is an ACM Fellow, an Alfred P. Sloan Research Fellow, and the recipient of two ACM-SIGMOD “Test of Time” awards for his research. In 2010, Fortune Magazine included him in its list of the 50 smartest people in technology, and MIT’s Technology Review magazine included his work on Distributed Programming in its 2010 TR10 list of the 10 technologies “most likely to change our world”. Key ideas from his research have been incorporated into commercial and open-source software from IBM, Oracle, and PostgreSQL. He is a past director of Intel Research Berkeley and has served on the technical advisory boards of a number of technology companies, including EMC, Swivel, and SurveyMonkey.
Comments
Slides were uploaded today and should show up soon!
I can’t wait to share this info with my colleagues. This was one of my favorite sessions!
I wish Strata could involve more people from academic research!