As the world becomes more heavily instrumented, we are collecting massive amounts of raw data, much of which sits unused in log files or data warehouses. At the same time, statistical techniques, cloud computing, and software frameworks have matured to a point where a small team or even a single person can rapidly extract insights and build products on top of this data.
As a case study, this talk will combine datasets from Twitter, LinkedIn, Wikipedia, Mechanical Turk and other sources to extract insights about the O’Reilly Strata Conference and its attendees.
Along the way, we will walk you through the nuts & bolts of building data products using these tools. We will cover coming up with ideas, tracking down or creating data, using visualizations during development, and wiring together a prototype, and we will show some algorithmic tricks that can get you out of a jam.
Pete Skomoroch is a Research Scientist at LinkedIn, focusing on building data-driven products. For the past several years, he has been a consultant at Data Wrangling in Washington, DC, working on projects involving search, finance, and recommendation systems. Before joining LinkedIn, he was the Director of Advanced Analytics at Juice Analytics and a Sr. Research Engineer at AOL Search. He spent the previous 6 years in Boston implementing pattern detection algorithms for streaming sensor data at MIT Lincoln Laboratory and constructing predictive models for large retail datasets at Profitlogic. Pete has a B.S. in Mathematics and Physics from Brandeis University.
Comments
Sure thing, I’ll get that Strata attendee cluster analysis up on my blog or on the O’Reilly Strata site.
There was a neat graph clustering Strata attendees. Would you mind sharing it?