Street Fighting Data Science

Peter Skomoroch (Data Wrangling)
Data Science, Mission City B1

Practical problem solving with data involves more than just visualization or applying the latest machine learning techniques. Intuition, domain knowledge, and reasonable approximations can mean the difference between a successful model and a catastrophic failure. We’ll dive into some best practices I’ve extracted from solving real-world problems like computing trending topics, finding related searches, cleaning election data, and ranking experts on social networks.

New analysts or engineers are often lost when textbook approaches fail on real-world data. Drawing inspiration from problem-solving techniques in mathematics and physics, we will walk through examples that illustrate how to come up with creative solutions and solve problems with big data.

  • Creating Models
  • Sampling & Approximation
  • Finding Edge Cases
  • Testing Extremes
  • Working Backwards
  • Joining to External Data & Crowdsourcing
  • Turning Errors into Improvements

Peter Skomoroch

Data Wrangling

Pete Skomoroch is a Principal Data Scientist at LinkedIn focused on reputation systems, personalization, and creating data-driven products like LinkedIn Skills. Before joining LinkedIn, he was the Director of Advanced Analytics at Juice Analytics and a Sr. Research Engineer at AOL Search. Prior to AOL, he implemented pattern detection algorithms for streaming sensor data at MIT Lincoln Laboratory and constructed predictive models for large retail datasets. Pete has a B.S. in Mathematics and Physics from Brandeis University and blogs at DataWrangling.com.
