Dealing With Bad Data

Q Ethan McCallum (@qethanm)
Data Science, Mission City B1

The biggest problem in data science is … the data itself.

It’s messy, it’s inconsistent, it arrives from myriad sources, and it sometimes changes without warning. Such hurdles distract you from your intended purpose: getting meaningful insight out of your data.

Q Ethan McCallum, consultant and author of Parallel R (O’Reilly), will walk through the various forms of bad data and explore common pitfalls that can derail your research efforts. Most of all, he’ll explain ways to handle bad data so you can get back to work.

Q Ethan McCallum

Q Ethan McCallum is a consultant, writer, and technology enthusiast, though perhaps not in that order. Most recently, he put the finishing touches on Parallel R (O’Reilly).
