Introduction to Apache Hadoop

Sarah Sproehnle (Cloudera, Inc.)
Data Science Ballroom G
Tutorial
Please note: to attend, your registration must include Tutorials on Tuesday.
Average rating: 4.71 (7 ratings)

This tutorial provides a solid foundation for those seeking to understand large-scale data processing with MapReduce and Hadoop, along with the associated ecosystem. It is intended for those who are new to Hadoop and want to understand where Hadoop is appropriate and how it fits with existing systems.

The agenda will include:

  • The rationale for Hadoop
  • Understanding the Hadoop Distributed File System (HDFS) and MapReduce
  • Common Hadoop use cases, including recommendation engines, ETL, time-series analysis, and more
  • How Hadoop integrates with other systems such as relational databases and data warehouses
  • Overview of the other components in a typical Hadoop “stack”, including the Apache projects Hive, Pig, HBase, Sqoop, Flume, and Oozie

No programming experience is required for this session.

Sarah Sproehnle

Cloudera, Inc.

Sarah Sproehnle is the Director of Educational Services for Cloudera, where she helps customers learn to use Apache Hadoop for big data processing. Cloudera provides commercial support, training, and services for the Apache Hadoop platform.
