Large Scale ETL with Hadoop

Eric Sammer (ScalingData)
Hadoop: Tools & Technology, Grand East (NY Hilton)
Presentation: external link
Average rating: 4.25 (8 ratings)

Hadoop is commonly used for processing large swaths of data in batch. While many of the necessary building blocks for data processing exist within the Hadoop ecosystem – HDFS, MapReduce, HBase, Hive, Pig, Oozie, and so on – it can be a challenge to assemble and operationalize them as a production ETL platform. This presentation covers one approach to data ingest, organization, format selection, process orchestration, and external system integration, based on collective experience acquired across many production Hadoop deployments.

Eric Sammer

ScalingData

Eric Sammer is currently a Principal Solution Architect at Cloudera, where he helps customers plan, deploy, develop for, and use Hadoop and related projects at scale. His background is in the development and operations of distributed, highly concurrent data ingest and processing systems. He has been involved in the open source community and has contributed to a large number of projects over the last decade.

Comments

Sophia DeMartini
11/02/2012 6:59pm EDT

The slides for this talk are now available on SlideShare, and can be accessed via the “external link” above.

Sophia DeMartini
11/01/2012 2:46pm EDT

Hi Bharath,

We’ll be posting the slides if the speaker provides them to us.

Thank you, Sophia

Bharath Mundlapudi
11/01/2012 2:38pm EDT

Will someone be posting the slides for this presentation?
