Crunching Big Data with R and Hadoop

Ed Kohlwey (Booz Allen Hamilton), Stephanie Beben (Booz Allen Hamilton)
Hadoop: Tools & Technology, Sutton Center / Sutton South (NY Hilton)
Tutorial. Please note: to attend, your registration must include Tutorials.

Implementing Map/Reduce applications in languages like Java can be hard, so it is often useful to be able to use Map/Reduce from other languages. In this tutorial, we’ll provide an introduction to RHadoop, an open source Map/Reduce library for R. We assume that attendees have a broad familiarity with R and Hadoop; however, the exercises do not require attendees to be experts in either platform.

First, we will discuss the basics of Map/Reduce, a framework for writing massively parallel big data analytics, and the nuances of the RHadoop implementation.
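As a rough illustration of the Map/Reduce model itself, the sketch below runs the three phases (map, shuffle, reduce) in a single Python process on a word count. It does not use RHadoop or Hadoop; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group values by key, as the framework does between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer once per key to the full list of values for that key."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def sum_reducer(key, values):
    """Sum the counts emitted for one word."""
    return sum(values)


def word_count(lines):
    """Run the three phases end to end on an in-memory list of lines."""
    return reduce_phase(shuffle(map_phase(lines, word_mapper)), sum_reducer)


counts = word_count(["big data with R", "R and Hadoop", "big clusters"])
```

On a real cluster the mapper and reducer run on many machines and the shuffle moves data over the network; only the two user-supplied functions carry the application logic.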

Next, we’ll discuss some common techniques in RHadoop including maintaining application state, processing data that has a Zipfian distribution, representing distributed matrices, performing basic operations over distributed matrices, finding outliers, and debugging.
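One common way to represent a distributed matrix, sketched here under the assumption that the matrix is sparse and the vector fits in memory on every worker, is as a collection of `(row, column, value)` triples. A matrix-vector product then becomes a map step that multiplies each entry by the matching vector element and a reduce step that sums the products per row. This is an illustrative single-process sketch in Python, not the representation used by RHadoop; `matvec_map` and `matvec` are hypothetical names.

```python
from collections import defaultdict


def matvec_map(entry, vector):
    """Map step: turn one (row, col, value) triple into a (row, partial product) pair."""
    row, col, value = entry
    return (row, value * vector[col])


def matvec(entries, vector, n_rows):
    """Reduce step: sum the partial products for each row key."""
    partial = defaultdict(float)
    for entry in entries:
        row, product = matvec_map(entry, vector)
        partial[row] += product
    # Rows with no stored entries contribute 0.0.
    return [partial[r] for r in range(n_rows)]


# A 3x3 sparse matrix stored as triples; zero entries are simply omitted.
entries = [(0, 0, 2.0), (0, 2, 1.0), (1, 1, 3.0), (2, 0, 4.0), (2, 2, 5.0)]
vector = [1.0, 2.0, 3.0]
result = matvec(entries, vector, 3)
```

Because each triple is processed independently, the entries can be split across workers in any order, which is what makes this layout convenient for Map/Reduce.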

Finally, we’ll walk attendees through an interactive exercise that creates a trending topic analysis using LDA and RHadoop. We’ll start by showing how to install both Hadoop and the rmr package, which provides Map/Reduce functionality. Then we’ll work through a coding example that demonstrates how to use RHadoop to create a sliding window analysis of trending topics.
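To give a feel for the sliding window part of such an analysis, the sketch below counts topic mentions per overlapping time window. It is a simplification under stated assumptions: topics are taken as already assigned to each record (the LDA step is omitted), timestamps are non-negative integers, and windows start at multiples of `step`. The names `windows_for` and `window_counts` are hypothetical, and the code is plain Python rather than RHadoop.

```python
from collections import Counter


def windows_for(timestamp, width, step):
    """Return the start of every window [start, start + width) that contains timestamp."""
    start = (timestamp // step) * step  # latest window start at or before timestamp
    starts = []
    while start > timestamp - width and start >= 0:
        starts.append(start)
        start -= step
    return starts


def window_counts(records, width, step):
    """Map each (timestamp, topic) record to its windows, then sum counts per (window, topic)."""
    counts = Counter()
    for timestamp, topic in records:
        for start in windows_for(timestamp, width, step):
            counts[(start, topic)] += 1
    return counts


# Hypothetical (timestamp, topic) records.
records = [(0, "hadoop"), (1, "r"), (2, "hadoop"), (3, "hadoop"), (4, "r")]
counts = window_counts(records, width=2, step=1)
```

In a Map/Reduce setting the inner loop plays the role of the mapper, emitting one `((window, topic), 1)` pair per covering window, and the per-key sum is the reducer.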

Ed Kohlwey

Booz Allen Hamilton

Edmund Kohlwey is a developer and data scientist at Booz Allen Hamilton. For the last three years, he has helped government clients adopt and develop their big data capabilities across many different problem domains.

Stephanie Beben

Booz Allen Hamilton

Stephanie Beben is an analytics engineer and developer at Booz Allen Hamilton with two years of experience designing and implementing solutions to big data problems using cloud technologies for U.S. government clients.

Prior to joining Booz Allen Hamilton, Stephanie received an M.S. in Mathematics from Texas A&M University.
