Hadoop in Action

In this track, we look at real-world case studies of the Hadoop ecosystem in action, from disruptive startups to industry giants. See how early adopters are finding ways to change the rules and rearrange the playing field—and the emerging application stack.

Who should attend: Business decision-makers and project coordinators; technology planners; architects; industry analysts and investors.

Track Hosts

Martin Hall is co-founder, Chairman & Executive Vice President of Corporate Development at Karmasphere. He brings a strong entrepreneurial track record and a history of pioneering new Internet technologies and markets. Prior to founding Karmasphere, Martin was a founder of Aventail, a leading computer security company acquired by SonicWall. Prior to that, he was the founding CEO of Stardust, an Internet technology services company sold to Penton Media. Martin has chaired and participated in a number of industry groups including WinSock, Quality of Service, Internet Multicast and Wireless Multimedia Forums. He holds a Masters of Computer Science from Staffordshire University in Stafford, England.

Andrew Musselman is Chief Data Scientist in the global big data practice at Accenture. His background is in math, front- and back-end web development, recommenders, and other large-scale modeling and prediction systems. In addition to building systems for clients, Andrew builds internal tools for performing data science and engineering more quickly and rigorously, and handles recruitment and training for a growing team. He is a big fan of Hadoop, Pig, and Mahout, and actively promotes new tools within the firm and with clients.

Sutton Center - Sutton South
Tutorial. Please note: to attend, your registration must include Tutorials on Monday.
Tom White (Cloudera), Eric Sammer (Cloudera), Joey Echeverria (Cloudera)
Average rating: 3.71 (14 ratings)
In this tutorial we'll use the Cloudera Development Kit (CDK) to build a Java web app that logs application events to Hadoop, and then run ad hoc and scheduled queries against the collected data.
Sutton Center - Sutton South
Tutorial. Please note: to attend, your registration must include Tutorials on Monday.
Sean Murphy (JHU), Benjamin Bengfort (Cobrain Company and University of Maryland)
Average rating: 4.56 (18 ratings)
Much of the world’s data (and your own) is text. The key to unlocking its value is a series of Natural Language Processing transformations that turn raw strings into a machine-usable form. We will use Hadoop alongside Python’s NLTK to perform these steps and discuss why each is necessary in your application.
Sutton Center - Sutton South
Richard Park (LinkedIn Corp)
Average rating: 4.17 (6 ratings)
Azkaban is an open-source workflow management application developed at LinkedIn to schedule and run our Hadoop workflows. Sporting a beautiful web UI, it is designed to be scalable, reliable, modular, secure, and extensible. Azkaban has been battle-tested on LinkedIn's Hadoop clusters, driving all of our data products over the last few years.
Sutton Center - Sutton South
Adam Kawa (Spotify)
Average rating: 4.56 (9 ratings)
A trip into the Hadoop jungle to show the most interesting, exciting, and surprising places we have been while growing fast from a 60-node to a 690-node Hadoop cluster. We will expose our JIRA tickets, real graphs, statistics, and even excerpts from our dialogues. We will share the mistakes that we made and describe the fixes that finally domesticated this love-demanding yellow elephant and its friends.
Sutton Center - Sutton South
Barry Livingston (Riot Games), Ben Werther (Platfora)
Average rating: 4.56 (9 ratings)
Riot Games has built the most played video game in the world, League of Legends, and needs to constantly monitor, develop, and update its games to keep players engaged. Learn about different data architecture approaches and about Riot Games’ “Data Collection Pipeline,” which provides insight into what’s needed to continuously improve the gamers’ experience.
Sutton Center - Sutton South
Mark Slusar (Allstate)
Average rating: 4.00 (4 ratings)
After a successful round of Hadoop Data Science projects, a company will make a sizable Hadoop commitment. People, process, and technology stand at the tipping point for an exciting adventure in innovation and evolution that creates new possibilities. This presentation educates attendees on the changes from the traditional methods to the new methods and paints a vision of the future.
Sutton Center - Sutton South
Zach Snyder (The Walt Disney Company)
Average rating: 3.29 (7 ratings)
Managing Hadoop clusters to meet business needs can be challenging. Learn how Disney uses an integrated approach, leveraging both Hadoop-specific tools and common IT management tools to create a comprehensive management toolkit for our Hadoop clusters.
Sutton Center - Sutton South
Erich Hochmuth (Monsanto), Amandeep Khurana (Cloudera)
Average rating: 3.75 (8 ratings)
Monsanto is building new technology-driven products for its customers that will leverage big data. This talk describes how Monsanto is building these scalable applications with geospatial data, using Hadoop and HBase as the backend systems.
Gramercy Suite
Russell Sears (Microsoft)
REEF is a set of tools and services that make it easy to implement new scalable computational frameworks atop YARN, and to allow jobs to perform multiple types of computations, such as MapReduce, iterative machine learning and graph processing. We plan to support additional programming models over time. REEF is language-independent, allowing it to bridge the Java and .NET ecosystems.
Sutton Center - Sutton South
Feng Peng (Twitter.com)
Average rating: 4.25 (8 ratings)
At Twitter our Hadoop-centric data analytics pipeline has been rapidly growing in terms of both size and complexity. With thousands of evolving data sources and analytics programs, orchestrating the analytics production becomes extremely difficult without a systematic solution. We will describe our production challenges and illustrate how the service we built helps us address them.
Sutton Center - Sutton South
Chris Lintz (Comcast), Gabriel Commeau (Comcast)
Average rating: 4.00 (8 ratings)
Real-time analytics produced by IP video players help ensure that Comcast delivers the highest quality experience to customers. The system ingests as many messages as there are Tweets produced every day, and these real-time insights are achieved through an in-house architecture leveraging Flume NG and Storm.
Sutton Center - Sutton South
Scott Sorensen (Ancestry.com)
Average rating: 3.50 (4 ratings)
New, affordable DNA sequencing will generate massive new flows of data. Ancestry.com currently manages 4 petabytes of searchable data and is on track to increase this figure exponentially with its new DNA product. Ancestry.com CTO Scott Sorensen explains how the company manages tremendous amounts of new data through two categories of Hadoop use cases: 1) analytics and 2) product features.
Gramercy Suite A
Ravi Hubbly (Lockheed Martin)
Enterprises continue to rely on legacy mainframe-based systems even though using them carries risks, mainly because prior efforts to modernize these systems have been difficult. In this talk we will discuss usage scenarios where Hadoop has assisted in modernizing legacy systems and positioned businesses for big data benefits.
Gramercy Suite A
David Thompson (Western Union)
Average rating: 3.60 (5 ratings)
In business there are demands that, if not managed well, can cause friction. This friction can be between colleagues and it can be felt by customers and clients. Consider financial services. Leaders constantly face pressures, from meeting revenue targets and consumer needs to engaging in activities like honoring individuals’ privacy rights and protecting people and the business from fraud.

Sponsorship Opportunities

For exhibition and sponsorship opportunities, contact Susan Stewart at sstewart@oreilly.com

Media Partner Opportunities

For information on trade opportunities with O'Reilly conferences, email mediapartners@oreilly.com

Press & Media

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com

Contact Us

View a complete list of Strata + Hadoop World 2013 contacts