Distributed applications running on Hadoop clusters can deliver powerful insights and results from the biggest data sets ever generated. But do you have to be a rocket scientist to use Hadoop? Fortunately, the answer is no. This tutorial will explain the theory of MapReduce and how to develop big data applications in Java and in higher-level languages such as Pig and Hive SQL. Using practical, real-world examples such as weblog processing, analytics, and text summarization, it will cover how to prototype, debug, monitor, test, and optimize big data applications for Hadoop’s distributed processing platform. Attendees will get hands-on instruction and will leave with a solid understanding of how to analyze data on Hadoop clusters, along with practical examples they can use and build on after the tutorial.
The tutorial will be in 5 parts:
Part 1 (30 mins): What you need to know about MapReduce and Hadoop
Part 2 (45 mins): Rapid prototyping and ad hoc analytics
Part 3 (15 mins): Real-world case study
Part 4 (60 mins): Hands-on instruction with practical examples
Part 5 (30 mins): Your questions answered
Abe was senior director of engineering for Ask.com where he led engineering for all Dictionary.com properties. Before that, Abe was director of engineering at Ning where he worked on Hadoop-based solutions for Ning users and led development of the Ning data services platform and systems management services.
Previously, Abe managed development for the Google Apps Infrastructure at Google, and while at Yahoo! served as senior engineering manager for several units including Del.icio.us, Social Search Platform, Search Front-End Platform and Listings Platform. In addition, Abe has held engineering positions at technology companies including CNA eBusiness, Scient and Jackson Software.
Abe has completed course requirements for a PhD and holds an MS in Theoretical and Applied Mechanics from the University of Illinois at Urbana-Champaign, as well as an MS and a BS in Mechanical Engineering from Cairo University.
Abe, the recipient of numerous academic awards and fellowships, also holds a patent for the development of a search engine with augmented relevance ranking by community participation.
Abe is currently the VP of Engineering at Karmasphere.
Shevek is a widely recognized mathematician and computer programmer with specific expertise in the Java programming language. His considerable experience ranges from theoretical computing to team management. Over the course of his career he has worked on cutting-edge academic research in systems, compilers, programming languages, and computer security. Shevek has worked with organizations such as the U.K. Department of Trade and Industry, Raytheon Systems, and Weir, Strachan and Henshaw in the defense and nuclear industries. He received a Doctorate in Computing from the University of Bath, England. Shevek also holds a Master's in Mathematics, with Honors, from the University of Bath.
Shevek is CTO and co-founder of Karmasphere.
Ken is the founder of Bixo Labs, a web mining and data processing company that creates custom solutions for big data processing workflows. Previously he was the founder and CTO of Krugle, a vertical search engine and enterprise appliance for code and technical information. He’s a member of the Apache Software Foundation, a committer on the Tika and Bixo open source projects, and teaches Hadoop and Solr courses for Scale Unlimited and Lucid Imagination.
Chris Wensel is the founder of Concurrent Inc., and the author of the Cascading data processing open-source project. He also co-founded Scale Unlimited, the first Hadoop and “Big Data” related professional services and training company, where he mentored companies like Sun Microsystems, Apple, and numerous startups in the Bay Area.
Chris bootstrapped his first Internet startup in the early ’90s, creating an early Web server-side scripting language used by companies in the real estate and insurance verticals. During the late ’90s, Chris focused on distributed agent-based systems, work for which he received several patents on distributed computing. From there he became Chief Architect for the fastest growing business unit at Thomson Reuters. Just prior to Concurrent, Chris was a Consulting Architect to TeleAtlas geo-content management group in