Mendeley is a New York and London-based startup that has crowdsourced the world's largest database of academic literature. Over 1M researchers strong, Mendeley is taking academia to the cloud.
Learn how to use a Hadoop cluster for data analysis with Java MapReduce, Apache Hive, and Apache Pig, and get an overview of using the HBase Hadoop database. Some programming experience is strongly recommended for this session.
This tutorial provides a solid foundation for those seeking to understand large-scale data processing with MapReduce and Hadoop, plus its associated ecosystem. This session is intended for those who are new to Hadoop and want to understand where Hadoop is appropriate and how it fits with existing systems. No programming experience is required.
This talk will go into the details, architecture, and challenges of building a recommendation system on a massive social graph. It will describe how we applied learning on large datasets using Apache Hadoop and how we scaled to millions of reads and writes. We will also showcase Eventbrite's data platform architecture.