Matei Zaharia

CTO, Databricks

Matei Zaharia is a fifth-year PhD student at UC Berkeley, working with Scott Shenker and Ion Stoica on topics in cloud computing, operating systems, networking, and algorithms for large-scale data processing. He is the lead developer of the Spark programming framework and a committer on Apache Mesos and Apache Hadoop. He received his undergraduate degree from the University of Waterloo in Canada.

Sessions

Beyond Hadoop Ballroom G
Tutorial Please note: to attend, your registration must include Tutorials on Tuesday.
Ion Stoica (UC Berkeley), Matei Zaharia (Databricks), Reynold Xin (Databricks), Shivaram Venkataraman (UC Berkeley), Andy Konwinski (UC Berkeley), Tathagata Das (Databricks)
Average rating: 5.00 (3 ratings)
An introduction to Spark and Shark, two components of the open-source Berkeley Data Analytics Stack (BDAS) in development at UC Berkeley. Spark is a high-speed cluster computing system compatible with Hadoop that can outperform it by up to 100x. Shark is a port of Apache Hive onto Spark that is fully compatible with, and up to 100x faster than, Hive.
Beyond Hadoop Room 204
Tutorial Please note: to attend, your registration must include Tutorials on Tuesday.
Matei Zaharia (Databricks), Reynold Xin (Databricks), Andy Konwinski (UC Berkeley), Tathagata Das (Databricks), Patrick Wendell (Databricks)
Average rating: 4.00 (1 rating)
Building on our previous tutorial introducing the open-source Berkeley Data Analytics Stack (BDAS), this tutorial provides each audience member with a Spark/Shark cluster on EC2 and walks through hands-on coding examples. Lessons will cover the Spark and Shark command line interfaces, writing a standalone program, and data clustering using a distributed machine learning algorithm on Spark.
Expo Hall (Table C)
Reynold Xin (Databricks), Matei Zaharia (Databricks), Ion Stoica (UC Berkeley), Andy Konwinski (UC Berkeley), Tathagata Das (Databricks), Patrick Wendell (Databricks), Shivaram Venkataraman (UC Berkeley)
Average rating: 3.00 (1 rating)
Beyond Hadoop Great America Ballroom J
Sharmila Shahani-Mulligan (ClearStory Data), Matei Zaharia (Databricks), Stephanie McReynolds (ClearStory Data)
Average rating: 4.00 (2 ratings)
AMPLab’s open-source data analysis projects, Spark and Shark, deliver iterative queries up to 100x faster than Hadoop MapReduce. Hear how companies are using Spark-based data platforms for fast, interactive analysis on big data.
