Strata 2013 Schedule

Below are the confirmed and scheduled talks at Strata Conference in Santa Clara 2013 (schedule subject to change).

Customize Your Own Schedule

Create your own Strata schedule using the personal scheduler function. Mark the tutorials, sessions, keynotes, and events you want to attend by selecting the calendar icon next to each listing. Then go to your personal schedule to generate your customized schedule.

See the list of all events happening onsite, including events on Monday, February 25: Women's Community Meetup, Big Data Camp, and Ignite.

Ballroom AB
10:40am Try, Learn, Buy: Operations Research Meets AI Elisabeth Crawford (Birchbox)
11:30am Introduction to Forecasting Michael Bailey (Facebook)
1:30pm Next-Gen Data Scientists Rachel Schutt (Johnson Research Labs)
2:20pm The IPython Notebook: a Comprehensive Tool for Data Science Brian Granger (Cal Poly San Luis Obispo)
4:00pm Pervasive Data Munging Gremlins Bradley Voytek (UCSF & Uber, Inc.)
4:50pm What To Do When Your Machine Learning Gets Attacked Vishwanath Ramarao (Impermium)
Great America Ballroom K
11:30am Funnel Analysis in Hadoop at Etsy Matt Walker (Etsy), Wil Stuckey (Etsy), Steve Mardenfeld (Etsy)
2:20pm Coordinating the Many Tools of Big Data in Hadoop Alan Gates (Hortonworks)
4:00pm Building Tools for the Hadoop Developer Matt Winkler (Microsoft)
4:50pm Tricks for Distributed System Debugging and Diagnosis Philip Zeyliger (Cloudera)
Ballroom E
10:40am Dodging the Digital Creep Factor Shelley Evenson (Fjord)
4:50pm Strategies for Avoiding Big Privacy “Don’ts” with Personal Data Alysa Z. Hutnik (Kelley Drye & Warren LLP)
Ballroom F
2:20pm Mapping Hidden Stories Mano Marks (Google, Inc.), Brendan Kenny (Google)
4:00pm Zooniverse: Web-scale Citizen Science Arfon Smith (University of Oxford)
Great America Ballroom J
10:40am Beyond Hadoop MapReduce: Interactive Analytic Insights Using Spark Sharmila Shahani-Mulligan (ClearStory Data), Matei Zaharia (Databricks), Stephanie McReynolds (ClearStory Data)
11:30am An Introduction to Apache Drill Tomer Shiran (MapR Technologies)
1:30pm Sketching Techniques for Real-time Big Data Bahman Bahmani (Stanford University)
2:20pm Real Time Network Analytics with Storm Mauricio Vacas (Accenture Technology Labs), Fausto Inestroza (Accenture Technology Labs), Sonali Parthasarathy (Accenture Technology Labs)
4:00pm The Future of Relational (or Why You Can't Escape SQL) Tim O'Brien (O'Reilly Media)
Ballroom CD
11:30am Facet: The Recursive Approach to Visualization Vadim Ogievetsky (Metamarkets)
1:30pm Designed for Insight: Principles of Big Data Design Douglas van der Molen (ClearStory Data)
4:00pm Using Web Standards to Create Interactive Data Visualizations Nicolas Garcia Belmonte (Twitter)
4:50pm Using Every Pixel to Visualize Big Data Lynwood Bishop (Map Large, Inc.)
Ballroom H
10:40am Expect More From Hadoop Ted Dunning (MapR)
11:30am Revolutionizing Governance and eDiscovery with Entity Analytics Tim Estes (Digital Reasoning), Brandon Daniels (Clutch Group)
1:30pm Demonstrating the High-Performance Future of Hadoop Josh Klahr (Pivotal), Gavin Sherry (EMC)
2:20pm Big Data Use Cases for Different Verticals and Adoption Patterns Anand Venugopal (Impetus Technologies Inc.), Vineet Tyagi (Impetus Technologies)
4:00pm Learning's Clarion Call: Teaming to Improve US Education with Big Data Science Marie Bienkowski (SRI International), Jace Kohlmeier (Khan Academy), Zachary Pardos (Massachusetts Institute of Technology), Sharren Bates (inBloom)
4:50pm Leveraging Hadoop Data: High Performance Analytics On ParAccel John Santaferraro (Actian Corporation), Walt Maguire (ParAccel)
Ballroom G
2:20pm Power of Declarative Analytics for Big Data Shivakumar Vaithyanathan (IBM)
4:00pm Implementing Big Data at the Speed of Business Raanan Dagan (Splunk), Rahul Deshmukh (Splunk)
4:50pm SQL on Hadoop: Defining the New Generation of Analytic Databases Carl Steinbach (Apache Software Organization)
8:45am Plenary
Room: Mission City Ballroom
Wednesday Keynote Welcome Edd Dumbill (Silicon Valley Data Science), Alistair Croll (Solve For Interesting)
8:55am Plenary
Room: Mission City Ballroom
Video Games: The Biggest Big Data Challenge Rajat Taneja (Electronic Arts)
9:05am Plenary
Room: Mission City Ballroom
Hadoop: The Foundation for Change Scott Yara (Greenplum, a division of EMC)
9:15am Plenary
Room: Mission City Ballroom
Committing to Recommendation Algorithms Eric Colson (Stitch Fix)
9:25am Plenary
Room: Mission City Ballroom
Hadoop: Big Results John Schroeder (MapR Technologies)
9:30am Plenary
Room: Mission City Ballroom
Big Data on Small Devices: Data Science goes Mobile Yael Garten (LinkedIn)
9:40am Plenary
Room: Mission City Ballroom
Using Data to Honor the Human Right to Education Prasad Ram (Gooru)
9:45am Plenary
Room: Mission City Ballroom
Moneyballing Government Jennifer Pahlka (Code for America)
9:50am Plenary
Room: Mission City Ballroom
Getting Big Benefits from Big Data Jeanne Harris (Accenture)
6:30pm Break
Room: On Your Own
10:10am Morning Break - Sponsored by inBloom
Room: Expo Hall AB
3:00pm Afternoon Break - Sponsored by SAP
Room: Expo Hall AB
5:30pm Plenary
Room: Expo Hall AB
Booth Crawl
12:10pm Lunch - Sponsored by Greenplum
Room: Expo Hall C
Wednesday Lunchtime BoF Tables
8:00am Coffee Break - Sponsored by Basho Technologies
Room: Mission City Ballroom Foyer
9:00pm Plenary
Room: Hotel Bar, Santa Clara Hyatt
Data Drinkup
10:40am-11:20am (40m) Data Science
Try, Learn, Buy: Operations Research Meets AI
Elisabeth Crawford (Birchbox)
Every month Birchbox delivers a box of samples to each of its subscribers. Boxes are targeted to subscribers based on their profile, history, and behavior. In this talk we discuss the mathematics behind allocating samples to customers (aka solving for happiness).
11:30am-12:10pm (40m) Data Science
Introduction to Forecasting
Michael Bailey (Facebook)
Everyone wants to predict the future; fame and fortune follow those who succeed. I cover the basics of forecasting including tips, tricks, and best practices, and how forecasting differs from prediction analysis. I walk through simple examples using R and link to several resources to put you on the path to becoming the next Nostradamus.
1:30pm-2:10pm (40m) Data Science
Next-Gen Data Scientists
Rachel Schutt (Johnson Research Labs)
Rachel Schutt, Senior Research Scientist at Johnson Research Labs, will discuss her Columbia Data Science course: her motivations for teaching it, how she designed the curriculum, how the NYC tech community was involved, and what impact, if any, she had on her students. She thought about the course as testing the hypothesis: It is possible to incubate awesome data science teams in the classroom.
2:20pm-3:00pm (40m) Beyond Hadoop
The IPython Notebook: a Comprehensive Tool for Data Science
Brian Granger (Cal Poly San Luis Obispo)
In this talk, I will introduce the IPython Notebook, an open-source, web-based interactive computing environment for Python and other languages. By enabling the data scientist to build documents that combine code, text, formulas, visualizations, images, and video, the Notebook creates a foundation for data science that is interactive, repeatable, documented, and sharable.
4:00pm-4:40pm (40m) Data Science
Pervasive Data Munging Gremlins
Bradley Voytek (UCSF & Uber, Inc.)
With more data come more problems. Did you know Excel dates begin on January 1, 1900? Unless you're using the OS X version, then dates begin on January 1, 1904. Or Unix time, which begins January 1, 1970. These pervasive, easily-overlooked gremlins are the bane of any data scientist and in this session I will explore a variety of these little nuisances.
4:50pm-5:30pm (40m) Data Science
What To Do When Your Machine Learning Gets Attacked
Vishwanath Ramarao (Impermium)
Classic data science problems involve finding stationary patterns in big datasets. However, in adversarial settings, enemies deliberately shift their approach to avoid detection. They can challenge learning systems by randomizing behavior, hiding tracks, lacing traffic and more. Successful application of machine learning requires new approaches to feature engineering, training and classification.
10:40am-11:20am (40m) Hadoop in Practice
Unlocking the “Power” of Big Data: Analyzing Energy Consumption Across 50 Million U.S. Households
Barry Fischer (Opower)
Opower, the global leader in the field of energy information and analysis, works with 80 utility companies worldwide to give families context, insights, and advice about how to save energy. With access to an unprecedented (and still growing) amount of energy data—currently drawn from 50 million US homes—Opower is uncovering unique trends in how people are using energy at home.
11:30am-12:10pm (40m) Hadoop in Practice
Funnel Analysis in Hadoop at Etsy
Matt Walker (Etsy) et al
As an ecommerce site with more than 800,000 different sellers, Etsy is particularly interested in understanding how shoppers find the items they seek. This talk will discuss the challenges of funnel analysis at Etsy, the corresponding deficiencies of several widely used web analytics tools, and our event sequence matching tool implemented in Hadoop.
1:30pm-2:10pm (40m) Hadoop in Practice
Building Scalable Big Data Infrastructure Using Open Source Software
Sam William (Stumbleupon Inc)
The Infrastructure team at Stumbleupon leverages state-of-the-art tools and technologies to build platforms that enable us to collect, categorize, organize, store, and analyze huge volumes of data. The platform is so fast and robust that it adds minimal latency to the site. Timely collection and analysis of data helps data scientists, analysts, and executives make the best decisions and validate them.
2:20pm-3:00pm (40m) Hadoop in Practice
Coordinating the Many Tools of Big Data in Hadoop
Alan Gates (Hortonworks)
Big Data is about more than petabytes; it is also about new paradigms, languages, and tools. This talk will cover work going on in Hadoop projects to coordinate sharing of data and user code between tools.
4:00pm-4:40pm (40m) Hadoop in Practice
Building Tools for the Hadoop Developer
Matt Winkler (Microsoft)
In this session we’ll first discuss our experience extending Hadoop development to new platforms & languages and then discuss our experiments and experiences building supporting developer tools and plugins for those platforms.
4:50pm-5:30pm (40m) Hadoop in Practice
Tricks for Distributed System Debugging and Diagnosis
Philip Zeyliger (Cloudera)
All is quiet on the log file front, yet the system is down. What next? Three parts practical know-how (“here’s my toolbox”) and one part position paper (“must-haves for comprehensibility”), this talk will cover the tricks of the trade for debugging distributed systems. Motivated by experience gained diagnosing Hadoop, we’ll dig into the JVM, Linux esoterica, and outlier visualization.
10:40am-11:20am (40m) Law, Ethics, and Open Data
Dodging the Digital Creep Factor
Shelley Evenson (Fjord)
In today's world, decisions are made for us based on data. On one hand, this is appealing, but on the other hand it's disorienting. To address this, designers need to focus on the things that make us uniquely human and on the translation between the abstract and the human. This presentation will look at the ways humans make decisions and how big data and technology can enable this, not lead it.
11:30am-12:10pm (40m) Law, Ethics, and Open Data
Sci vs. Sci: Attack Vectors for Black-hat Data Scientists, and Possible Countermeasures
Joseph Turian (MetaOptimize)
When a data scientist crosses over to the dark side, look out. High-quality spam, large-scale CAPTCHA-breaking, impolite spiders, oh my! This talk will explore attack vectors that can be exploited by black-hat data scientists. We'll also discuss countermeasures and defenses that are available to the good guys, and assess their effectiveness.
1:30pm-2:10pm (40m) Law, Ethics, and Open Data
Who is Fake? Discover Astroturfing or Attempts of Fake Influence!
Lutz Finger (Fisheye Analytics)
From politicians to marketers, everyone tries to influence. Data analytics of traditional as well as social media data has made it easier to spot deliberate attempts to skew public opinion. The talk will give insights into new measurements by analyzing large events such as the London Olympics. Those measures will help to expose increasingly sophisticated attempts at fake influence.
2:20pm-3:00pm (40m) Law, Ethics, and Open Data
How to Crowdsource Large Scale Identity Theft and Fraud to Make Bucket Loads of Easy Money
Jo Prichard (LexisNexis Risk Solutions)
This session will demonstrate to attendees how easy it is to crowdsource identity theft to commit fraud and make money. We will look at which segments of the population are easy targets for large scale identity fraud. Attendees will be given methodologies to combat this type of fraud leveraging Big Data and various technologies.
4:00pm-4:40pm (40m) Law, Ethics, and Open Data
Big Data is a Hotbed of Thoughtcrime, Part II: The Code
Jim Adler (inome)
At Strata 2012 in New York, we discussed the hazards of curbing big data inferences by defining a new category of thoughtcrime. After all, acting on thoughts might constitute a crime, but thoughts, in isolation, cannot be criminal. It's time to go deeper. Let's create and evaluate a predictive criminal model that highlights where the sensitivities lie, both technically and ethically.
4:50pm-5:30pm (40m) Law, Ethics, and Open Data
Strategies for Avoiding Big Privacy “Don’ts” with Personal Data
Alysa Z. Hutnik (Kelley Drye & Warren LLP)
Privacy laws governing a company’s obligations on data collection, use, and disclosure are changing rapidly. Failing to understand how the laws affect a company’s personal data assets can result in media exposés, regulatory investigations, Congressional hearings, and lawsuits. This session will provide guidance on “privacy by design” compliance and practical tips to avoid becoming a target of scrutiny.
10:40am-11:20am (40m) Connected World
Location Intelligence Targets Information for Development
Stewart Collis (aWhere Inc.)
Location Intelligence (LI) transforms how public health and agriculture initiatives are managed and monitored by translating big complex data from multiple sources and varying temporal and spatial scales into local, actionable insight. This empowers national governments and global development organizations to focus on saving lives and building healthy, sustainable communities.
11:30am-12:10pm (40m) Connected World
Public Health Case Study: Tracking Zombies and Vampires using Social Media
John Feland (Argus Insights)
Prepare for the coming zombie apocalypse or subjugation by our vampire overlords by tracking the spread of these threats and understanding the characteristics of the populations already infected, using a combination of social media analytics and classic market research cluster analysis. Learn about new methods for unpacking consumer conversations and tracking true attitudinal consumer segments.
1:30pm-2:10pm (40m) Connected World
Real-time Big Analytics Use Cases: Babies, Brains, and Buses
Nagui Halim (IBM)
Hadoop is great for analyzing data at rest. But what if your business problem requires the ability to analyze and respond in real-time and without a human in the loop?
2:20pm-3:00pm (40m) Connected World
Mapping Hidden Stories
Mano Marks (Google, Inc.) et al
The world of mapping is undergoing another revolution. New techniques for visualizing and querying increasingly large amounts of data can lead to new ways of interacting with and discovering meaning in your data. In this session, we'll talk about the latest in vector mapping and how you can use it to explore the hidden stories in your data.
4:00pm-4:40pm (40m) Connected World
Zooniverse: Web-scale Citizen Science
Arfon Smith (University of Oxford)
Dealing with the flood of data that confronts researchers is the fundamental challenge of 21st century research. Citizen Science has allowed researchers within the Zooniverse to take on research problems at a scale impossible without the attention of a large community of volunteers.
4:50pm-5:30pm (40m) Connected World
Bigger Than Any One -- Solving Large-Scale Data Problems with People and Machines
Tyler Bell (Factual)
Factual believes that some data problems are bigger than any one company. This talk describes how Factual combines both machines and other (human) data communities to their best effect, within the context of similar data-centric, community-driven applications.
10:40am-11:20am (40m) Beyond Hadoop
Beyond Hadoop MapReduce: Interactive Analytic Insights Using Spark
Sharmila Shahani-Mulligan (ClearStory Data) et al
AMPLab’s open source data analysis projects, Spark and Shark, deliver iterative queries up to 100x faster than Hadoop MapReduce. Hear how companies are using Spark-based data platforms for fast, interactive analysis on big data.
11:30am-12:10pm (40m) Beyond Hadoop
An Introduction to Apache Drill
Tomer Shiran (MapR Technologies)
This session is an overview of Apache Drill, another big data system inspired by a Google white paper.
1:30pm-2:10pm (40m) Beyond Hadoop
Sketching Techniques for Real-time Big Data
Bahman Bahmani (Stanford University)
In many modern web and big data applications the data arrives in a streaming fashion and needs to be processed on the fly. Due to the size of data, the computations need to be done incrementally, and hence sketches of data are used that take a small amount of memory but allow for fast updates and queries. We will present the techniques to design these sketches and provide clarifying examples.
2:20pm-3:00pm (40m) Beyond Hadoop
Real Time Network Analytics with Storm
Mauricio Vacas (Accenture Technology Labs) et al
With the growth in volume and velocity of data, businesses need a scalable solution alongside batch processing to process events on the fly and provide real time insights. In this session, we will describe how we used Storm to analyze network data to detect causes of network performance degradation.
4:00pm-4:40pm (40m) Beyond Hadoop
The Future of Relational (or Why You Can't Escape SQL)
Tim O'Brien (O'Reilly Media)
While the industry has been busy abandoning the relational database and calling it a fundamentally limited technology, several trends are conspiring to revive the good old RDBMS. While it might not resemble the MySQL or Oracle database you are running today, this talk will explore how hardware trends, software trends, and industry research are pointing to SQL, structure, and ACID at scale.
4:50pm-5:30pm (40m) Beyond Hadoop
Petascale Processing On An Open Source Budget: An Introduction to QFS
Jim Kelly (Quantcast)
This talk introduces an open-source distributed file system that will double the capacity of your Hadoop cluster and speed up your MapReduce jobs. The talk will describe the Reed-Solomon implementation and its implications for cluster performance, and how it leverages the speed of modern networks to achieve better storage efficiency and make Hadoop jobs run faster.
10:40am-11:20am (40m) Design
Agile Data Wrangling and Web-based Visualizations
Chang She (DataPad)
While many libraries are available today to help create interactive visualizations, they are generally not integrated with the data analysis tool chain. This talk will focus on how to combine agile data manipulations with web-based visualization libraries to create a more efficient workflow for data science.
11:30am-12:10pm (40m) Design
Facet: The Recursive Approach to Visualization
Vadim Ogievetsky (Metamarkets)
Visualization is a powerful way to understand data, but today building the right data set and accompanying data visualization requires sophisticated programming skills. We discuss an approach to a unified language describing both visualization and database queries. This approach could be used by both programmers and business users, accelerating data exploration and speeding time to insight.
1:30pm-2:10pm (40m) Design
Designed for Insight: Principles of Big Data Design
Douglas van der Molen (ClearStory Data)
Whether the user is a business user or an IT user, with today's data complexity, there are a number of design principles that are key to achieving success. Hear how to approach product design for today's data challenges and meet new user expectations for fast and timely insights at scale.
2:20pm-3:00pm (40m) Design
Data Visualization Design Using Shneiderman’s Mantra: Overview First, Zoom and Filter, Then Details-on-Demand
Eric Legrand (Wells Fargo) et al
This session explores applications of Shneiderman’s mantra for visual data analysis (overview first, zoom and filter, then details-on-demand) as a framework in the context of three complex analytical applications at Wells Fargo: (1) Analytics process, (2) Interactive meeting facilitation and (3) Dashboard design.
4:00pm-4:40pm (40m) Design
Using Web Standards to Create Interactive Data Visualizations
Nicolas Garcia Belmonte (Twitter)
From markup languages like SVG to OpenGL based APIs like WebGL, the browser provides several ways for creating visualizations. In this talk we'll show some web based visualizations we worked on for different projects and for Twitter, and show what standards were used to create them. We'll dissect each example showing what was used not only for rendering but also for data handling and interaction.
4:50pm-5:30pm (40m) Design
Using Every Pixel to Visualize Big Data
Lynwood Bishop (Map Large, Inc.)
The human eye can detect infinitesimal patterns in the world around us. Shouldn’t we make use of this amazing skill when recognizing patterns or detecting anomalies in big data? In this session we’ll explore why rendering every pixel is a challenge with big data and look at how these limitations can be overcome.
10:40am-11:20am (40m) Sponsored Sessions
Expect More From Hadoop
Ted Dunning (MapR)
As enterprises deploy Hadoop, it’s not the volume or velocity of data that is problematic, but the variety of types and formats of their critical data. This session discusses how leading companies have integrated Hadoop, NoSQL (HBase) and enterprise sources on one platform. Data is combined and processed in one simplified architecture. Case studies and reference architectures will be reviewed.
11:30am-12:10pm (40m) Sponsored Sessions
Revolutionizing Governance and eDiscovery with Entity Analytics
Tim Estes (Digital Reasoning) et al
Given the exponential rise in data, attorneys have an obligation to meet today’s Governance, Risk and Compliance (GRC) challenges and stay on top of technology in order to achieve broader institutional benefits. Join Digital Reasoning and the Clutch Group to learn how moving from document-centric to entity-centric analytics is key in gaining valuable knowledge from unstructured information.
1:30pm-2:10pm (40m) Sponsored Sessions
Demonstrating the High-Performance Future of Hadoop
Josh Klahr (Pivotal) et al
The emergence of Apache Hadoop over the past few years has required organizations to completely rethink architectures that have been in place for decades. And with changes in the underlying data fabric come ripple effects, and often bottlenecks, that impact all levels of an organization, both business and technical.
2:20pm-3:00pm (40m) Sponsored Sessions
Big Data Use Cases for Different Verticals and Adoption Patterns
Anand Venugopal (Impetus Technologies Inc.) et al
2012 was particularly interesting for the variety of Big Data use-cases implemented. This session explores key patterns across horizontal and vertical use cases.
4:00pm-4:40pm (40m) Sponsored Sessions
Learning's Clarion Call: Teaming to Improve US Education with Big Data Science
Marie Bienkowski (SRI International) et al
This panel will share insights on how K-16 education can benefit from developments in Big Data ecosystems.
4:50pm-5:30pm (40m) Sponsored Sessions
Leveraging Hadoop Data: High Performance Analytics On ParAccel
John Santaferraro (Actian Corporation) et al
ParAccel runs analytic queries 100x faster than Hive with much deeper SQL Support. Hear how companies are using analytic platforms for fast, interactive analysis on big data.
10:40am-11:20am (40m) Sponsored Sessions
How to Transform your Business by Choosing the Right Big Data Stack
Billy Bosworth (DataStax)
Discussion of how big data is impacting modern business, which market trends are driving the adoption of big data solutions, and how big data professionals can choose the right technology to transform their business.
11:30am-12:10pm (40m) Sponsored Sessions
Strengthening the Bond Between Hadoop and Your Analytic Database
Joydeep Das (SAP)
Opposites attract, and that’s the case with Hadoop and analytic databases. Both have a role to play in your Big Data projects. This session explores the various approaches to cementing the bond between Hadoop and your analytic database, how SAP customers are integrating Hadoop into BI and advanced analytic environments, and why you’ll want to do that too.
1:30pm-2:10pm (40m) Sponsored Sessions
The Hadoop Data Reservoir - Requirements and Pitfalls
Peter Schlampp (Platfora)
Enterprises are moving forward with the vision of creating a central repository of all enterprise data stored inexpensively and processed efficiently in Hadoop. Only a fraction have yet been successful. This session will explore the pitfalls of implementing the Hadoop Data Reservoir and the requirements that lead to success.
2:20pm-3:00pm (40m) Sponsored Sessions
Power of Declarative Analytics for Big Data
Shivakumar Vaithyanathan (IBM)
Learn first-hand how advanced analytics are enabling modern enterprises to deal with big data challenges.
4:00pm-4:40pm (40m) Sponsored Sessions
Implementing Big Data at the Speed of Business
Raanan Dagan (Splunk) et al
In this talk, we'll examine compelling, real-world examples that offer a blueprint for integrating big data technologies, delivering rapid visibility and insights to IT professionals, data analysts and business users, and that accelerate the adoption of big data in the enterprise.
4:50pm-5:30pm (40m) Sponsored Sessions
SQL on Hadoop: Defining the New Generation of Analytic Databases
Carl Steinbach (Apache Software Organization)
This talk is about the emergence of a new class of analytic databases based on principles first popularized by Google Dremel. These systems have been designed with the goal of enabling real-time SQL on Hadoop, while also supporting schema-on-read, semi-structured data, and pluggable storage engines. In this talk we will explain the novel architectural features that make these goals a reality.
8:45am-8:55am (10m)
Wednesday Keynote Welcome
Edd Dumbill (Silicon Valley Data Science) et al
Strata Program Chairs, Edd Dumbill and Alistair Croll, welcome you to the first day of keynotes.
8:55am-9:05am (10m)
Video Games: The Biggest Big Data Challenge
Rajat Taneja (Electronic Arts)
In this talk, EA CTO Rajat Taneja will dive into the challenges and complexities facing the gaming industry, explain how to harness the power of data, and share examples of how technologies like machine learning and predictive analytics have been put in place to improve the customer experience.
9:05am-9:15am (10m) Sponsored Sessions
Hadoop: The Foundation for Change
Scott Yara (Greenplum, a division of EMC)
Hadoop is the engine powering the Big Data era, an unstoppable force boasting massive investments and a rich ecosystem. But this is only the beginning: Hadoop has the potential to reach beyond Big Data and become the Foundation for Change, catalyzing new levels of business productivity and transformation.
9:15am-9:25am (10m)
Committing to Recommendation Algorithms
Eric Colson (Stitch Fix)
Many companies have figured out how to generate incremental value through the use of recommendation engines. As such, the underlying algorithms are considered a valuable asset. But what happens when a company’s entire business model rests on its ability to get relevant products in front of the customer? When this happens you see a massive commitment to algorithms, data, and data scientists.
9:25am-9:30am (5m) Sponsored Sessions
Hadoop: Big Results
John Schroeder (MapR Technologies)
The excitement about Big Data stems from the results: the impact on revenue, the decrease in costs, the Big gains in competitive advantage that result from Hadoop and HBase applications. This keynote provides insights into how the combination of scale, efficiency and analytic flexibility creates the power to expand the applications for Hadoop to transform companies as well as entire industries.
9:30am-9:40am (10m)
Big Data on Small Devices: Data Science goes Mobile
Yael Garten (LinkedIn)
Data science for consumer internet products relies on our ability to effectively analyze and understand ubiquitous computing in terms of a holistic product experience, as individuals consume and create data on mobile and desktop devices in their day-to-day lives. I'll talk about mobile data science challenges — from product development to data-driven decision making.
9:40am-9:45am (5m) Sponsored Sessions
Using Data to Honor the Human Right to Education
Prasad Ram (Gooru)
More than ever before, students are using the Internet to study, leaving behind a trail of valuable data. How can we leverage this data to improve education?
9:45am-9:50am (5m)
Moneyballing Government
Jennifer Pahlka (Code for America)
Code for America fellows have been tackling not only the promise of data in America’s cities, but the reality of the challenges, for the past two years. In February 2013, six new fellows will be working on our hardest problem yet: using data to unclog the criminal justice system in Louisville and New York City. If the public sector can innovate using data, the results benefit us all.
9:50am-10:05am (15m)
Getting Big Benefits from Big Data
Jeanne Harris (Accenture)
How must big companies evolve in order to realize big value from big data? Investing in data, technology and data scientists is just a first step.
6:30pm-9:00pm (2h 30m)
Break
10:10am-10:40am (30m)
Break: Morning Break - Sponsored by inBloom
3:00pm-4:00pm (1h)
Break: Afternoon Break - Sponsored by SAP
5:30pm-6:30pm (1h)
Booth Crawl
Quench your thirst with vendor-hosted libations and snacks while you check out all the cool stuff in the Expo Hall.
12:10pm-1:30pm (1h 20m)
Wednesday Lunchtime BoF Tables
Birds of a Feather (BoF) sessions are informal roundtable discussions happening during lunch on Wed 2/27 and Thu 2/28. You can join any BoF table or start your own with a topic of your choice. The BoF sign-up board will be near the Registration area.
8:00am-8:45am (45m)
Break: Coffee Break - Sponsored by Basho Technologies
9:00pm-10:00pm (1h)
Data Drinkup
A casual get-together for conference-goers after a busy day at Strata.

Sponsorship Opportunities

For information on exhibition and sponsorship opportunities at the conference, contact Susan Stewart at sstewart@oreilly.com

Media Partner Opportunities

For information on trade opportunities with O'Reilly conferences, contact Kathy Yu at mediapartners@oreilly.com

Press and Media

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com