Tuesday, 02/26/2013

8:00am

Ballroom DE Foyer
Coffee Break - Sponsored by NetApp (1h)

9:00am

Data Science Ballroom AB
Tutorial Please note: to attend, your registration must include Tutorials on Tuesday.
William Cukierski (Kaggle), Ben Hamner (Kaggle)
Average rating: ***..
(3.67, 12 ratings)
As more industries adopt data-driven policies, people untrained in the formal analysis of data are finding themselves staring at a spreadsheet and asking what they did to deserve it. In this tutorial, two of Kaggle’s top data scientists will walk attendees through the basics of solving an analytics challenge, from defining the problem, to performing basic analysis, to visualizing the output.
Hadoop in Practice Great America Ballroom K
Tutorial
Jonathan Hsieh (Cloudera, Inc.), Himanshu Vashishtha (Cloudera, Inc.)
Average rating: ***..
(3.12, 16 ratings)
HBase is one of the more popular open source NoSQL databases that have cropped up over the last few years. Building applications that use HBase effectively is challenging. This tutorial is geared towards teaching the basics of building applications using HBase and covers concepts that a developer should know while using HBase as a backend store for their application. Read more.
Hadoop in Practice Ballroom E
Tutorial
Dean Wampler (Typesafe)
Average rating: ****.
(4.69, 13 ratings)
This hands-on tutorial teaches you how to use Hive, a high-level, data warehouse tool for Hadoop. Hive provides a SQL-like query language, HiveQL, that is easy to learn for people with prior SQL experience, making Hive attractive for data warehousing teams. Hive leverages the power of Hadoop for working with massive data sets without requiring expertise in MapReduce programming. Read more.
Data Science Ballroom F
Tutorial
Garrett Grolemund (RStudio)
Average rating: ****.
(4.38, 8 ratings)
Learn how to wrangle data in R: from acquiring and cleaning data, to changing data formats and performing targeted, groupwise calculations. This course will emphasize the 'reshape2' and 'plyr' packages. Read more.
Beyond Hadoop Ballroom G
Tutorial
Ion Stoica (UC Berkeley), Matei Zaharia (Databricks), Reynold Xin (Databricks), Shivaram Venkataraman (UC Berkeley), Andy Konwinski (UC Berkeley), Tathagata Das (University of California Berkeley)
Average rating: *****
(5.00, 3 ratings)
An introduction to Spark and Shark, two components of the open-source Berkeley Data Analytics Stack (BDAS) in development at UC Berkeley. Spark is a high-speed cluster computing system compatible with Hadoop that can outperform it by up to 100x. Shark is a port of Apache Hive onto Spark that is fully compatible with, and up to 100x faster than, Hive.
Hadoop in Practice Ballroom H
Tutorial
Karan Bhatia (Amazon Web Services), Parviz Deyhim (Amazon Web Services)
Average rating: ***..
(3.25, 4 ratings)
This hands-on tutorial will give you an overview of how AWS can quickly and easily enable you to start generating insights from your company’s data.
Enterprise IT Great America Ballroom J
Tutorial
Edd Dumbill (Silicon Valley Data Science)
Average rating: ****.
(4.33, 3 ratings)
For CIOs, IT executives, and technology professionals, Strata's Enterprise Big Data day lays out the roadmap to get your organization up to speed on big data. In this all-day event, hear how to create a big data strategy, understand the issues of managing data, and learn how data science can be used powerfully in your organization. Read more.
DDBD Ballroom CD
Tutorial
Alistair Croll (Solve For Interesting)
Average rating: ***..
(3.75, 4 ratings)
For business strategists, marketers, product managers, and entrepreneurs, Data Driven Business looks at how to use data to make better business decisions faster. Packed with case studies, panels, and eye-opening presentations, this fast-paced day focuses on how to solve today's thorniest business problems with Big Data. It's the missing MBA for a data-driven, always-on business world. Read more.
SOLD OUT
Design Room 204
Tutorial
Average rating: ****.
(4.67, 3 ratings)
Communicating Data Clearly describes how to draw clear, concise, accurate graphs that are easier to understand than many of the graphs one sees today. The tutorial emphasizes how to avoid common mistakes that produce confusing or even misleading graphs. Graphs for one, two, three, and many variables are covered as well as general principles for creating effective graphs. Read more.

1:30pm

Beyond Hadoop Ballroom AB
Tutorial
Ryan Tabora (Think Big Analytics), Jason Rutherglen (Datastax)
Average rating: **...
(2.31, 13 ratings)
In this hands-on tutorial, you will learn the importance of distributed search through our industry experience and knowledge of real use cases. We’ll introduce different architectures that incorporate distributed search techniques and share pain points experienced and lessons learned. For the hands-on part of the tutorial, you will learn how to install and use Apache Solr for real-time search on big data.
Design Great America Ballroom K
Tutorial
Scott Murray (University of San Francisco), Jerome Cukier (Jerome Cukier)
Average rating: ***..
(3.00, 5 ratings)
An introduction to D3, one of the most powerful JavaScript data visualization libraries.
Data Science Ballroom E
Tutorial
Simon Rogers (Guardian), Feilding Cage (Guardian)
Average rating: ****.
(4.67, 6 ratings)
This hands-on session will show how a dataset turns into a story, the narrative process the Guardian's team goes through, the tools used and the lessons learned. Read more.
Data Science Ballroom F
Tutorial
Wes McKinney (DataPad Inc.)
Average rating: ***..
(3.88, 8 ratings)
This tutorial will be a hands-on introduction to the essential tools for working with structured data in Python: 'pandas' and 'NumPy'.
Data Science Ballroom G
Tutorial
Sarah Sproehnle (Cloudera, Inc.)
Average rating: ****.
(4.71, 7 ratings)
This tutorial provides a solid foundation for those seeking to understand large scale data processing with MapReduce and Hadoop, plus its associated ecosystem. This session is intended for those who are new to Hadoop and are seeking to understand where Hadoop is appropriate and how it fits with existing systems. No programming experience is required. Read more.
Beyond Hadoop, Data Science Ballroom H
Tutorial
Ryan Boyd (Google), Michael Manoochehri (Google, Inc.), Julia Ferraioli (Google, Inc.)
Average rating: **...
(2.57, 7 ratings)
When data volume and velocity become massive, processing and analysis solutions require specialized technologies for different parts of the data pipeline. Google’s Cloud Platform is designed to help you focus on building applications, not infrastructure. We’ll demonstrate how to build end-to-end Big Data applications, from data collection, to analysis, to reporting and visualization.
Beyond Hadoop Room 204
Tutorial
Matei Zaharia (Databricks), Reynold Xin (Databricks), Andy Konwinski (UC Berkeley), Tathagata Das (University of California Berkeley), Patrick Wendell (Databricks)
Average rating: ****.
(4.00, 1 rating)
Building on our previous tutorial introducing BDAS, the open-source Berkeley Data Analytics Stack, in this tutorial we will provide each audience member with a Spark/Shark cluster on EC2 and walk through hands-on coding examples. Lessons will cover the Spark and Shark command line interfaces, writing a standalone program, and data clustering using a distributed machine learning algorithm on Spark. Read more.

5:00pm

Expo Hall AB
Grab a drink, mingle with fellow Strata participants, and see the latest technologies and products from leading companies in the data space. Read more.

6:00pm

Mission City Ballroom
Don't miss Startup Showcase, Strata's live demo program and competition for startups and early-stage companies. The judges will pick winners from 10 finalist companies selected to present at the showcase. Read more.

Wednesday, 02/27/2013

8:45am

Mission City Ballroom
This presentation will be streamed live.
Edd Dumbill (Silicon Valley Data Science), Alistair Croll (Solve For Interesting)
Average rating: ****.
(4.00, 6 ratings)
Strata Program Chairs, Edd Dumbill and Alistair Croll, welcome you to the first day of keynotes. Read more.

8:55am

Mission City Ballroom
This presentation will be streamed live.
Rajat Taneja (Electronic Arts)
Average rating: ****.
(4.12, 17 ratings)
In this talk, EA CTO Rajat Taneja will dive into the challenges and complexities facing the gaming industry, explain how to harness the power of data, and share examples of how technologies like machine learning and predictive analytics have been put in place to improve the customer experience.

9:05am

Mission City Ballroom
This presentation will be streamed live.
Scott Yara (Greenplum, a division of EMC)
Average rating: ***..
(3.29, 14 ratings)
Hadoop is the engine powering the Big Data era, an unstoppable force boasting massive investments and a rich ecosystem. But this is only the beginning: Hadoop has the potential to reach beyond Big Data and become the Foundation for Change, catalyzing new levels of business productivity and transformation. Hadoop will become the Foundation for Change. Read more.

9:15am

Mission City Ballroom
This presentation will be streamed live.
Eric Colson (Stitch Fix)
Average rating: ****.
(4.29, 21 ratings)
Many companies have figured out how to generate incremental value through the use of recommendation engines. As such, the underlying algorithms are considered a valuable asset. But what happens when a company’s entire business model rests on its ability to get relevant products in front of the customer? When this happens you see a massive commitment to algorithms, data, and data scientists. Read more.

9:25am

Mission City Ballroom
This presentation will be streamed live.
John Schroeder (MapR Technologies)
Average rating: **...
(2.50, 14 ratings)
The excitement about Big Data stems from the results: the impact on revenue, the decrease in costs, and the big gains in competitive advantage that result from Hadoop and HBase applications. This keynote provides insights into how the combination of scale, efficiency and analytic flexibility creates the power to expand the applications for Hadoop to transform companies as well as entire industries.

9:30am

Mission City Ballroom
This presentation will be streamed live.
Yael Garten (LinkedIn)
Average rating: ***..
(3.90, 20 ratings)
Data science for consumer internet products relies on our ability to effectively analyze and understand ubiquitous computing in terms of a holistic product experience, as individuals consume and create data on mobile and desktop devices in their day-to-day lives. I'll talk about mobile data science challenges — from product development to data-driven decision making. Read more.

9:40am

Mission City Ballroom
This presentation will be streamed live.
Prasad Ram (Gooru)
Average rating: ***..
(3.31, 13 ratings)
More than ever before, students are using the Internet to study, leaving behind a trail of valuable data. How can we leverage this data to improve education? Read more.

9:45am

Mission City Ballroom
This presentation will be streamed live.
Jennifer Pahlka (Code for America)
Average rating: ****.
(4.16, 19 ratings)
Code for America fellows have been tackling not only the promise of data in America’s cities, but the reality of the challenges, for the past two years. In February 2013, six new fellows will be working on our hardest problem yet: using data to unclog the criminal justice system in Louisville and New York City. If the public sector can innovate using data, the results benefit us all.

9:50am

Mission City Ballroom
This presentation will be streamed live.
Jeanne Harris (Accenture)
Average rating: ***..
(3.62, 13 ratings)
How must big companies evolve in order to realize big value from big data? Investing in data, technology and data scientists is just a first step. Read more.

10:40am

Data Science Ballroom AB
Elisabeth Crawford (Birchbox)
Average rating: ****.
(4.00, 4 ratings)
Every month Birchbox delivers a box of samples to each of its subscribers. Boxes are targeted to subscribers based on their profile, history, and behavior. In this talk we discuss the mathematics behind allocating samples to customers (aka solving for happiness). Read more.
Hadoop in Practice Great America Ballroom K
Barry Fischer (Opower)
Average rating: **...
(2.67, 3 ratings)
Opower, the global leader in the field of energy information and analysis, works with 80 utility companies worldwide to give families context, insights, and advice about how to save energy. With access to an unprecedented (and still growing) amount of energy data—currently drawn from 50 million US homes—Opower is uncovering unique trends in how people are using energy at home. Read more.
Shelley Evenson (Fjord)
Average rating: ****.
(4.17, 6 ratings)
In today's world, decisions are made for us based on data. On one hand, this is appealing, but on the other hand it's disorienting. To address this, designers need to focus on the things that make us uniquely human and focus on the translation between the abstract and human. This presentation will look at the ways humans make decisions and how big data and technology can enable this, not lead it.
Connected World Ballroom F
Stewart Collis (aWhere Inc.)
Average rating: ***..
(3.00, 2 ratings)
Location Intelligence (LI) transforms how public health and agriculture initiatives are managed and monitored by translating big complex data from multiple sources and varying temporal and spatial scales into local, actionable insight. This empowers national governments and global development organizations to focus on saving lives and building healthy, sustainable communities. Read more.
Sponsored Sessions Ballroom G
Billy Bosworth (DataStax)
Average rating: *....
(1.33, 3 ratings)
Discussion of how big data is impacting modern business, which market trends are driving the adoption of big data solutions, and how big data professionals can choose the right technology to transform their business. Read more.
Sponsored Sessions Ballroom H
Ted Dunning (MapR)
Average rating: ****.
(4.00, 1 rating)
As enterprises deploy Hadoop, it’s not the volume or velocity of data that is problematic, but the variety of types and formats of their critical data. This session discusses how leading companies have integrated Hadoop, NoSQL (HBase) and enterprise sources on one platform. Data is combined and processed in one simplified architecture. Case studies and reference architectures will be reviewed. Read more.
Beyond Hadoop Great America Ballroom J
Sharmila Shahani-Mulligan (ClearStory Data), Matei Zaharia (Databricks), Stephanie McReynolds (ClearStory Data)
Average rating: ****.
(4.00, 2 ratings)
AMPLab’s open source data analysis projects, Spark and Shark, deliver iterative queries up to 100x faster than Hadoop MapReduce. Hear how companies are using Spark-based data platforms for fast, interactive analysis on big data. Read more.
Design Ballroom CD
Chang She (DataPad)
Average rating: ***..
(3.30, 10 ratings)
While many libraries are available today to help create interactive visualizations, they are generally not integrated with the data analysis tool chain. This talk will focus on how to combine agile data manipulations with web-based visualization libraries to create a more efficient workflow for data science. Read more.

11:30am

Data Science Ballroom AB
Michael Bailey (Facebook)
Average rating: ****.
(4.25, 16 ratings)
Everyone wants to predict the future; fame and fortune follow those who succeed. I cover the basics of forecasting including tips, tricks, and best practices, and how forecasting differs from prediction analysis. I walk through simple examples using R and link to several resources to put you on the path to becoming the next Nostradamus. Read more.
Hadoop in Practice Great America Ballroom K
Matt Walker (Etsy), Wil Stuckey (Etsy), Steve Mardenfeld (Etsy)
Average rating: ****.
(4.00, 2 ratings)
As an ecommerce site with more than 800,000 different sellers, Etsy is particularly interested in understanding how shoppers find the items they seek. This talk will discuss the challenges of funnel analysis at Etsy, the corresponding deficiencies of several widely used web analytics tools, and our event sequence matching tool implemented in Hadoop. Read more.
Joseph Turian (MetaOptimize)
Average rating: ****.
(4.25, 4 ratings)
When a data scientist crosses over to the dark side, look out. High-quality spam, large-scale CAPTCHA-breaking, impolite spiders, oh my! This talk will explore attack vectors that can be exploited by black-hat data scientists. We'll also discuss countermeasures and defenses that are available to the good guys, and assess their effectiveness. Read more.
Connected World Ballroom F
John Feland (Argus Insights)
Average rating: ***..
(3.00, 5 ratings)
Prepare for the coming zombie apocalypse or subjugation by our vampire overlords by tracking the spread of these threats and understanding the characteristics of the populations already infected, using a combination of social media analytics and classic market research cluster analysis. Learn about new methods for unpacking consumer conversations and tracking true attitudinal consumer segments.
Sponsored Sessions Ballroom G
Joydeep Das (SAP)
Average rating: **...
(2.00, 1 rating)
Opposites attract, and that’s the case with Hadoop and analytic databases. Both have a role to play in your Big Data projects. This session explores the various approaches to cementing the bond between Hadoop and your analytic database, how SAP customers are integrating Hadoop into BI and advanced analytic environments, and why you’ll want to do that too.
Sponsored Sessions Ballroom H
Tim Estes (Digital Reasoning), Brandon Daniels (Clutch Group)
Given the exponential rise in data, attorneys have an obligation to meet today’s Governance, Risk and Compliance (GRC) challenges and stay on top of technology in order to achieve broader institutional benefits. Join Digital Reasoning and the Clutch Group to learn how moving from document-centric to entity-centric analytics is key in gaining valuable knowledge from unstructured information. Read more.
Beyond Hadoop Great America Ballroom J
Tomer Shiran (MapR Technologies)
This session is an overview of Apache Drill, another big data system inspired by a Google white paper. Read more.
Design Ballroom CD
Vadim Ogievetsky (Metamarkets)
Average rating: ***..
(3.00, 3 ratings)
Visualization is a powerful way to understand data, but today building the right data set and accompanying data visualization requires sophisticated programming skills. We discuss an approach to a unified language describing both visualization and database queries. This approach could be used by both programmers and business users, accelerating data exploration and speeding time to insight. Read more.

12:10pm

Expo Hall C
Birds of a Feather (BoF) sessions are informal roundtable discussions happening during lunch on Wed 2/27 and Thu 2/28. You can join any BoF table or start your own with a topic of your choice. The BoF sign-up board will be near the Registration area. Read more.

1:30pm

Data Science Ballroom AB
Rachel Schutt (Johnson Research Labs)
Average rating: ***..
(3.45, 11 ratings)
Rachel Schutt, Senior Research Scientist at Johnson Research Labs, will discuss her Columbia Data Science course: her motivations for teaching it, how she designed the curriculum, how the NYC tech community was involved, and what impact, if any, she had on her students. She thought about the course as testing the hypothesis: It is possible to incubate awesome data science teams in the classroom. Read more.
Hadoop in Practice Great America Ballroom K
Sam William (Stumbleupon Inc)
Average rating: ***..
(3.57, 7 ratings)
The Infrastructure team at Stumbleupon leverages state-of-the-art tools and technologies to build platforms that enable us to collect, categorize, organize, store and analyze huge volumes of data. The platform is so fast and robust that it adds minimal latency to the site. Timely collection and analysis of data helps data scientists, analysts and executives make the best decisions and validate them.
Lutz Finger (Fisheye Analytics)
Average rating: ***..
(3.86, 7 ratings)
From politicians to marketers, everyone tries to influence. Data analytics of traditional as well as social media data has made it easier to spot deliberate attempts to skew public opinion. The talk will give insights into new measurements by analyzing large events such as the London Olympics. Those measures will help to expose the increasingly sophisticated attempts at fake influence.
Connected World Ballroom F
Nagui Halim (IBM)
Average rating: ***..
(3.67, 3 ratings)
Hadoop is great for analyzing data at rest. But what if your business problem requires the ability to analyze and respond in real-time and without a human in the loop? Read more.
Sponsored Sessions Ballroom G
Peter Schlampp (Platfora)
Enterprises are moving forward with the vision of creating a central repository of all enterprise data stored inexpensively and processed efficiently in Hadoop. Only a fraction have yet been successful. This session will explore the pitfalls of implementing the Hadoop Data Reservoir and the requirements that lead to success. Read more.
Sponsored Sessions Ballroom H
Josh Klahr (Pivotal), Gavin Sherry (EMC)
The emergence of Apache Hadoop over the past few years has required organizations to completely rethink architectures that have been in place for decades. And with changes in the underlying data fabric come ripple effects, and often bottlenecks, that impact all levels of an organization, both business and technical.
Beyond Hadoop Great America Ballroom J
Bahman Bahmani (Stanford University)
Average rating: ****.
(4.25, 4 ratings)
In many modern web and big data applications the data arrives in a streaming fashion and needs to be processed on the fly. Due to the size of data, the computations need to be done incrementally, and hence sketches of data are used that take a small amount of memory but allow for fast updates and queries. We will present the techniques to design these sketches and provide clarifying examples. Read more.
Design Ballroom CD
Douglas van der Molen (ClearStory Data)
Average rating: ***..
(3.00, 5 ratings)
Whether the user is a business user or an IT user, with today's data complexity, there are a number of design principles that are key to achieving success. Hear how to approach product design for today's data challenges and meet new user expectations for fast and timely insights at scale.

2:20pm

Beyond Hadoop Ballroom AB
Brian Granger (Cal Poly San Luis Obispo)
Average rating: ****.
(4.20, 5 ratings)
In this talk, I will introduce the IPython Notebook, an open-source, web-based interactive computing environment for Python and other languages. By enabling the data scientist to build documents that combine code, text, formulas, visualizations, images and video, the Notebook creates a foundation for data science that is interactive, repeatable, documented and sharable.
Hadoop in Practice Great America Ballroom K
Alan Gates (Hortonworks)
Average rating: ***..
(3.50, 4 ratings)
Big Data is about more than petabytes; it is also about new paradigms, languages, and tools. This talk will cover work going on in Hadoop projects to coordinate sharing of data and user code between tools. Read more.
Jo Prichard (LexisNexis Risk Solutions)
Average rating: ****.
(4.50, 4 ratings)
This session will demonstrate to attendees how easy it is to crowdsource identity theft to commit fraud and make money. We will look at which segments of the population are easy targets for large scale identity fraud. Attendees will be given methodologies to combat this type of fraud leveraging Big Data and various technologies. Read more.
Connected World Ballroom F
Mano Marks (Google, Inc.), Brendan Kenny (Google)
Average rating: *....
(1.88, 8 ratings)
The world of mapping is undergoing another revolution. New techniques for visualizing and querying increasingly large amounts of data can lead to new ways of interacting with and discovering meaning in your data. In this session, we'll talk about the latest in vector mapping and how you can use it to explore the hidden stories in your data. Read more.
Sponsored Sessions Ballroom G
Average rating: ***..
(3.00, 1 rating)
Learn first-hand how advanced analytics are enabling modern enterprises to deal with big data challenges. Read more.
Sponsored Sessions Ballroom H
Anand Venugopal (Impetus Technologies Inc.), Vineet Tyagi (Impetus Technologies)
2012 was particularly interesting for the variety of Big Data use-cases implemented. This session explores key patterns across horizontal and vertical use cases. Read more.
Beyond Hadoop Great America Ballroom J
Mauricio Vacas (Accenture Technology Labs), Fausto Inestroza (Accenture Technology Labs), Sonali Parthasarathy (Accenture Technology Labs)
Average rating: ***..
(3.00, 2 ratings)
With the growth in volume and velocity of data, businesses need a scalable solution alongside batch processing to process events on the fly and provide real time insights. In this session, we will describe how we used Storm to analyze network data to detect causes of network performance degradation. Read more.
Design Ballroom CD
Eric Legrand (Wells Fargo), Dana Zuber (Wells Fargo)
Average rating: **...
(2.33, 3 ratings)
This session explores applications of Shneiderman’s mantra for visual data analysis (overview first, zoom and filter, then details-on-demand) as a framework in the context of three complex analytical applications at Wells Fargo: (1) Analytics process, (2) Interactive meeting facilitation and (3) Dashboard design. Read more.

4:00pm

Data Science Ballroom AB
Bradley Voytek (UCSF & Uber, Inc.)
Average rating: **...
(2.50, 4 ratings)
With more data come more problems. Did you know Excel dates begin on January 1, 1900? Unless you're using the OS X version, then dates begin on January 1, 1904. Or Unix time, which begins January 1, 1970. These pervasive, easily-overlooked gremlins are the bane of any data scientist and in this session I will explore a variety of these little nuisances. Read more.
Hadoop in Practice Great America Ballroom K
Matt Winkler (Microsoft)
Average rating: ****.
(4.00, 2 ratings)
In this session we’ll first discuss our experience extending Hadoop development to new platforms & languages and then discuss our experiments and experiences building supporting developer tools and plugins for those platforms. Read more.
Jim Adler (inome)
Average rating: ****.
(4.00, 3 ratings)
At Strata 2012 in New York, we discussed the hazards of curbing big data inferences by defining a new category of thoughtcrime. After all, acting on thoughts might constitute a crime, but thoughts, in isolation, cannot be criminal. It's time to go deeper. Let's create and evaluate a predictive criminal model that highlights where the sensitivities lie, both technically and ethically. Read more.
Connected World Ballroom F
Arfon Smith (University of Oxford)
Dealing with the flood of data that confronts researchers is the fundamental challenge of 21st century research. Citizen Science has allowed researchers within the Zooniverse to take on research problems at a scale impossible without the attention of a large community of volunteers. Read more.
Sponsored Sessions Ballroom G
Raanan Dagan (Splunk), Rahul Deshmukh (Splunk)
Average rating: ***..
(3.00, 3 ratings)
In this talk, we'll examine compelling, real-world examples that offer a blueprint for integrating big data technologies, delivering rapid visibility and insights to IT professionals, data analysts and business users, and that accelerate the adoption of big data in the enterprise. Read more.
Sponsored Sessions Ballroom H
Marie Bienkowski (SRI International), Jace Kohlmeier (Khan Academy), Zachary Pardos (Massachusetts Institute of Technology), Sharren Bates (inBloom)
Average rating: ****.
(4.00, 1 rating)
This panel will share insights on how K-16 education can benefit from developments in Big Data ecosystems. Read more.
Beyond Hadoop Great America Ballroom J
Tim O'Brien (O'Reilly Media)
Average rating: *****
(5.00, 7 ratings)
While the industry has been busy abandoning the relational database and calling it a fundamentally limited technology, several trends are conspiring to revive the good old RDBMS. While it might not resemble the MySQL or Oracle database you are running today, this talk will explore how hardware trends, software trends, and industry research are pointing to SQL, structure, and ACID at scale.
Design Ballroom CD
Average rating: ****.
(4.44, 9 ratings)
From markup languages like SVG to OpenGL based APIs like WebGL, the browser provides several ways for creating visualizations. In this talk we'll show some web based visualizations we worked on for different projects and for Twitter, and show what standards were used to create them. We'll dissect each example showing what was used not only for rendering but also for data handling and interaction. Read more.

4:50pm

Add to your personal schedule
Data Science Ballroom AB
Vishwanath Ramarao (Impermium)
Average rating: **...
(2.60, 5 ratings)
Classic data science problems involve finding stationary patterns in big datasets. However, in adversarial settings, enemies deliberately shift their approach to avoid detection. They can challenge learning systems by randomizing behavior, hiding tracks, lacing traffic and more. Successful application of machine learning requires new approaches to feature engineering, training and classification. Read more.
Add to your personal schedule
Hadoop in Practice Great America Ballroom K
Philip Zeyliger (Cloudera)
Average rating: ****.
(4.50, 2 ratings)
All is quiet on the log file front, yet the system is down. What next? Three parts practical know-how (“here’s my toolbox”) and one part position paper (“must-haves for comprehensibility”), this talk will cover the tricks of the trade for debugging distributed systems. Motivated by experience gained diagnosing Hadoop, we’ll dig into the JVM, Linux esoterica, and outlier visualization. Read more.
Add to your personal schedule
Alysa Z. Hutnik (Kelley Drye & Warren LLP)
Average rating: ****.
(4.67, 3 ratings)
Privacy laws governing a company’s obligations on data collection, use, and disclosure are changing rapidly. Failing to understand how the laws affect a company’s personal data assets can result in media exposés, regulatory investigations, Congressional hearings and lawsuits. This session will provide guidance on “privacy by design” compliance and practical tips to avoid becoming a target of scrutiny. Read more.
Add to your personal schedule
Connected World Ballroom F
Tyler Bell (Factual)
Average rating: ****.
(4.67, 3 ratings)
Factual believes that some data problems are bigger than any one company. This talk describes how Factual combines both machines and other (human) data communities to their best effect, within the context of similar data-centric, community-driven applications. Read more.
Add to your personal schedule
Sponsored Sessions Ballroom G
Carl Steinbach (Apache Software Organization)
Average rating: ****.
(4.75, 8 ratings)
This talk is about the emergence of a new class of analytic databases based on principles first popularized by Google Dremel. These systems have been designed with the goal of enabling real-time SQL on Hadoop, while also supporting schema-on-read, semi-structured data, and pluggable storage engines. In this talk we will explain the novel architectural features that make these goals a reality. Read more.
Add to your personal schedule
Sponsored Sessions Ballroom H
John Santaferraro (Actian Corporation ), Walt Maguire (ParAccel)
ParAccel runs analytic queries 100x faster than Hive with much deeper SQL Support. Hear how companies are using analytic platforms for fast, interactive analysis on big data. Read more.
Add to your personal schedule
Beyond Hadoop Great America Ballroom J
Jim Kelly (Quantcast)
This talk introduces an open-source distributed file system that will double the capacity of your Hadoop cluster and speed up your MapReduce jobs. The talk will describe the Reed-Solomon implementation and its implications for cluster performance, how it leverages the speed of modern networks to achieve better storage efficiency and make Hadoop jobs run faster. Read more.
Add to your personal schedule
Design Ballroom CD
Lynwood Bishop (Map Large, Inc.)
Average rating: ***..
(3.50, 2 ratings)
The human eye can detect infinitesimal patterns in the world around us. Shouldn’t we make use of this amazing skill when recognizing patterns or detecting anomalies in big data? In this session we’ll explore why rendering every pixel is a challenge with big data and look at how these limitations can be overcome. Read more.

5:30pm

Add to your personal schedule
Expo Hall AB
Average rating: ***..
(3.00, 1 rating)
Quench your thirst with vendor-hosted libations and snacks while you check out all the cool stuff in the Expo Hall. Read more.

9:00pm

Add to your personal schedule
Hotel Bar, Santa Clara Hyatt
A casual get-together for conference-goers after a busy day at Strata. Read more.

Thursday, 02/28/2013

8:45am

Add to your personal schedule
Mission City Ballroom
This presentation will be streamed live.
Alistair Croll (Solve For Interesting), Edd Dumbill (Silicon Valley Data Science)
Average rating: ****.
(4.00, 3 ratings)
Program Chairs, Edd Dumbill and Alistair Croll, welcome you to the second day of keynotes. Read more.

8:55am

Add to your personal schedule
Mission City Ballroom
Cecilia Bouras (Western Union)
Average rating: ***..
(3.69, 16 ratings)
In this keynote, we will explore some of the challenges of big data operating in a truly global context. Read more.

9:05am

Add to your personal schedule
Mission City Ballroom
This presentation will be streamed live.
Dave Campbell (Microsoft)
Average rating: ***..
(3.71, 14 ratings)
Microsoft keynote, featuring Dave Campbell, Vice President of Product Development for the SQL Server product suite. Read more.

9:15am

Add to your personal schedule
Mission City Ballroom
This presentation will be streamed live.
Average rating: ****.
(4.08, 12 ratings)
In this talk, we present the broad data challenge and discuss potential starting points for solutions. We illustrate these approaches using data from a "meta-catalog" of over 1,000,000 open datasets that have been collected from about two hundred governments from around the world. Read more.

9:25am

Add to your personal schedule
Mission City Ballroom
This presentation will be streamed live.
Girish Juneja (Intel)
Average rating: *....
(1.50, 14 ratings)
How software can transform human lives by bringing intelligence to wherever big data lives. Read more.

9:30am

Add to your personal schedule
Mission City Ballroom
This presentation will be streamed live.
Joydeep Das (SAP)
Average rating: *....
(1.78, 18 ratings)
Hadoop and SAP HANA are taking the world by storm. SAP HANA is the fastest growing commercial database in the market, being adopted by the world’s top enterprises for real-time analytics and applications. Read more.

9:35am

Add to your personal schedule
Mission City Ballroom
This presentation will be streamed live.
Nathan Marz (Twitter)
Average rating: ****.
(4.21, 19 ratings)
Designing for human fault-tolerance leads to important conclusions on the fundamental ways data systems should be architected. Read more.

9:45am

Add to your personal schedule
Mission City Ballroom
This presentation will be streamed live.
Kate Crawford (Microsoft Research)
Average rating: ****.
(4.61, 31 ratings)
Big data gives us a powerful new way to see patterns in information - but what can't we see? When does big data not tell us the whole story? This talk opens up the question of the biases we bring to big data, and how we might work beyond them. Read more.

10:40am

Add to your personal schedule
Data Science Ballroom AB
Justin Langseth (Zoomdata, Inc.), Byron Ellis (Spongecell)
Average rating: ***..
(3.83, 6 ratings)
Learn how LivePerson and Zoomdata perform stream processing and visualization on mobile devices of structured site traffic and unstructured chat data in real-time for business decision making. Technologies include Kafka, Storm, and d3.js for visualization on mobile devices. Byron Ellis, Data Scientist for LivePerson will join Justin Langseth of Zoomdata to discuss and demonstrate the solution. Read more.
Add to your personal schedule
Hadoop in Practice Great America Ballroom K
Philip Kromer (Infochimps)
Average rating: *....
(1.00, 1 rating)
Join Flip Kromer, co-founder and CTO of Infochimps, as he walks you through a series of decision trees, making you rethink your use of Hadoop in the cloud and opening up possibilities for new patterns of work that are uniquely developer-friendly. Patterns of work like tuning your cluster to the job, and why the first priority of any analytics cluster should be downtime. Read more.
Add to your personal schedule
Fred Trotter (FredTrotter.com)
At Strata RX, we announced the release of DocGraph, the largest open named social graph data set that we know of. This data set included links between doctors who commonly team together in the Medicare dataset. Since then, we have added tremendous depth to the data by crowdfunding the acquisition of doctor credentialing data. Come learn how healthcare works under the covers. Read more.
Add to your personal schedule
Connected World Ballroom F
Carson Darling (Rest Devices)
Average rating: ****.
(4.25, 4 ratings)
This talk will discuss Rest Devices' proprietary low-cost sensor technology, its use of and vision for big biometric data, and the need for design integration in all facets of product development, be it software or hardware. Read more.
Add to your personal schedule
Sponsored Sessions Ballroom G
Greg Khairallah (Intel), Bert Haskell (Pecan Street Projects)
Real-world examples of utility companies around the world using Hadoop to optimize their services and changing Hadoop in the process. Read more.
Add to your personal schedule
Sponsored Sessions Ballroom H
Eron Kelly (Microsoft Corporation), Paul Henderson (Ascribe)
Microsoft partner, Ascribe, is using Microsoft’s Big Data solutions to turn emergencies into actionable data. Read more.
Add to your personal schedule
Beyond Hadoop Great America Ballroom J
Stephan Ellner (Google), Jeff Shute (Google)
Average rating: *****
(5.00, 5 ratings)
Many of the services that are critical to Google’s ad business have historically been backed by MySQL. We have recently migrated several of these services to F1, a new RDBMS developed at Google. F1 implements rich relational database features, including a strictly enforced schema, a powerful parallel SQL query engine, general transactions, change tracking and notification, and indexing. Read more.
Add to your personal schedule
Design Ballroom CD
Amy Heineike (Quid)
Average rating: ****.
(4.00, 11 ratings)
The majority of data we consume today are presented in lists, one-dimensional orderings that limit the user's ability to understand context or perform strategic analyses. For unstructured data, we need to re-imagine what types of visualisations enable exploration in the way that geographic maps can. Read more.

11:30am

Add to your personal schedule
Data Science Ballroom AB
Philipp Janert (Principal Value, LLC)
Average rating: **...
(2.67, 3 ratings)
Most stable systems rely on feedback - from central heating to industrial plants and biological organisms. This introductory talk will explain what feedback is, why it is relevant to enterprise software development, and how to apply it to some typical problems arising in business and technical situations. Read more.
Add to your personal schedule
Hadoop in Practice Great America Ballroom K
Shaun Connolly (Hortonworks), Tasso Argyros (Teradata Aster)
Average rating: ***..
(3.00, 2 ratings)
Apache Hadoop is an innovative emerging technology causing CIOs to rethink their data architecture - making this an exciting time to be a “big data” technologist. This tag-team presentation brings leaders in both Apache Hadoop and data warehousing on the stage, to answer these questions by sharing their vision for the future of big data management and analytics. Read more.
Add to your personal schedule
John Foreman (MailChimp)
Average rating: ****.
(4.67, 9 ratings)
Hear from MailChimp’s Chief Scientist John Foreman as he dishes on dirty data and demonstrates the latest in MailChimp’s anti-abuse artificial intelligence. MailChimp sends 3 billion emails a month for their millions of users, and they can't afford to let a drop of spam go out. Learn how the company is using cutting edge NoSQL solutions and predictive models to leave the bad guys out in the cold. Read more.
Add to your personal schedule
Connected World Ballroom F
Ben Waber (Sociometric Solutions)
Average rating: *****
(5.00, 2 ratings)
I will discuss how a wearable sensing platform, the Sociometric Badge, allows us to measure and analyze human behavior in the real-world, particularly in the workplace. We’ll discuss how we use the badges to recognize concepts such as persuasiveness and social support and how we have used the badges in real companies to drive organizational change and put hard numbers behind management methods. Read more.
Add to your personal schedule
Sponsored Sessions Ballroom G
Michael Lang Sr. (Revelytix)
Managing data in Hadoop gets complex quickly - *Loom* is the data set management system for Hadoop that makes it easy. *Loom* provides tools to track the lineage and provenance of all registered HDFS data, and *Activescan* so that all of the critical information about data sets is collected dynamically. Read more.
Add to your personal schedule
Sponsored Sessions Ballroom H
Charles Zedlewski (Cloudera)
Cloudera, the standard for Apache Hadoop in the enterprise, empowers data-driven enterprises to Ask Bigger Questions™ and get bigger answers from all their data at the speed of thought. Cloudera Enterprise, the platform for Big Data, enables organizations to easily derive business value from structured and unstructured data to achieve a significant competitive advantage. Read more.
Add to your personal schedule
Beyond Hadoop Great America Ballroom J
C. Aaron Cois (Carnegie Mellon University, Software Engineering Institute), Tim Palko (Carnegie Mellon University, Software Engineering Institute)
Average rating: ****.
(4.67, 3 ratings)
In this talk, we describe using Redis, an open source, in-memory key-value store, to capture large volumes of data from numerous remote sources while also allowing real-time monitoring and analytics. With this approach, we were able to capture a high volume of continuous data from numerous remote environmental sensors while consistently querying our database for real time monitoring and analytics. Read more.
Add to your personal schedule
Design Ballroom CD
Average rating: ***..
(3.83, 6 ratings)
This talk discusses the broad design considerations necessary for effective visualizations. Attendees will learn what's required for a visualization to be successful, gain insight for critically evaluating visualizations they encounter, and come away with new ways to think about the visualization design process. Read more.

12:10pm

Add to your personal schedule
Expo Hall C
Birds of a Feather (BoF) sessions are informal roundtable discussions happening during lunch on Wed 2/27 and Thu 2/28. You can join any BoF table or start your own with a topic of your choice. The BoF sign-up board will be near the Registration area. Read more.

1:30pm

Add to your personal schedule
Data Science Ballroom AB
Alexander Gray (Skytree, Inc.)
Average rating: ***..
(3.30, 10 ratings)
Given a machine learning (ML) problem, which method(s) should you use, and how does big data affect your choices? I will discuss some principles derived from decades of theory and practice, illustrated through real-world ML success stories in medicine, marketing, financial services, and astronomy. Read more.
Add to your personal schedule
Hadoop in Practice Great America Ballroom K
Paco Nathan (The Data Guild)
Average rating: *****
(5.00, 2 ratings)
This talk examines the notion of a "workflow" as a general abstraction for common use cases encountered in Data Science, particularly for building Enterprise apps. Patterns of workflows provide recipes for integrating different frameworks, plus the means for optimizing large-scale apps. We review this approach in the context of a sample app based on the Cascading open source project. Read more.
Add to your personal schedule
Sandra Crucianelli (International Center for Journalists), Angélica Peralta Ramos (La Nacion Newspaper)
Average rating: ****.
(4.00, 1 rating)
This talk introduces the idea that access to Big Data in many countries – especially Argentina – is still a work in progress and somewhat politicized. Despite that, media like La Nacion Newspaper are working with developers and experts in Data Viz to address the lack of transparency and accountability. Read more.
Add to your personal schedule
Connected World Ballroom F
Anna Smith (bitly)
Average rating: ****.
(4.67, 3 ratings)
While audience analysis is an old topic, it is being reimagined as personas along topic distributions as opposed to the usual demographic terms. This provides deeper insights into the communities across the internet and into how the internet is consumed. Read more.
Add to your personal schedule
Sponsored Sessions Ballroom G
Priyank Patel (Teradata Aster)
Average rating: **...
(2.67, 3 ratings)
MapReduce, Hadoop, and other “NoSQL” big data approaches opened opportunities for data scientists in every industry to develop new data-intensive applications. But what about the more traditional SQL users or analysts? How can they unlock insights through standard business intelligence (BI) tools or ANSI SQL access? Read more.
Add to your personal schedule
Sponsored Sessions Ballroom H
Nick Kolegraff (Rackspace)
Average rating: ***..
(3.80, 5 ratings)
Data Science has created quite the movement in the data world, yet confusion between data science and analytics still remains across the enterprise. Rather than approach the subject talking about semantic differences between the two, we will discuss the topics as they relate to solving problems, how businesses are approaching them and what you can start doing with data science. Read more.
Add to your personal schedule
Beyond Hadoop Great America Ballroom J
John A. De Goes (Precog)
This talk discusses the market needs that are giving birth to the "scientific database", what these systems have to offer that is currently lacking in either the data management or statistical worlds, and how scientific databases will co-exist and co-evolve with Hadoop and other leading big data platforms. Read more.
Add to your personal schedule
Design Ballroom CD
Robert Munro (Idibon)
Average rating: *****
(5.00, 1 rating)
The majority of the world's data is now unstructured, non-English text. How can we extract useful information from it? Many of our assumptions about English do not carry over to other languages. This talk will give a high-level overview of how languages vary, what current language technologies can (and cannot) achieve, and how we can process and visualize this information at scale. Read more.

2:20pm

Add to your personal schedule
Data Science Ballroom AB
Dr. Vijay Srinivas Agneeswaran (Impetus Technologies)
Average rating: ***..
(3.00, 2 ratings)
The key takeaway from this session will be an understanding of the third generation of tools for realizing machine learning algorithms - examples of these tools include Twister, HaLoop, GraphLab. Attendees will also understand why the second generation tools such as Mahout have not implemented some of the machine learning algorithms for big data. The session will also have real-life use cases. Read more.
Add to your personal schedule
Hadoop in Practice Great America Ballroom K
Milind Bhandarkar (Greenplum, A Division of EMC), Chaitan Baru (SDSC/UC San Diego)
Average rating: **...
(2.00, 1 rating)
We will describe the BigData Top100 List initiative, a new, open, community-based effort for benchmarking big data systems. Read more.
Add to your personal schedule
Lisa Green (Common Crawl), Greg Lindahl (blekko), Kevin Burton (Spinn3r)
Average rating: ****.
(4.50, 2 ratings)
Big data tools made it possible to gain extremely valuable insight from large scale analysis of web data, but until recently few people had access to the data. Now tools like Grep the Web and increased raw access to web data grant anyone the power to do such analysis. This presentation addresses practical applications of web data analysis that you can incorporate into your research or products. Read more.
Add to your personal schedule
Connected World Ballroom F
Sam Shah (LinkedIn), Peter Skomoroch (Data Wrangling)
Average rating: ****.
(4.40, 5 ratings)
Learn how LinkedIn endorsements used data mining techniques to develop a viral social tagging and reputation system. Read more.
Add to your personal schedule
Sponsored Sessions Ballroom G
David Henry (Pentaho), Benjamin Lloyd (NetApp)
Attend this session to hear how NetApp was able to solve their big data problem. Since the design and implementation of the solution, NetApp has a number of takeaways and best practices required to convert theory into practice, allowing completion of an enterprise-level implementation of such a solution. Read more.
Add to your personal schedule
Sponsored Sessions Ballroom H
Mike Peterson (Neustar)
Learn how Neustar has expanded its data warehouse capacity, increased agility for data analysis, reduced costs, and enabled new data products. Discuss challenges and opportunities in capturing hundreds of terabytes of compact binary network data, ad hoc analysis, integration with a scale-out relational database, more agile data development, and building new products integrating multiple big data sets. Read more.
Add to your personal schedule
Beyond Hadoop Great America Ballroom J
Justin Erickson (Cloudera)
Average rating: ****.
(4.33, 3 ratings)
The Cloudera Impala project is for the first time making scalable parallel database technology, which is the underpinning of Google's Dremel as well as that of commercial analytic DBMSs, available to the Hadoop community. Read more.
Add to your personal schedule
Design Ballroom CD
Louis Perrochon (Google)
Average rating: ****.
(4.50, 4 ratings)
Crunch 40 years' worth of daily global satellite data at the push of a button, perform spatial analyses on GBs of your own GIS data and securely share the results privately or publish to 1B Google Earth users. This talk will focus on how what was once the realm of a few is now easily and intuitively accessible from the comfort of your Chrome browser. Read more.

4:00pm

Add to your personal schedule
Data Science Ballroom AB
Michael Bean (Forio Simulations)
Average rating: ****.
(4.50, 2 ratings)
Julia is a new mathematical programming language that is scalable, high-performance, and open source. Julia is fast, approaching and often matching the performance of C/C++, easy to learn, and designed for distributed computation. This session will demonstrate some of the special capabilities of Julia and give you the tools you need to get started using this exciting technical computing language. Read more.
Add to your personal schedule
Hadoop in Practice Great America Ballroom K
Jayant Shekhar (Cloudera Inc)
Average rating: ***..
(3.60, 5 ratings)
This talk dives into the extreme details of Building Recommendation Platforms. It covers the end to end Architecture and Design of such a system. It dives into the various ML Algorithms to be used along with their details. It also covers the Solutions to commonly seen Recommendation Patterns and detailed Use Cases along with their Solution. Read more.
Add to your personal schedule
Dean Malmgren (Datascope Analytics), Michael Stringer (Datascope Analytics)
Electronic discovery has transformed the way cases are litigated. Gone are the days of manual review, where litigators spent days poring over emails, messages, and documents. Today's e-discovery technologies mine through vast troves of information, looking for the needle in the proverbial haystack that will blow a case wide open. Read more.
Add to your personal schedule
Nadav Aharony (Behavio)
Average rating: ****.
(4.29, 7 ratings)
Today's smartphones have evolved into incredibly rich sensing and computing devices that can be used to infer complex and interesting things about us, our environment, and our communities. This talk will give an overview of user-centric, continuous mobile sensing, and our work, originating at the MIT Media Lab, to develop open tools to democratize this capability. Read more.
Add to your personal schedule
Sponsored Sessions Ballroom G
Natasha Gajic (Rackspace)
Come learn about ACG, Analytical Compute Grid, a solution Rackspace built leveraging OpenStack, Big Data and NoSQL to help end users manage complex information and data. Read more.
Add to your personal schedule
Sponsored Sessions Ballroom H
Sanjai Marimadaiah (Hewlett Packard), Luis Maldonado (HP Vertica)
Learn how HP has established itself as the premier Big Data vendor with a solid portfolio of turnkey solutions that can be deployed faster than ever, while keeping acquisition and operational costs down. Learn more at hp.com/go/information. Read more.
Add to your personal schedule
Beyond Hadoop Great America Ballroom J
Eric Tschetter (Metamarkets), Danny Yuan (Netflix Platform Engineering Team)
This talk will discuss how Druid allows users to have interactive queries on real-time data at scale; we feature a case study with Netflix leveraging Druid to obtain at-the-moment insight as it ingests over two terabytes per hour. Read more.
Add to your personal schedule
Design Ballroom CD
Alexander Gray (Skytree, Inc.), Monica Rogati (Jawbone), Julie Steele (O'Reilly Media, Inc.), Douglas van der Molen (ClearStory Data)
Average rating: ****.
(4.33, 3 ratings)
The Great Debate series returns to Strata. In this Oxford-style debate, two opposing teams take opposing positions. We poll the audience, and the teams try to sway opinions. It'll be a fast-paced, sometimes irreverent look at some of the core challenges of putting data to work. Read more.

4:50pm

Add to your personal schedule
Mission City Ballroom
This presentation will be streamed live.
Kenneth Cukier (The Economist)
Average rating: ****.
(4.40, 5 ratings)
As big data makes inroads into all aspects of society, how governments regard the technology will be critical for its success. If the past is a guide, the state will embrace big data for its own uses (both good and ill). It will recognize that its authority is threatened and lash out. Read more.

5:10pm

Add to your personal schedule
Mission City Ballroom
This presentation will be streamed live.
Sasha Issenberg (The Victory Lab)
Average rating: **...
(2.50, 4 ratings)
The Victory Lab presents a secret history of modern American politics, pulling back the curtain on the tactics and strategies used by some of the era's most important figures, including Barack Obama and Mitt Romney, with iconoclastic insights into human decision-making, marketing and how analytics can put any business on the road to victory. Read more.

Sponsors

Sponsorship Opportunities

For information on exhibition and sponsorship opportunities at the conference, contact Susan Stewart at sstewart@oreilly.com

Media Partner Opportunities

For information on trade opportunities with O'Reilly conferences, contact Kathy Yu at mediapartners@oreilly.com

Press and Media

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com
