Strata 2012 Schedule

Below are the confirmed and scheduled talks at Strata 2012 (schedule subject to change).

Customize Your Own Schedule

Create your own Strata schedule using the personal scheduler function. Mark the tutorials, sessions, keynotes, and events you want to attend by clicking on the calendar icon next to each listing. Then click on "personal schedule" below to generate your own customized schedule.

GA J
10:40am RHadoop, R meets Hadoop Antonio Piccolboni (Per data LLC)
11:30am Monitoring Apache Hadoop - a big data problem? Henry Robinson (Cloudera)
1:30pm How to develop Big Data Pipelines for Hadoop Mark Pollack (SpringSource/VMware)
4:50pm Analyzing Hadoop Source Code with Hadoop Stefan Groschupf (Datameer)
Ballroom E
11:30am Understanding Social Contagion Marcel Salathé (Penn State University)
1:30pm Changing Data Standards from Wall Street to DC and Beyond John Mulholland (Fannie Mae)
2:20pm Big Data: Wall Street Style Jen Zeralli (S&P Capital IQ), Jeff Sternberg (S&P Capital IQ)
4:00pm Big Data = Bigger Metadata Ian White (Urban Mapping, Inc)
Ballroom G
11:30am Creating Real Business Value with Big Data Analytics Mike Maxey (Greenplum), Katrin Ribant (Havas Digital), Jeff Carey, Keaton Adams (McAfee)
1:30pm Synergies of Column Storage and Map Reduce for Big Data Analytics Jim Tommaney (InfiniDB), Fernanda Foertter (Genus plc)
2:20pm Amazon DynamoDB: A seamlessly scalable NoSQL service Swaminathan Sivasubramanian (Amazon Web Services)
4:00pm Getting the Most from Your Hadoop Big Data Cluster Rohit Valia (Platform Computing)
4:50pm TBC
Mission City B1
10:40am Dealing With Bad Data Q Ethan McCallum (@qethanm)
11:30am Street Fighting Data Science Peter Skomoroch (Data Wrangling)
1:30pm Data Ingest, Linking, and Data Integration via Automatic Code Generation Tony Middleton (HPCC Systems from LexisNexis Risk Solutions)
2:20pm Disambiguation: Embrace wrong answers & find truth Philip (Flip) Kromer (CSC)
4:00pm Netflix recommendations: beyond the 5 stars Xavier Amatriain (Netflix)
4:50pm Data Science in Product Development Joris Poort (Startup)
Ballroom H
10:40am Turning Big Data Into Competitive Advantage Eddie Satterly (Splunk), Sanjay Mehta (Splunk)
11:30am Unleash Insights On All Data With Microsoft Big Data Alexander Stojanovic (Microsoft), Martin Hall (Karmasphere), Eric Baldeschwieler (Independent)
1:30pm SQLFire - An Ultra-fast, Memory-optimized Distributed SQL Database Carter Shanklin (VMware), Jags Ramnarayan (VMware)
4:50pm TBC
Mission City B4
10:40am Mo’ Data, Mo’ Problems Josh Green (Panjiva)
11:30am Business Management Strategies for Big Data Dave Rubin (Oracle)
1:30pm Becoming a Data-Driven Organization Martin Hall (Karmasphere), Ron Bodkin (Think Big Analytics)
4:00pm Analytics in a Community-Driven Fashion Retailer Kuntal Malia (ModCloth), Kate Zimmerman (ModCloth)
4:50pm Data Science in Marketing Analytics Christopher Berry (Syncapse)
Ballroom AB
10:40am Science of Visualization Jock Mackinlay (Tableau Software)
11:30am Effective Data Visualization Hjalmar Gislason (DataMarket)
1:30pm Building a Data Narrative: Discovering Haight Street Jesper Andersen (Bloom Studios)
2:20pm Crafting Meaningful Data Experiences Bitsy Bentley (GfK Custom Research)
4:00pm Roll Your Own Front End: A Survey of Creative Coding Frameworks Michael Edgcumbe (Columbia University), Eric Mika (The Department of Objects)
4:50pm Sketching With Data Fabien Girardin (Near Future Laboratory)
Ballroom CD
10:40am The Future of Hadoop: Becoming an Enterprise Standard Eric Baldeschwieler (Independent)
11:30am I Didn't Know You Could Do All that with Hadoop Jack Norris (MapR Technologies)
1:30pm Collaborative Filtering using MapReduce Sam Shah (LinkedIn)
2:20pm Hadoop + JavaScript: what we learned Asad Khan (Microsoft)
4:00pm Architecting Virtualized Infrastructure for Big Data Richard McDougall (VMware)
4:50pm Aggregating and serving local places data and ads at Citygrid Ana Martinez (CityGrid Media), Kin Lane (API Evangelist)
8:45am Plenary
Room: Mission City Ballroom
Welcome Edd Dumbill (Silicon Valley Data Science), Alistair Croll (Solve For Interesting)
8:50am Plenary
Room: Mission City Ballroom
The Apache Hadoop Ecosystem Doug Cutting (Cloudera)
9:00am Plenary
Room: Mission City Ballroom
Do We Have The Tools We Need To Navigate The New World Of Data? Dave Campbell (Microsoft)
9:10am Plenary
Room: Mission City Ballroom
Decoding the Great American ZIP myth Abhishek Mehta (Tresata)
9:20am Plenary
Room: Mission City Ballroom
Guns, Drugs and Oil: Attacking Big Problems with Big Data Mike Olson (Cloudera)
9:30am Plenary
Room: Mission City Ballroom
Machine Learning and Big Data: Sustainable Value or Hype? Flavio Villanustre (LexisNexis Risk Solutions and HPCC Systems)
9:35am Plenary
Room: Mission City Ballroom
Learning Analytics: What Could You Do With Five Orders of Magnitude More Data About Learning? Steve Schoettler (Junyo)
9:40am Plenary
Room: Mission City Ballroom
A Big Data Imperative: Driving Big Action Avinash Kaushik (Market Motive)
9:55am Plenary
Room: Mission City Ballroom
The Information Architecture of Medicine is Broken Ben Goldacre (Bad Science)
10:10am Morning Break
Room: Exhibit Hall
12:10pm Lunch sponsored by EMC²
Room: Exhibit Hall
Wednesday Lunchtime BoF Tables
3:00pm Afternoon Break sponsored by MarkLogic
Room: Exhibit Hall
5:30pm Plenary
Room: Exhibit Hall
Expo Hall Reception
8:00am Coffee Break sponsored by NetApp
Room: Mission City Ballroom Foyer
6:30pm Plenary
Room: Mission City Ballroom Foyer
Strata 2012 Startup Showcase
10:40am-11:20am (40m) Hadoop & Big Data: Tech
RHadoop, R meets Hadoop
Antonio Piccolboni (Per data LLC)
R and Hadoop, the two hottest stars on the Analytics stage, were meant to be together. The open source RHadoop project was established to make it happen. We'll go over what RHadoop does for you, how to use it, and why you should add it to your toolset.
11:30am-12:10pm (40m) Hadoop & Big Data: Tech
Monitoring Apache Hadoop - a big data problem?
Henry Robinson (Cloudera)
At Cloudera, we've found that monitoring Apache Hadoop is itself a big data problem. Here I'll present work we've been doing on turning the vast amounts of monitoring data a Hadoop cluster generates into meaningful signals to help us wrestle with the biggest challenges of maintaining large distributed systems: failure of machines, processes and people, and root-cause analysis after-the-fact.
1:30pm-2:10pm (40m) Hadoop & Big Data: Tech
How to develop Big Data Pipelines for Hadoop
Mark Pollack (SpringSource/VMware)
Hadoop is not an island. To deliver a complete Big Data solution, a data pipeline needs to be developed that incorporates and orchestrates many diverse technologies. Using an example of real-time weblog processing, in this session we will demonstrate how the open source Spring Batch and Spring Integration projects can be used to build manageable and robust pipeline solutions around Hadoop.
2:20pm-3:00pm (40m) Hadoop & Big Data: Tech
How Crunch Makes Writing, Testing and Running of MapReduce Pipelines Easy, Efficient and Even Fun!
Josh Wills (Cloudera)
Cloudera Data Scientist Josh Wills will share insights and “how to” tricks about Crunch, a Java library that aims to make writing, testing and running MapReduce pipelines that run over any type of data easy, efficient and even fun.
4:00pm-4:40pm (40m) Hadoop & Big Data: Tech
Hadoop Plugin for MongoDB: The Elephant in the Room
Steve Francia (10gen)
Learn how to integrate MongoDB with Hadoop for large-scale distributed data processing.
4:50pm-5:30pm (40m) Hadoop & Big Data: Tech
Analyzing Hadoop Source Code with Hadoop
Stefan Groschupf (Datameer)
Using Hadoop-based business intelligence analytics, we analyzed Hadoop source code over time. This talk illustrates text and related analytics with Hadoop on Hadoop to reveal the true hidden secrets of the elephant. This entertaining session highlights the value of data correlation across multiple datasets and the visualization of those correlations to reveal hidden data relationships.
10:40am-11:20am (40m) Domain Data
Exploring Social Data: Use Cases for Real-World Application
Chris Moody (Gnip)
With billions of social activities passing through the ever-growing realtime social web each day, companies are beginning to harness the power of social data. In this session, participants will learn from real-world case studies in Financial Services, Emergency Response, Brand Analytics and other industries about how businesses are applying social data to their operations to drive value.
11:30am-12:10pm (40m) Domain Data
Understanding Social Contagion
Marcel Salathé (Penn State University)
Who influences whom? Data science can help answer this question, which is of fundamental importance to business, politics, public health, and many other fields.
1:30pm-2:10pm (40m) Domain Data
Changing Data Standards from Wall Street to DC and Beyond
John Mulholland (Fannie Mae)
Pascal Boillat, Fannie Mae’s Chief Information Officer, will address how changing data standards and implementation strategies are having a profound effect on the financial services industry.
2:20pm-3:00pm (40m) Domain Data
Big Data: Wall Street Style
Jen Zeralli (S&P Capital IQ) et al
Topics will span the data flow lifecycle from data collection, curation and quality, to aggregation and standardization of a multitude of complex data sources, to the creation of valuable analytics, including recommendations that connect users to the data.
4:00pm-4:40pm (40m) Domain Data
Big Data = Bigger Metadata
Ian White (Urban Mapping, Inc)
Federal transparency initiatives have spawned millions of rows of data; state and local programs engage developers and wonks with APIs, contests, and data galore. Private industry offers attribute-laden device exhaust, forming a geo-footprint of who is going where, when, how and (maybe) for what. Who decides data provenance? Does curated data get treated the same as heterogeneous data?
4:50pm-5:30pm (40m) Domain Data
Linked Data: Turning the Web into a Context Graph
Leigh Dodds (Kasabi)
Facebook's Open Graph, Schema.org, and a recent scramble towards a "Rosetta Stone" for geodata are all examples of a trend towards linking data across the web. Weaving data into the web simplifies integration. Big Data offers ways to mine huge datasets for insight. Linked Data turns the web into a dataset.
10:40am-11:20am (40m) Sponsored Session
Data as a Strategic Weapon - Walmart, Netflix and Apigee Panel Discussion
Billy Bosworth (DataStax)
In this panel discussion, DataStax CEO Billy Bosworth will spotlight real mission-critical Big Data use cases from "hands-on" practitioners. With companies like Walmart, Netflix, and Apigee, among many others, adopting Apache Cassandra and other new database technologies, there's never been a more exciting time to be building data-intensive applications.
11:30am-12:10pm (40m) Sponsored Session
Creating Real Business Value with Big Data Analytics
Mike Maxey (Greenplum) et al
The race is on to create the next competitive advantage. Attend this customer session for a brief introduction to Greenplum’s Big Data Analytics platform.
1:30pm-2:10pm (40m) Sponsored Session
Synergies of Column Storage and Map Reduce for Big Data Analytics
Jim Tommaney (InfiniDB) et al
Advances in columnar databases are creating bio-science opportunities that were previously not possible. Fernanda Foertter and the team at Genus discovered an innovative way to store and access the huge volumes of data being generated modeling genotypes. She and Jim Tommaney discuss the benefits of column storage and how InfiniDB’s Map Reduce empowers high performance Big Data analytics.
2:20pm-3:00pm (40m) Sponsored Session
Amazon DynamoDB: A seamlessly scalable NoSQL service
Swaminathan Sivasubramanian (Amazon Web Services)
Running large scale datastores requires us to handle various challenges such as scalability, reliability, performance, and reduced operational overhead. In this talk, we will discuss how Amazon DynamoDB was designed to address these problems.
4:00pm-4:40pm (40m) Sponsored Session
Getting the Most from Your Hadoop Big Data Cluster
Rohit Valia (Platform Computing)
This session looks at the requirements for a multi-tenant big data cluster: one where different lines of businesses, different projects, and multiple applications can be run with assured SLAs, resulting in higher utilization and ROI for these clusters.
4:50pm-5:30pm (40m)
Session
To be confirmed
10:40am-11:20am (40m) Data Science
Dealing With Bad Data
Q Ethan McCallum (@qethanm)
The biggest problem in data science is ... the data itself.
11:30am-12:10pm (40m) Data Science
Street Fighting Data Science
Peter Skomoroch (Data Wrangling)
New analysts or engineers are often lost when textbook approaches fail on real-world data. Drawing inspiration from problem-solving techniques in mathematics and physics, we will walk through examples that illustrate how to come up with creative solutions and solve real-world problems with data.
1:30pm-2:10pm (40m) Data Science
Data Ingest, Linking, and Data Integration via Automatic Code Generation
Tony Middleton (HPCC Systems from LexisNexis Risk Solutions)
How to simplify the data integration process and save a significant amount of development time by automatically generating code for processes (data profiling, data cleansing, and record linkage). A case study will show a complex, Big Data linking application, where insurance data was converted to HPCC using the SALT tool and reduced 20,000+ lines of source code to a 48-line SALT specification.
2:20pm-3:00pm (40m) Data Science
Disambiguation: Embrace wrong answers & find truth
Philip (Flip) Kromer (CSC)
Instead of working too hard to define the parameters in an attempt to completely remove the ambiguity, look at what people do, interact with and talk about. We can watch what people do and decide from there what a coffee shop is and where the boundaries of your neighborhood are. It might not be the “truth”, but it can be darn close.
4:00pm-4:40pm (40m) Data Science
Netflix recommendations: beyond the 5 stars
Xavier Amatriain (Netflix)
Netflix is known for pushing the envelope of recommendation technologies. The Netflix Prize put a spotlight on recommender system research and a focus on predicting ratings. But, predicting a rating is only part of the recommendation problem. In this talk I will describe how other sources of implicit and contextualized information can be used to create a personalized experience.
4:50pm-5:30pm (40m) Data Science
Data Science in Product Development
Joris Poort (Startup)
Data science applied in engineering-driven industries is revolutionizing how highly complex products are developed. Unprecedented access to computing power combined with advanced data science tools provides the opportunity to not only increase the speed of development but also improve the final design. Using a practical aerospace example, Joris will illustrate the tools and techniques described.
10:40am-11:20am (40m) Sponsored Session
Turning Big Data Into Competitive Advantage
Eddie Satterly (Splunk) et al
In this session, Expedia, one of the world’s leading online travel companies, describes how they tapped into their massive machine data to deliver unprecedented insights across key IT and business areas – from ad metrics and risk analysis, to capacity planning, security, and availability analysis.
11:30am-12:10pm (40m) Sponsored Session
Unleash Insights On All Data With Microsoft Big Data
Alexander Stojanovic (Microsoft) et al
Microsoft's Big Data solution turns signals into information. Learn how Microsoft's Hadoop service and rich BI capabilities can drive your business forward.
1:30pm-2:10pm (40m) Sponsored Session
SQLFire - An Ultra-fast, Memory-optimized Distributed SQL Database
Carter Shanklin (VMware) et al
Today's users won't tolerate slow applications. More often than not, the database is the bottleneck in the application. Learn how VMware vFabric SQLFire can give you the speed and scale you need in a substantially simpler way. SQLFire is a memory-optimized and horizontally-scalable distributed SQL database. Attend this session to learn how SQLFire gives high performance without the complexity.
2:20pm-3:00pm (40m) Sponsored Session
MapReduce for the Rest of Us: Unlocking Data Science for the Business User
Tasso Argyros (Teradata Aster)
This session will explore a new class of analytic platforms and technologies such as SQL-MapReduce® which bring the science of data to the art of business. By fusing standard business intelligence and analytics with next-generation data processing techniques such as MapReduce, big data analysis is no longer just in the hands of the few data science or MapReduce specialists in an organization!
4:00pm-4:40pm (40m) Sponsored Session
Automated Understanding – The Next Evolution in Big Data Analytics
Tim Estes (Digital Reasoning)
Data Scientists must deal with many Big Data challenges including volume, velocity and variety of data. These challenges require a new solution - Automated Understanding - a new evolution in software. In this session Tim Estes will show the power of this new capability on a large and valuable dataset that has never been deeply understood by software before.
4:50pm-5:30pm (40m)
Session
To be confirmed
10:40am-11:20am (40m) Business & Industry
Mo’ Data, Mo’ Problems
Josh Green (Panjiva)
Despite the hype, Big Data has yet to live up to its potential. Why? Because we’ve spent too much time thinking about the data itself and not enough time considering which business decisions can be improved through the intelligent application of data. Panjiva CEO Josh Green will discuss an alternative approach: starting with a challenging business problem and then tracking down relevant data.
11:30am-12:10pm (40m) Business & Industry
Business Management Strategies for Big Data
Dave Rubin (Oracle)
A revolution is at hand, centered on the groundswell of data, and it will change how we run our businesses through greater efficiencies, new revenue discovery, and even new innovation. It is the revolution of Big Data. Management Strategies for Big Data will explain this new wave of technology and provide a roadmap for businesses to take advantage of this growing trend.
1:30pm-2:10pm (40m) Business & Industry
Becoming a Data-Driven Organization
Martin Hall (Karmasphere) et al
While enterprises see an opportunity to increase revenues and decrease costs by becoming a data-driven organization, it is not easy to decide where and how to begin. This session highlights some principles for success through examining two real-world big data case studies.
2:20pm-3:00pm (40m) Business & Industry
Building a Data Strategy: Data Enabling Toys at Leapfrog
Larry Murdock (Lyris)
Leapfrog enabled their learning toys and set up a system to have millions of toy owners upload their play logs. This talk covers the business strategy and the technical implementation hurdles from the perspective of the former Director of Data Services who implemented it.
4:00pm-4:40pm (40m) Business & Industry
Analytics in a Community-Driven Fashion Retailer
Kuntal Malia (ModCloth) et al
Learn how data is used at a fashion retailer that is on a rapid growth path. At ModCloth we don't believe in dictating fashion trends to our customer; we are inverting the pyramid and democratizing fashion. Buying patterns and user interactions are leveraged to help us understand how we can meet our customers' desires.
4:50pm-5:30pm (40m) Business & Industry
Data Science in Marketing Analytics
Christopher Berry (Syncapse)
Moneyball is to marketing science as CSI is to forensic science. The expectations are high and marketers are shouting "where's the insight?" and "ENHANCE!". Data is long and marketing scientists are short. We can only scale through technology. This is the story of how a developer and two marketing scientists became data scientists in crossing that gap.
10:40am-11:20am (40m) Visualization & Interface
Science of Visualization
Jock Mackinlay (Tableau Software)
Visual analysis is an iterative process for working with data that exploits the power of the human visual system. The formal core of visual analysis is the mapping of data to appropriate visual representations. Learn what years of research have taught us about designing visualizations people can learn from and understand.
11:30am-12:10pm (40m) Visualization & Interface
Effective Data Visualization
Hjalmar Gislason (DataMarket)
With the rise of big data, more and more people need effective visualizations. Needs may range from simple charts to massive interactive network graphs. A range of tools exists, but many still find none that meets all their requirements: ask for cross-browser usage, server-side rendering, iOS support, and full control of look and feel, and your options are suddenly very slim. We share our lessons and approach.
1:30pm-2:10pm (40m) Visualization & Interface
Building a Data Narrative: Discovering Haight Street
Jesper Andersen (Bloom Studios)
See how applying traditional data analysis tools, as well as more esoteric ones like computer vision, to multiple disparate data sets and data types can create a more complete and nuanced narrative of one of San Francisco’s most vibrant streets.
2:20pm-3:00pm (40m) Visualization & Interface
Crafting Meaningful Data Experiences
Bitsy Bentley (GfK Custom Research)
Data visualization is just one tool that designers use to communicate data-driven recommendations. In this session I present a case study on the use of user-centered design practices to craft meaningful and actionable data presentations for business users. Data visualization and UX work best when they work together.
4:00pm-4:40pm (40m) Visualization & Interface
Roll Your Own Front End: A Survey of Creative Coding Frameworks
Michael Edgcumbe (Columbia University) et al
Custom data exploration tools can provide efficient and exciting interfaces for audiences not well served by out-of-the-box business intelligence solutions. Frameworks not only beautify data but also surface novel observations from the set. In this session, we survey the creative coding frameworks that lend themselves to visualization and offer some insight into their strengths and weaknesses.
4:50pm-5:30pm (40m) Visualization & Interface
Sketching With Data
Fabien Girardin (Near Future Laboratory)
In this talk we report on the value of tools that support a human-driven approach to revealing innovation opportunities hidden within big datasets. Based on our experience in data science projects involving multiple stakeholders, we found that sketching with data and rapidly sharing interactive information visualizations is a key practice to transform information into useful services and products.
10:40am-11:20am (40m) Hadoop & Big Data: Applied
The Future of Hadoop: Becoming an Enterprise Standard
Eric Baldeschwieler (Independent)
In this session, Hortonworks CEO Eric Baldeschwieler will look at the current state of Apache Hadoop, how the ecosystem is evolving by working together to close the existing technological and knowledge gaps, and present a roadmap for the future of the project.
11:30am-12:10pm (40m) Hadoop & Big Data: Applied
I Didn't Know You Could Do All that with Hadoop
Jack Norris (MapR Technologies)
This session will draw on numerous customer examples to reveal powerful tips, tricks, and in-depth use cases to show how Hadoop can easily integrate, scale, and analyze important data.
1:30pm-2:10pm (40m) Hadoop & Big Data: Applied
Collaborative Filtering using MapReduce
Sam Shah (LinkedIn)
In this talk, we'll build a complete, scalable collaborative filtering ("people who X also Y") system that is almost identical to what prominent Internet properties use today. We'll talk about model improvements, performance enhancements, and practical considerations. This is a practical talk accessible to all.
2:20pm-3:00pm (40m) Hadoop & Big Data: Applied
Hadoop + JavaScript: what we learned
Asad Khan (Microsoft)
As more companies adopt Hadoop to perform data intensive tasks for large data sets, there is a burning need to make Hadoop available to a broader set of developers. This talk covers two approaches Microsoft is exploring for this purpose: 1. JavaScript interfaces to run Hadoop jobs and 2. web interfaces for Hadoop that let developers write and run MapReduce jobs from any platform.
4:00pm-4:40pm (40m) Data Science
Architecting Virtualized Infrastructure for Big Data
Richard McDougall (VMware)
How do you architect big data systems that leverage virtualization and platform as a service? We will walk through a layered approach to building a unified analytics platform using virtualization, provisioning tools and platform as a service.
4:50pm-5:30pm (40m) Data Science
Aggregating and serving local places data and ads at Citygrid
Ana Martinez (CityGrid Media) et al
Learn how Citygrid built a world-class platform to aggregate the data powering its publicly available local places, content and ads APIs using Hadoop, Solr and MongoDB.
8:45am-8:50am (5m) Keynote
Welcome
Edd Dumbill (Silicon Valley Data Science) et al
Opening remarks by the Strata program chairs, Edd Dumbill and Alistair Croll.
8:50am-9:00am (10m) Keynote
The Apache Hadoop Ecosystem
Doug Cutting (Cloudera)
Apache Hadoop forms the kernel of an operating system for Big Data. This ecosystem of interdependent projects enables institutions to affordably explore ever vaster quantities of data. The platform is young, but it is strong and vibrant, built to evolve.
9:00am-9:10am (10m) Keynote
Do We Have The Tools We Need To Navigate The New World Of Data?
Dave Campbell (Microsoft)
The explosion of data is both a challenge and opportunity for businesses. In order to thrive in this new world, organizations will need a technical strategy for sifting through all of this data and driving insights.
9:10am-9:20am (10m) Keynote
Decoding the Great American ZIP myth
Abhishek Mehta (Tresata)
How big data tools and technologies give us back our individual identity ... because if you didn't know you were unique and special, well, you are. Big data can be applied to solving socio-economic problems that rival the scale and importance of building ad optimization models.
9:20am-9:30am (10m) Keynote
Guns, Drugs and Oil: Attacking Big Problems with Big Data
Mike Olson (Cloudera)
Tools for attacking big data problems originated at consumer internet companies, but the number and variety of big data problems have spread across industries and around the world. I'll present a brief summary of some of the critical social and business problems that we're attacking with the open source Apache Hadoop platform.
9:30am-9:35am (5m) Keynote
Machine Learning and Big Data: Sustainable Value or Hype?
Flavio Villanustre (LexisNexis Risk Solutions and HPCC Systems)
Back in the late 80s artificial intelligence was set to take over the world; it didn’t happen. In 2012, AI has been stripped down, dressed up and reborn as machine learning. Will it take over the world this time? What makes a Big Data - Machine Learning solution ‘better’?
9:35am-9:40am (5m) Keynote
Learning Analytics: What Could You Do With Five Orders of Magnitude More Data About Learning?
Steve Schoettler (Junyo)
The increasing use of online software and digital devices in the classroom provides a source of high-frequency data streams that can be analyzed to better understand student progress, identify individual needs, and develop personal recommendations.
9:40am-9:55am (15m) Keynote
A Big Data Imperative: Driving Big Action
Avinash Kaushik (Market Motive)
So you've hoarded the world's data within your enterprise. Now what? Author and digital marketing evangelist Avinash Kaushik shares lessons from the nascent world of Web Analytics on how multiplicity, scale and outsourcing power a data democracy, and how that in turn drives business action.
9:55am-10:10am (15m) Keynote
The Information Architecture of Medicine is Broken
Ben Goldacre (Bad Science)
Negative results from clinical trials go missing far too often, leading us to overestimate the benefits of treatments. Attempts to remedy this problem haven't worked well. Ben Goldacre, both a doctor and data geek, will talk about how to fix this, and other, problems in medicine.
10:10am-10:40am (30m)
Break: Morning Break
12:10pm-1:30pm (1h 20m) Event
Wednesday Lunchtime BoF Tables
Birds of a Feather (BoF) sessions are informal roundtable discussions happening during lunch on Wed 2/29 and Thu 3/1. You can join any BoF table or start your own with a topic of your choice. The BoF sign-up board will be near the Registration area.
3:00pm-4:00pm (1h)
Break: Afternoon Break sponsored by MarkLogic
5:30pm-6:30pm (1h)
Expo Hall Reception
Grab a drink, mingle with fellow Strata participants, and see the latest technologies and products from leading companies in the data space.
8:00am-8:45am (45m)
Break: Coffee Break sponsored by NetApp
6:30pm-8:00pm (1h 30m) Event
Strata 2012 Startup Showcase
Don't miss Startup Showcase, Strata's live demo program and competition for startups and early-stage companies. With a panel of industry experts providing real-time feedback, Startup Showcase happens during Strata Conference on Wednesday, February 29, 2012.

Sponsors

  • EMC
  • Microsoft
  • HPCC Systems™ from LexisNexis® Risk Solutions
  • MarkLogic
  • Shared Learning Collaborative
  • Cloudera
  • Digital Reasoning Systems
  • Pentaho
  • Rackspace Hosting
  • Teradata Aster
  • VMware
  • IBM
  • NetApp
  • Oracle
  • 1010data
  • 10gen
  • Acxiom
  • Amazon Web Services
  • Calpont
  • Cisco
  • Couchbase
  • Cray
  • Datameer
  • DataSift
  • DataStax
  • Esri
  • Facebook
  • Feedzai
  • Hadapt
  • Hortonworks
  • Impetus
  • Jaspersoft
  • Karmasphere
  • Lucid Imagination
  • MapR Technologies
  • Pervasive
  • Platform Computing
  • Revolution Analytics
  • Scaleout Software
  • Skytree, Inc.
  • Splunk
  • Tableau Software
  • Talend

For information on exhibition and sponsorship opportunities at the conference, contact Susan Stewart at sstewart@oreilly.com.

For information on trade opportunities with O'Reilly conferences, contact Kathy Yu at mediapartners@oreilly.com.

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com
