Strata New York 2011 Schedule

Below is the preliminary schedule for Strata Conference New York. We'll be confirming more sessions and adding them to this schedule in the coming weeks.

Customize Your Own Schedule

Create your own Strata New York schedule using the personal scheduler. Mark the sessions, keynotes, and events you want to attend by clicking the calendar icon next to each listing, then click "personal schedule" below to generate your customized schedule.

Sutton North
10:40am Big Data, Emergency Management and Business Continuity Jeannie Stamberger (Carnegie Mellon Silicon Valley Disaster Mgmt Initiative)
11:30am Extracting Microbial Threats From Big Data Robert Munro (Idibon)
1:40pm HunchWorks: Combining human expertise and big data Chris van der Walt (United Nations Global Pulse), Dane Petersen (Adaptive Path), Sara Farmer (UN Global Pulse)
2:30pm Creating a fact-based decision making culture in organizations Amaresh Tripathy (PricewaterhouseCoopers)
4:10pm Gaining New Insights from Massive Amounts of Machine Data Jake Flomenberg (Splunk), Denise Hemke (Salesforce.com)
Sutton South
10:40am Optimising scarce resources using real-time decision making Alasdair Allan (The Thing System, Inc.)
11:30am Navigating the Data Pipeline Tim Moreton (Acunu)
1:40pm The Accidental Chief Privacy Officer Jim Adler (Metanautix)
2:30pm Journey or Destination: Using Models to Explore Big Data Ben Gimpert (Altos Research)
4:10pm Data as the Building Block at Foursquare Justin Moore (Facebook)
Murray Hill Suite A
10:40am Chart Wars: The Political Power of Data Visualization Alex Lundry (TargetPoint Consulting)
1:40pm Big Data Use Cases in the Cloud Peter Sirota (Amazon Web Services), Justin Moore (Facebook)
2:30pm Google Cloud for Data Crunchers Ryan Boyd (Google), Chris Schalk (Google)
4:10pm Data Environmentalism Trevor Hughes (International Association of Privacy Professionals)
5:00pm Hazarding a Guess: ethical, legal, and policy issues in analytics and big data applications Solon Barocas (New York University), Betsy Masiello (Google), Jane Yakowitz (Brooklyn Law School)
Murray Hill Suite B
10:40am LexisNexis: Reinventing New Business with Big Data Ron Avnur (MarkLogic), Mark Rodgers (LexisNexis)
11:30am Big Data Revolution: Benefit from MapReduce Without the Risk Ted Dunning (MapR Technologies)
2:30pm Assembling Data to Fight Breast Cancer Roger Magoulas (O'Reilly Media), Anthony Goldbloom (Kaggle), Trajan Bayly (GE), Nuala O'Connor Kelly (GE), Abdul Shaikh (National Cancer Institute)
4:10pm Big Data Architectures 2.0: Beyond the Elephant Ride Vineet Tyagi (Impetus Technologies)
5:00pm TBC
10:15am Break sponsored by MarkLogic
Room: Gramercy Suite
3:10pm Break sponsored by Cloudera
Room: Gramercy Suite
12:10pm Lunch sponsored by Tableau Software
Room: Rhinelander Gallery
Friday Lunchtime BoF Sessions
8:45am Plenary
Room: Sutton Parlors
Welcome Alistair Croll (Solve For Interesting), Edd Dumbill (Silicon Valley Data Science)
8:50am Plenary
Room: Sutton Parlors
Doing Good With Data: Data Without Borders Jake Porway (DataKind), Drew Conway (IA Ventures)
9:05am Plenary
Room: Sutton Parlors
First, firster, firstest Mark Madsen (Third Nature)
9:25am Plenary
Room: Sutton Parlors
Health Empowerment through Self-Tracking Anne Wright (CMU)
9:40am Plenary
Room: Sutton Parlors
Big Data, Big Opportunity Ken Bado (MarkLogic)
9:45am Plenary
Room: Sutton Parlors
Short URLs, Big Data: Learning About the World in Realtime Hilary Mason (Accel Partners)
10:00am Plenary
Room: Sutton Parlors
Calling for a New Paradigm: Machines Plus Humans Arnab Gupta (Opera Solutions)
10:40am-11:20am (40m) Business
Big Data, Emergency Management and Business Continuity
Jeannie Stamberger (Carnegie Mellon Silicon Valley Disaster Mgmt Initiative)
A deep dive into the role big data plays in emergency management, from boots on the ground to business continuity, now and in the future. User interfaces for sense-making will also be discussed.
11:30am-12:10pm (40m) Business
Extracting Microbial Threats From Big Data
Robert Munro (Idibon)
We will talk about Global Viral Forecasting's 'EpidemicIQ' project, which tracks all the globe's known and potential disease outbreaks. It is the largest humanitarian application of machine-learning and crowdsourcing to date, dynamically adapting to new threats and data sources in real-time.
1:40pm-2:20pm (40m) Data
HunchWorks: Combining human expertise and big data
Chris van der Walt (United Nations Global Pulse) et al
United Nations Global Pulse and Adaptive Path have been collaborating on a new global crisis impact tool called HunchWorks that allows experts to post hypotheses about emerging crises and crowdsource verification. The presentation will focus on lessons learned from a complex project that combines human expertise and big data algorithms using human-centered design and assistive intelligence.
2:30pm-3:10pm (40m) Business
Creating a fact-based decision making culture in organizations
Amaresh Tripathy (PricewaterhouseCoopers)
Analytical culture is the substrate necessary for organizations to turn the promise of data-driven decisions into reality. Fostering and developing such a culture is hard. The presentation will focus on a set of organizational and solution design principles, gleaned from real-world experience, that have proven very effective in building an analytical culture.
4:10pm-4:50pm (40m) Business
Gaining New Insights from Massive Amounts of Machine Data
Jake Flomenberg (Splunk) et al
This session examines the challenges and approaches for collecting, organizing and extracting value from machine data – the data generated continuously by all IT systems containing a record of all activity and behavior. Harnessing this data can provide valuable new insights for both IT and business users. This session will be hosted by Splunk's CIO and Salesforce.com.
5:00pm-5:40pm (40m) Business, Data
Why MongoDB Was Created: What I Wish I Knew at DoubleClick
Dwight Merriman (10gen)
This session will introduce the history and philosophy of MongoDB. We'll also review a few key use cases for NoSQL and MongoDB in particular.
10:40am-11:20am (40m) Real-time
Optimising scarce resources using real-time decision making
Alasdair Allan (The Thing System, Inc.)
The recent integration of previously isolated telescopes into expanding smart telescope networks, spanning continents and responding to transient events in seconds, sees novel architectures emerging to help filter science from data in real time. Machine learning and collective decision making are used by these geographically distributed networks to optimise output in the face of scarce resources.
11:30am-12:10pm (40m) Real-time
Navigating the Data Pipeline
Tim Moreton (Acunu)
At the heart of every system that harnesses big data is a pipeline that comprises collecting large volumes of raw data, extracting value from it through analytics or data transformations, then delivering that condensed set of results back out, potentially to millions of users. This talk examines the challenges of building manageable, robust pipelines.
1:40pm-2:20pm (40m) Policy & Ethics
The Accidental Chief Privacy Officer
Jim Adler (Metanautix)
A new breed of chief privacy officer (CPO) is emerging: an engineer who is comfortable in a product focus group, in an engineering scrum, or analyzing test results. They have the historical awareness, frontier spirit, regulatory caution, and technical chops to work through the toughest data issues. The promise of the engineer CPO is that products not only safeguard privacy but compete on it.
2:30pm-3:10pm (40m) Data
Journey or Destination: Using Models to Explore Big Data
Ben Gimpert (Altos Research)
All big data models are wrong but some are useful, as George Box might have said. Models are not the end result of a big data architecture, but exploratory tools in their own right. They are most useful when data scientists try to understand the business, and when our users learn a bit about data. How can the actual process of modeling improve a big data system, and teach the organization?
4:10pm-4:50pm (40m) Data
Data as the Building Block at Foursquare
Justin Moore (Facebook)
With over 700 million check-ins, 10 million nodes in the social graph, and billions of cumulative signals, Justin Moore will be explaining how foursquare processes, analyzes, and builds products to help people explore the real world.
5:00pm-5:40pm (40m) Data
Taming Data Logistics - the Hardest Part of Data Science
Ken Farmer (IBM)
While most of the focus in data science is on the rapid analysis of vast volumes of data, the hardest part of most solutions is the data acquisition, movement, transformation, and loading - the "data logistics". This presentation will describe the common challenges and solutions - including the best and worst practices that can be reused from Data Warehousing.
10:40am-11:20am (40m) Interface
Chart Wars: The Political Power of Data Visualization
Alex Lundry (TargetPoint Consulting)
Politically charged data visualization emerged over the last election cycle as a provocative and powerful means of persuasive communication. We’ve seen organizational charts used as protest signs and the White House regularly releases infographics. With these political “chart wars” as a backdrop, this presentation will show you how to be a smart consumer of data visualizations and infographics.
11:30am-12:10pm (40m) Interface
Designing Data Visualizations: Telling Stories With Data
Noah Iliinsky (IBM)
A jumpstart lesson on how to get from a blank page and a pile of data to a useful data visualization. We'll focus on the design process, not specific tools. The talk will include discussion of figuring out what story to tell, selecting the right data, and picking appropriate encodings. We'll briefly discuss tools and visualization styles, and look at several examples.
1:40pm-2:20pm (40m) Data
Big Data Use Cases in the Cloud
Peter Sirota (Amazon Web Services) et al
This session will address specific use cases relevant to customers with big data needs. We will highlight customers already successfully utilizing this service as well as showcase top scenarios and explain why it makes sense to leverage the cloud for Big Data needs.
2:30pm-3:10pm (40m) Data
Google Cloud for Data Crunchers
Ryan Boyd (Google) et al
Google is a data business: over the past few years, many of the tools Google created to store, query, analyze, and visualize its data have been exposed to developers as services. This talk will give you an overview of Google services for Data Crunchers.
4:10pm-4:50pm (40m) Policy & Ethics
Data Environmentalism
Trevor Hughes (International Association of Privacy Professionals)
Data flows with little friction; it’s gathered and stored in increasing volume. Consumer concerns include guarding data against misuse and managing how it’s shared. Low-friction data channels are valuable but vulnerable information ecologies. This session shows how data fuels life, and why we must balance regulatory and security controls with recognition of how data flow drives the economy and culture.
5:00pm-5:40pm (40m) Policy & Ethics
Hazarding a Guess: ethical, legal, and policy issues in analytics and big data applications
Solon Barocas (New York University) et al
A panel discussion that will identify and debate the key ethical, legal, and policy issues in analytics and applications of big data. The panelists will hash out some of the novel privacy concerns, but they will also consider issues around autonomy, fairness, fragmentation, and transparency, as well as the appropriate and available responses to them.
10:40am-11:20am (40m) Sponsored Sessions
LexisNexis: Reinventing New Business with Big Data
Ron Avnur (MarkLogic) et al
Ron Avnur, SVP Engineering, MarkLogic, and Mark Rodgers, Sr. Director of Product Engineering, LexisNexis will reveal how LexisNexis is rebuilding its business platform to handle Big Data in real-time. Avnur and Rodgers will discuss what it means to have Big Data, how the global organization addressed that challenge, and the business benefits resulting from the solution.
11:30am-12:10pm (40m) Sponsored Sessions
Big Data Revolution: Benefit from MapReduce Without the Risk
Ted Dunning (MapR Technologies)
In this talk, Dr. Dunning will outline strategies for integrating big data technologies like Hadoop into existing business data systems. The talk will provide vignettes drawn from real-life situations that expose the challenges customers have faced and the solutions that meet these challenges.
1:40pm-2:20pm (40m) Sponsored Sessions
Beyond BI – Transforming Your Business with Big Data Analytics
Steven Hillion (EMC DCD)
Steven Hillion, VP of EMC Greenplum’s Data Analytics Lab, lends insight into emerging technologies for taking advantage of the big data opportunity, and into how big data challenges today’s BI architectures and approaches to data management.
2:30pm-3:10pm (40m) Sponsored Sessions
Assembling Data to Fight Breast Cancer
Roger Magoulas (O'Reilly Media) et al
Panel Discussion on Assembling Data to Fight Breast Cancer
4:10pm-4:50pm (40m) Sponsored Sessions
Big Data Architectures 2.0: Beyond the Elephant Ride
Vineet Tyagi (Impetus Technologies)
Businesses today are moving beyond the buzz and experimentation with the batch processing options of Hadoop and MapReduce, stretching the limits for cutting-edge performance and scalability. This session will cover emerging trends in a new generation of NoHadoop (Not Only Hadoop) architectures for future-proof big data scalability and prepare you for life beyond the elephant ride!
5:00pm-5:40pm (40m)
Session
To be confirmed
10:15am-10:40am (25m)
Break sponsored by MarkLogic
3:10pm-4:10pm (1h)
Break sponsored by Cloudera
12:10pm-1:40pm (1h 30m)
Friday Lunchtime BoF Sessions
Birds of a Feather (BoF) sessions provide face to face exposure to those interested in the same projects and concepts. BoFs can be organized for individual projects or broader topics (best practices, open data, standards). BoF topics are entirely up to you. Sign up on site to lead a conversation during lunch on Friday, September 23.
8:45am-8:50am (5m) Keynote
Welcome
Alistair Croll (Solve For Interesting) et al
Opening remarks by the Strata program chairs, Alistair Croll and Edd Dumbill.
8:50am-9:05am (15m) Keynote
Doing Good With Data: Data Without Borders
Jake Porway (DataKind) et al
Data scientists and technology companies are rapidly recognizing the immense power of data for drawing insights about their impact and operations, yet NGOs and non-profits are increasingly being left behind with mounting data and few resources to make use of it.
9:05am-9:20am (15m) Keynote
First, firster, firstest
Mark Madsen (Third Nature)
The first person to conceive of something is usually not the first. They're the first to re-conceive at a point where the current technology caught up to someone else's idea. We're at a point today where many old ideas are being reinvented. Hear why looking to the past, beyond your core field of interest, is worthwhile.
9:20am-9:25am (5m) Keynote
Dr. Richard Merkin, President and CEO of Heritage Provider Network, Announces the Winner of the First Heritage Health Progress Prize
Richard Merkin (Heritage Provider Network)
Dr. Richard Merkin, President and CEO of Heritage Provider Network, is pleased to announce the winner of the first $3 million Heritage Health Progress Prize.
9:25am-9:40am (15m) Keynote
Health Empowerment through Self-Tracking
Anne Wright (CMU)
The BodyTrack Project is building tools, both technological and cultural, to empower more people to embrace an "investigator" role in their own lives.
9:40am-9:45am (5m) Keynote
Big Data, Big Opportunity
Ken Bado (MarkLogic)
Big Data is more than just volume and velocity. MarkLogic CEO Ken Bado will address why complexity is the key gotcha for organizations trying to outflank their competition by managing Big Data in real time. Learn how winners today are using MarkLogic to manage the complexity of their unstructured information to drive revenue and results.
9:45am-10:00am (15m) Keynote
Short URLs, Big Data: Learning About the World in Realtime
Hilary Mason (Accel Partners)
The flow of data across the social web tells us what people, around the world, are paying attention to at any given moment. Understanding this flow is both a mathematical and a human problem, as we develop and adapt techniques to find stories in the data. Come hear about the expected and the surprises in the bitly data, as well as generalized techniques that apply to any 'realtime' data system.
10:00am-10:15am (15m) Keynote
Calling for a New Paradigm: Machines Plus Humans
Arnab Gupta (Opera Solutions)
This talk will discuss the pitfalls of the man versus machine premise while underscoring the need for man and machine to work together in order to make the most of Big Data. We will also address the need to create a new, visual language to allow humans and machines to realize the full potential of their collaboration.

Sponsors

  • Aster Data
  • EMC Greenplum
  • GE
  • LexisNexis
  • MarkLogic
  • Tableau Software
  • Cloudera
  • DataStax
  • Informatica
  • DataSift
  • Splunk
  • Amazon Web Services
  • Datameer
  • Impetus
  • Karmasphere
  • MapR Technologies
  • Pervasive
  • Platform Computing
  • Revolution Analytics
  • Sybase
  • Xeround
  • Media-Science
  • Platfora

Sponsorship Opportunities

For information on sponsorship opportunities at the conference, contact Susan Stewart at sstewart@oreilly.com

Press & Media

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com

Contact Us

View a complete list of Strata Contacts