Strata New York 2011 Schedule

Below is the preliminary schedule for Strata Conference New York. We'll be confirming more sessions and adding them to this schedule in the coming weeks.

Sutton North
1:40pm Marketing with Data Joseph Adler (Interana, Inc.)
Sutton South
10:40am Dedupe, Merge, and Purge: the Art of Normalization Tyler Bell (Factual), Leo Polovets (Factual)
11:30am Bringing the Rest of the World Into Your Data Warehouse Philip (Flip) Kromer (Infochimps, a CSC Big Data Business)
1:40pm Entities, Relationships, and Semantics: the State of Structured Search Daniel Tunkelang (LinkedIn), Andrew Hogue (Foursquare), Breck Baldwin (Alias-i), Evan Sandhaus (New York Times), Wlodek Zadrozny (IBM)
2:30pm Big (Bad) Data Elizabeth Charnock (Cataphora)
5:00pm Big Data and Big Analytics: SciDB is not Hadoop Paul Brown (Paradigm4 Inc.)
Murray Hill Suite A
11:30am Humble pie: helping the Guardian chart big stories through small details Alastair Dant (Guardian News and Media)
1:40pm How to Avoid Some Common Graphical Mistakes Naomi Robbins (NBR)
5:00pm 1M. 10M. 100M. Data! Monica Rogati (Jawbone)
Murray Hill Suite B
10:40am Apache Cassandra™ 1.0: Ready for the Enterprise. Jonathan Ellis (DataStax)
2:30pm Gaining Adoption Through Data Visualization Lee Feinberg (DecisionViz)
5:00pm Telecom Network Switches: Big Value from Big Data Jim Falgout (Pervasive Software)
10:15am Break sponsored by Media-Science
Room: Gramercy Suite
3:10pm Break sponsored by Informatica
Room: Gramercy Suite
5:40pm Plenary
Room: Gramercy Suite
Attendee Reception
12:10pm Lunch sponsored by EMC Greenplum
Room: Rhinelander Gallery
Thursday Lunchtime BoF Sessions
8:45am Plenary
Room: Sutton Parlors
Welcome Edd Dumbill (Silicon Valley Data Science), Alistair Croll (Solve For Interesting)
9:00am Plenary
Room: Sutton Parlors
Data-Driven Innovation: How Open Government is Transforming New York City Rachel Sterne (City of New York)
9:15am Plenary
Room: Sutton Parlors
A Profusion of Exoplanets: NASA's Kepler Mission Jon Jenkins (NASA)
9:30am Plenary
Room: Sutton Parlors
Best of the Best: Announcing Winners of Strata/Tableau Data Visualization Contest Elissa Fink (Tableau Software)
9:35am Plenary
Room: Sutton Parlors
9/11 and The Weight of Data Jer Thorp (The New York Times)
9:50am Plenary
Room: Sutton Parlors
Simplifying Big Analytics for the Business Randy Lea (Teradata Corporation)
10:00am Plenary
Room: Sutton Parlors
What is a Career in Big Data? John Rauser (Pinterest)
6:40pm Plenary
Room: Sutton Parlor Foyer
Strata Mini Maker Faire
10:40am-11:20am (40m) In Practice
Data Challenges in Astronomy: NASA's Kepler Mission and the Search for Extrasolar Earths
Jon Jenkins (NASA)
The Kepler spacecraft launched on March 7, 2009, initiating NASA's first search for Earth-size planets orbiting Sun-like stars, with stunning results after being on the job for just over two years. Designing and building the Kepler science pipeline software that processes and analyzes the resulting data to make the discoveries presented a daunting set of challenges.
11:30am-12:10pm (40m) Business
The Human Dimension: Organizational and Social Challenges of Business Analytics
John Lucker (Deloitte)
Analytics projects are often bedeviled – or simply stopped in their tracks – by challenges emanating from organizational culture, misunderstanding of statistical concepts, and discomfort with probabilistic reasoning. This talk will provide a number of case studies and offer practical tips for achieving organizational buy-in.
1:40pm-2:20pm (40m) Business
Marketing with Data
Joseph Adler (Interana, Inc.)
One of the best ways to use data is for marketing purposes: to help deliver the most relevant products to your customers. This talk will describe LinkedIn's approach to marketing: the philosophy behind our marketing efforts, the way that we use data, and the technical systems that we use to do personalized marketing at scale.
2:30pm-3:10pm (40m) Business
Data prediction competitions: What Archimedes and Roger Bannister can teach us about the business of data
Jeremy Howard (Kaggle)
'Crowdsourcing big data' might sound like a randomly generated selection of buzz words, but it turns out to represent a powerful leap forward in the accuracy of predictive analytics. This session will explore the reasons why this is the case, using case studies from the fields of astronomy, sports ratings systems and tourism forecasting.
4:10pm-4:50pm (40m) Business
Do it Right – Proven Techniques for Exploiting Big Data Analytics
Bill Schmarzo (EMC Consulting)
“Big data” provides the opportunity to combine new, rich data sources in novel ways to discover business insights. How do you use analytics to exploit this data so that it will yield real business value? Learn a proven technique that ensures you identify where and how big data analytics can be successfully deployed within your organization. Case study examples will demonstrate its use.
5:00pm-5:40pm (40m) Business
When Elephants Mate—Will Hadoop Transform Banking?
Abhishek Mehta (Tresata)
How the core principles of Hadoop address the core problems in banking … and will lead to a transformation of an industry in need of one.
10:40am-11:20am (40m) Data
Dedupe, Merge, and Purge: the Art of Normalization
Tyler Bell (Factual) et al
Factual creates canonical reference sets of 40 million entities from over 2.5 billion fragmentary inputs. This talk explains the Hadoop-based science of our approach combined with what we believe to be a necessary art -- the application of domain-specific knowledge -- in creating pragmatic data services.
11:30am-12:10pm (40m) Data
Bringing the Rest of the World Into Your Data Warehouse
Philip (Flip) Kromer (Infochimps, a CSC Big Data Business)
You’ve collected a ton of data and your team is busily crunching numbers and coming to conclusions... but are they the right ones? You can only know with the right context and you can’t get context working in a silo. We invite you to bring the rest of the world into your data warehouse. Don’t worry, it’ll add more value than it takes and instead of working on the data, you can work on your vision.
1:40pm-2:20pm (40m) Data
Entities, Relationships, and Semantics: the State of Structured Search
Daniel Tunkelang (LinkedIn) et al
Structured search improves the search experience by identifying entities and their relationships in documents and queries. This panel will explore the current state of structured and semi-structured search, as well as the open problems in an area that promises to revolutionize information seeking.
2:30pm-3:10pm (40m) Data
Big (Bad) Data
Elizabeth Charnock (Cataphora)
Experts say that there is no such thing as clean data, yet every day critical decisions are being made based on electronic data. Elizabeth Charnock, author of E-Habits, will discuss how to make decisions based on digital character, and not on individual bits or bytes.
4:10pm-4:50pm (40m) Data
Agile Clouds for Big Data: Empowering the Data Scientist
Richard McDougall (VMware)
In this session we'll discuss strategies for building agile big-data clouds that make it much faster and easier for data scientists to discover, provision and analyze data. We'll discuss where and how new technologies (both vendor and OSS) fit into this model.
5:00pm-5:40pm (40m) Data
Big Data and Big Analytics: SciDB is not Hadoop
Paul Brown (Paradigm4 Inc.)
The science and commercial worlds share requirements for a high performance informatics platform to support collection, curation, collaboration, exploration, and analysis of massive datasets. SciDB is an open source analytical database that provides better analytical performance than relational databases and supports key features such as provenance and versioning.
10:40am-11:20am (40m) Interface
Data Visualization - where normal people fall in love with data
Hjalmar Gislason (DataMarket)
Statistics, math and data analysis would easily make most people's "Top 10 Most Boring Topics" list. But an effective data visualization can bring new insights, raise awareness and tell a great story. We want to share our insights from efforts to enable data visualizations on top of massive amounts of data, explore good - and bad - examples and share some of the tools and techniques we use.
11:30am-12:10pm (40m) Interface
Humble pie: helping the Guardian chart big stories through small details
Alastair Dant (Guardian News and Media)
Widespread reaction to the recent phone hacking story prompted the Guardian to capture and visualize Twitter traffic during key events. Find out how we produced interactive interfaces that enable readers to make sense of over 1.5 million tweets in a few minutes.
1:40pm-2:20pm (40m) Interface
How to Avoid Some Common Graphical Mistakes
Naomi Robbins (NBR)
Readers and preparers of graphs: Learn to recognize and avoid some common graphical mistakes to understand your data better and make better decisions from data.
2:30pm-3:10pm (40m) Interface
The Charts You Want Might Not Be the Charts You Need
Irene Ros (Bocoup)
This talk will introduce the concept of "responsible data visualization" in the context of two distinct uses: exploration and narrative. Using personal and industry examples to show best and worst practices in each approach, this talk will offer practical suggestions for bringing data visualization into one's data workflow.
4:10pm-4:50pm (40m) Data
Data Science from the Perspective of an Applied Economist
Scott Nicholson (Poynt)
Economists utilize a data analysis toolkit and intuition that can be very helpful to Data Scientists. In particular, econometric methods are quite useful in disentangling correlation and causation, a use case not well-handled by standard machine learning and statistical techniques. This session will cover examples of econometric methods in action, as well as other economics-related insights.
5:00pm-5:40pm (40m) Data
1M. 10M. 100M. Data!
Monica Rogati (Jawbone)
How do data infrastructure, insights and products change when your user base grows by orders of magnitude?
10:40am-11:20am (40m) Sponsored Sessions
Apache Cassandra™ 1.0: Ready for the Enterprise.
Jonathan Ellis (DataStax)
The Apache Cassandra database has added many new enterprise features this year based on the real-world needs of companies like Twitter, Netflix, Openwave, and others building massively scalable systems. This talk will focus on the shift to real-time, data-driven applications, what that shift means, and why Cassandra is ideal for today’s enterprise data applications.
11:30am-12:10pm (40m) Sponsored Sessions
MapReduce for the Rest of Us: Unlocking Data Science for the Business User
Tasso Argyros (Teradata Aster)
This session will explore a new class of analytic platforms and technologies such as SQL-MapReduce® which bring the science of data to the art of business.
1:40pm-2:20pm (40m) Sponsored Sessions
How Thomson Reuters Finds a Needle in Many Haystacks within Seconds
Steve Jackson (Thomson Reuters)
Learn how Thomson Reuters manages and processes a variety of very large and diverse data sources to quickly publish timely, trusted, and relevant information to their clients.
2:30pm-3:10pm (40m) Sponsored Sessions
Gaining Adoption Through Data Visualization
Lee Feinberg (DecisionViz)
4:10pm-4:50pm (40m) Sponsored Sessions
How Hadoop is Revolutionizing Business Intelligence and Advanced Data Analytics
Amr Awadallah (Cloudera, Inc.)
Dr. Amr Awadallah, CTO at Cloudera, illustrates how Apache Hadoop is changing the business intelligence data stack, and how the evolving architecture delivers advanced capabilities for solving key business challenges. By enabling the complete value to be derived from both unstructured and structured data, organizations are able to ask and get answers to previously un-addressable big questions.
5:00pm-5:40pm (40m) Sponsored Sessions
Telecom Network Switches: Big Value from Big Data
Jim Falgout (Pervasive Software)
Telecom network switches, network servers and other equipment generate and store large amounts of data every day. The data is mainly used for billing and network operations. If utilized fully, this data can have an enormous impact on network operations and overall profitability.
10:15am-10:40am (25m)
Break sponsored by Media-Science
3:10pm-4:10pm (1h)
Break sponsored by Informatica
5:40pm-6:40pm (1h)
Attendee Reception
Join us immediately following the sessions at Strata Conference. Have a drink or two, network with other Conference attendees, and visit our Sponsors who are innovating in the data space.
12:10pm-1:40pm (1h 30m)
Thursday Lunchtime BoF Sessions
Birds of a Feather (BoF) sessions provide face-to-face exposure to those interested in the same projects and concepts. BoFs can be organized for individual projects or broader topics (best practices, open data, standards). BoF topics are entirely up to you. Sign up on site to lead a conversation during lunch on Thursday, September 22.
8:45am-9:00am (15m) Keynote
Welcome
Edd Dumbill (Silicon Valley Data Science) et al
Opening remarks by the Strata program chairs, Edd Dumbill and Alistair Croll.
9:00am-9:15am (15m) Keynote
Data-Driven Innovation: How Open Government is Transforming New York City
Rachel Sterne (City of New York)
From hackathons to API-enabled civic data, learn how New York City government is evolving thanks to deeper engagement with the technology community.
9:15am-9:30am (15m) Keynote
A Profusion of Exoplanets: NASA's Kepler Mission
Jon Jenkins (NASA)
The Kepler Mission began its science observations just over two years ago on May 12, 2009, initiating NASA’s first search for Earth-like planets. Initial results and light curves from Kepler are simply breathtaking, including confirmation of the first unquestionable rocky planet, Kepler-10b, and Kepler-11, a system of 6 transiting planets orbiting one Sun-like star.
9:30am-9:35am (5m) Keynote
Best of the Best: Announcing Winners of Strata/Tableau Data Visualization Contest
Elissa Fink (Tableau Software)
Sometimes the hardest part about making a viz is knowing where to start. Check out the winning vizzes from the Strata/Tableau Data Visualization Contest and get inspired to create your own beautiful visualizations.
9:35am-9:50am (15m) Keynote
9/11 and The Weight of Data
Jer Thorp (The New York Times)
In this presentation, Jer Thorp will discuss his work with names--designing an arrangement algorithm for the 9/11 Memorial in Manhattan. He’ll walk through collaborative processes, admit to a series of failures and ultimately show how humans and software can combine to solve extraordinary problems.
9:50am-10:00am (10m) Keynote
Simplifying Big Analytics for the Business
Randy Lea (Teradata Corporation)
This session will show you how you can bring the science of data to the art of business and empower more business users and analysts to operationalize insights and drive results.
10:00am-10:15am (15m) Keynote
What is a Career in Big Data?
John Rauser (Pinterest)
Quantitative Engineer? Business Intelligence Analyst? Data Scientist? The data deluge has come upon us so quickly that we don't even know what to call ourselves, much less how to make a career of working with data. This talk examines the critical traits that lead to success by looking back to what may be the first act of data science.
6:40pm-8:40pm (2h)
Strata Mini Maker Faire
Join us after the Attendee Reception on Thursday, September 22 for the Strata Mini Maker Faire - a gallery of interactive, sensor-laden projects and robots.

Sponsors

  • Aster Data
  • EMC Greenplum
  • GE
  • Lexis Nexis
  • MarkLogic
  • Tableau Software
  • Cloudera
  • DataStax
  • Informatica
  • DataSift
  • Splunk
  • Amazon Web Services
  • Datameer
  • Impetus
  • Karmasphere
  • MapR Technologies
  • Pervasive
  • Platform Computing
  • Revolution Analytics
  • Sybase
  • Xeround
  • Media-Science
  • Platfora

Sponsorship Opportunities

For information on sponsorship opportunities at the conference, contact Susan Stewart at sstewart@oreilly.com

Press & Media

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com
