Strata in London 2012 Schedule

Below are the confirmed and scheduled talks at Strata in London 2012 (schedule subject to change).

Customize Your Own Schedule

Create your own Strata schedule using the personal scheduler function. Mark the sessions, keynotes, and events you want to attend by selecting the calendar icon next to each listing, then go to your personal schedule to generate your customized schedule.

Room 1-6
14:00 Big Data for the Masses: How We Opened Up the Doors to Google’s Dremel Ryan Boyd (Google), Siddartha Naidu (Google)
16:00 Establishing Cause and Effect from Data Jason McFall (Causata)
17:00 Should We Care About Content? Recommending by Proxy with Big Metadata Benjamin Fields (Musicmetric (Semetric Ltd.))
Thames Suite
13:30 From Alpha to a Data-Driven Product Ben Smith (Top10)
14:00 Making Big Data Small Noel Welsh (Underscore Consulting)
14:30 Clojure: Full Stack Data Science Edmund Jackson (Cambridge Data Science)
Buckingham Room
13:30 Emoto - Visualizing the Global Audience Response to London 2012 Moritz Stefaner (http://moritz.stefaner.eu), Gerrit Kaiser, Drew Hemment (FuturEverything)
14:00 Sensing the City: Mapping and Analysing London’s Population Data Flows James Cheshire (UCL Centre for Advanced Spatial Analysis)
14:30 Big (Sequence) Data in Pre-competitive Pharmaceutical R&D. William Spooner (Eagle Genomics)
16:30 The Craft of Designing Smart Data Applications Lorenz Matzat (Lokaler)
18:30 Plenary
Room: Buckingham Room
BigData London Meetup (Community Event)
King's Suite
9:00 Plenary
Room: King's Suite
Monday Welcome Edd Dumbill (Silicon Valley Data Science), Kaitlin Thaney (Mozilla Science Lab)
9:10 Plenary
Room: King's Suite
Keynote by Liam Maxwell Liam Maxwell (Cabinet Office of the UK Government)
9:25 Plenary
Room: King's Suite
Open Data: Dreams to Reality Jeni Tennison (Open Data Institute)
9:35 Plenary
Room: King's Suite
Good Data, Good Values Jake Porway (DataKind)
9:50 Plenary
Room: King's Suite
Big Data, Big Deal? Mikael Bisgaard-Bohr (Teradata Corporation)
10:05 Plenary
Room: King's Suite
The First 5 Kilobytes are the Hardest George Dyson
11:00 Plenary
Room: King's Suite
Think Like a Data Journalist: How the Guardian Makes Data Useful Kathryn Hurley (Google), Simon Rogers (Guardian)
11:20 Plenary
Room: King's Suite
The Dirty Truth about Data Literacy Kim Rees (Periscopic)
11:35 Plenary
Room: King's Suite
Information Overload Through the Ages - What Can We Learn? Mark Madsen (Third Nature)
11:55 Plenary
Room: King's Suite
The Manifest Destiny of Big Data Kenneth Cukier (The Economist)
14:00 Big Data for the TV Industry: Tune into the Backchannel Sébastien Lefebvre (Mesagraph), Guy Hugot-Derville (Mesagraph)
14:30 Situation Normal Everything Must Change Simon Wardley (Leading Edge Forum (CSC))
16:00 Health & Wealth - The Potential and Challenge of Healthcare Data Louise Marston (Nesta), Laura Bunt (Nesta)
16:30 The New Genomics Matt Wood (Amazon Web Services)
17:00 Self-Hacking: Self-Knowledge & Data Literacy Adriana Lukas (London QS group)
10:30 Morning Break - Sponsored by Hortonworks
Room: Monarch Suite
12:15 Lunch sponsored by Teradata
Room: Monarch Suite
Monday Lunchtime BoF Tables
14:55 Afternoon Break / Startup Showcase / Hands-on Internet of Things
Room: Monarch Suite
17:30 Attendee Reception / Startup Showcase / Hands-on Internet of Things
Room: Monarch Suite
Blenheim Room (Sponsored)
14:30 Big Data is the New Oil? Oil is the Old Big Data… Duncan Irving (Teradata Corporation)
16:00 Getting Real Time Value from Your Data Eddie Satterly (Splunk)
16:30 Telling Great Stories with Data Online Andy Cotgreave (Tableau)
17:00 Big Data Analytics Providing Market Insight Brendan Moran (EMC UK&I)
13:30-13:55 (25m) Data Science
Big Data, Big Changes: Data-Driven Product Development at Etsy
Jason Davis (Etsy)
Late last summer, Etsy made a seemingly innocuous change to its search engine that had far-reaching impact. The change was coordinated with three major data-driven product launches, from search to advertising to analytics. Big data can cause big changes, and this talk focuses on big data from an end-to-end product view, ranging from the underlying technology to understanding longer-term impacts.
14:00-14:25 (25m) Data Science
Big Data for the Masses: How We Opened Up the Doors to Google’s Dremel
Ryan Boyd (Google) et al
Google’s Dremel is a scalable, interactive ad-hoc query system capable of running SQL-like queries over trillion-row tables in seconds. BigQuery is the externalization of this technology as a REST API and web app. This session will discuss the capabilities of Dremel and dive into the design challenges necessary to make this technology accessible and performant for developers and business users.
14:30-14:55 (25m) Data Science
Back to Square One: Building a Data Science Team from Scratch
Klaas Bosteels (Massive Media)
When I left Last.fm to join Massive Media, I basically moved from a data science forerunner to a newcomer. I had to evaluate everything I learned and start over completely with a clean slate, which resulted in a pretty clear perspective on how to find good data scientists, what they should be doing, what tools they should be using, and how to organize them to work together efficiently as a team.
16:00-16:25 (25m) Data Science
Establishing Cause and Effect from Data
Jason McFall (Causata)
Establishing cause and effect from observational data is extremely difficult. However, by introducing randomization, or better still, controlled experiments, it becomes possible to establish true causality. This talk will survey the difficulties and pitfalls of establishing cause and effect from observed data, and explain ways to introduce experimentation.
16:30-16:55 (25m) Data Science
Discovery-driven Design in Social Games: Techniques, Processes, and Problems
Heather Stark (Kinran Limited)
Social games are the poster children of metrics-driven design. The way that analytics is used to optimise design for games has lessons which are transferable to other domains. But even poster children have problems. We look at the landscape of analytical tools designed to support game design refinement, identify the main pitfalls involved in practice, and suggest workarounds.
17:00-17:25 (25m) Data Science
Should We Care About Content? Recommending by Proxy with Big Metadata
Benjamin Fields (Musicmetric (Semetric Ltd.))
When constructing a music recommender system, which is more important: a musicological understanding of the catalog of music in a system or the number of times two particular songs were played one after the other and were ‘liked’? Even better, if a system knows the latter, does the former even matter? Do machines that predict behavior need to learn to listen? Or is observing behavior enough?
13:30-13:55 (25m) Nerdcore
From Alpha to a Data-Driven Product
Ben Smith (Top10)
A practical step-by-step description of how the LAMP-based Top10 Alpha was turned into a fully data-driven product. Based around a real-time data processing pipeline and asynchronous stack, Top10's infrastructure now hinges on Akka, along with Scala, Node.js and a host of other technologies. This has enabled interesting uses of the data and new, exciting user-facing features.
14:00-14:25 (25m) Nerdcore
Making Big Data Small
Noel Welsh (Underscore Consulting)
Big data often doesn't sit well with companies that want to move fast. Technologies like Hadoop can be expensive to set up, slow to produce results, and time-consuming to maintain. Streaming algorithms provide an alternative. They are simple to implement, very efficient, and give real-time results. In this talk I will describe several key streaming algorithms, and give examples of their use.
14:30-14:55 (25m) Nerdcore
Clojure: Full Stack Data Science
Edmund Jackson (Cambridge Data Science)
Data Science projects are difficult to realise as they require both mathematical and IT abstractions at once. We need databases, linear algebra, message queues... all at once. Traditional environments like Java/C#/Matlab/Mathematica provide only one. I will talk about how the new language, Clojure, provides all the platform power of the JVM, as well as the language and libraries to do data science.
16:00-16:25 (25m) Nerdcore
Introduction to Cascalog: Logic Programming for Hadoop
Stefan Hübner (Nokia)
Logic programming recently gained new interest with people processing large data volumes with Hadoop. This talk demonstrates the basic concepts by using Cascalog.
16:30-16:55 (25m) Nerdcore
Mapreduce and Hadoop Algorithms in Academic Papers - An Overview
Amund Tveit (Atbrox)
This presentation will give an overview of mapreduce-based algorithms described in recent papers written by academic and industrial researchers. Included areas: AI/Machine Learning, Bioinformatics, Information Retrieval. Focus will be on patterns of problems and the corresponding mapreduce solution patterns. Some background material: http://mapreducepatterns.org
17:00-17:25 (25m) Nerdcore
Handling RDF data with tools from the Hadoop ecosystem
Paolo Castagna (Cloudera)
As open data and linked data communities grow, so do the number and average size of freely available datasets. Often these datasets are modelled and interlinked using RDF. This talk shares tips and tricks, use cases and practical examples of how to effectively use tools from the Hadoop ecosystem to process large RDF datasets.
13:30-13:55 (25m) Visualization & Interface
Emoto - Visualizing the Global Audience Response to London 2012
Moritz Stefaner (http://moritz.stefaner.eu) et al
emoto is a unique data art project that sets out to visualise the worldwide emotional response to the Olympic Games 2012. We track social media sites for emotional status messages related to the Games and visualise the crowd's response in real-time and in an aggregate data sculpture. This talk will show behind the scenes material on how to master real-time large scale social media mining.
14:00-14:25 (25m) Visualization & Interface
Sensing the City: Mapping and Analysing London’s Population Data Flows
James Cheshire (UCL Centre for Advanced Spatial Analysis)
Overview of cutting edge research from UCL's Centre for Advanced Spatial Analysis.
14:30-14:55 (25m) Business & Industry
Big (Sequence) Data in Pre-competitive Pharmaceutical R&D.
William Spooner (Eagle Genomics)
Does pre-competitive collaboration ease the pain of adopting disruptive big-data technologies? This question is tackled using the example of management/analysis of large genomic sequence data sets, and their role in the development of personalised medicine.
16:00-16:25 (25m) Visualization & Interface
How to Avoid Some Common Graphical Mistakes
Naomi Robbins (NBR)
Readers and preparers of graphs: Learn to recognize and avoid some common graphical mistakes to understand your data better and make better decisions from data.
16:30-16:55 (25m) Visualization & Interface
The Craft of Designing Smart Data Applications
Lorenz Matzat (Lokaler)
Enabling data as an asset and matching it to the needs of customers and/or citizens is a problem. There is a need for carefully designed feature sets and conclusive usability in data applications. The talk is about success and failure in producing data applications for newspapers and building an Open Data-friendly geo-data-startup.
17:00-17:25 (25m) Visualization & Interface
Data Experience Design - New design methods for data visualisation
Max Gadney (After The Flood)
What are the new tools to help us make useful products and media from Big Data? How do we re-tool the methods from traditional User Experience Design and create a new discipline of Data Experience Design? How do we involve business, tech and user needs to create novel and useful experiences?
18:30-21:30 (3h)
BigData London Meetup (Community Event)
A special edition of the Big Data London Meetup will take place at the Strata Conference venue on the evening of day 1, promising a great crowd and amazing talks.
9:00-9:10 (10m)
Monday Welcome
Edd Dumbill (Silicon Valley Data Science) et al
Program Chairs Edd Dumbill and Kaitlin Thaney welcome you to Strata in London.
9:10-9:25 (15m)
Keynote by Liam Maxwell
Liam Maxwell (Cabinet Office of the UK Government)
Liam Maxwell, Executive Director of the IT Reform Group in the Cabinet Office
9:25-9:35 (10m)
Open Data: Dreams to Reality
Jeni Tennison (Open Data Institute)
Jeni Tennison, Technical Director of the newly formed Open Data Institute, will describe the ODI’s twin aims of helping data owners achieve their organisational objectives through publishing open data, and helping those who reuse that data to add value responsibly and effectively, thereby turning open data dreams into reality.
9:35-9:50 (15m)
Good Data, Good Values
Jake Porway (DataKind)
For all of our machine learning algorithms and big data tools, so many of the problems we solve day-to-day are decidedly "first world": figuring out how to get the biggest ROI on ad dollars or crafting personalized movie recommendations. Can we use our skills as data scientists to solve social problems as well, helping people find clean water as easily as they can find good restaurants?
9:50-10:05 (15m) Sponsored
Big Data, Big Deal?
Mikael Bisgaard-Bohr (Teradata Corporation)
What will Big Data mean to us as users, consumers and organisations? And will it really be a big deal? In this presentation Mikael Bisgaard-Bohr will provide a fascinating view into where the Big Data wave is taking us, and why it is about so much more than just data.
10:05-10:30 (25m)
The First 5 Kilobytes are the Hardest
George Dyson
Mapping real-world correspondence to data structures populating a storage matrix currently expanding by some 5 trillion bits per second is the challenge that brings us here.
11:00-11:20 (20m)
Think Like a Data Journalist: How the Guardian Makes Data Useful
Kathryn Hurley (Google) et al
Data provides critical insight into the way government works. When the UK government published every item of spending over £25,000, the data was hard to parse. The UK Guardian Datablog cleaned it up and asked readers to help pore through the numbers, making everyone a data journalist. We’ll cover the technologies the Guardian uses to analyze, visualize, and share data with the world.
11:20-11:35 (15m)
The Dirty Truth about Data Literacy
Kim Rees (Periscopic)
Nobody knows statistics. They are as esoteric as chemical compounds are to chemistry. Yet data visualizations often incorporate a logarithmic scale, density traces, or seasonally adjusted numbers among other things. If this is the data deluge, we're bound to find everyone swept downstream. How do we prepare the average data consumer?
11:35-11:55 (20m)
Information Overload Through the Ages - What Can We Learn?
Mark Madsen (Third Nature)
The real challenge ahead of us is not accumulating more information, or processing more information, or analytics, or replacing relational databases, or scaling data (i.e. not the 3 Vs). The real challenge is solving the information glut problem.
11:55-12:10 (15m)
The Manifest Destiny of Big Data
Kenneth Cukier (The Economist)
Everyone uses the term big data but no one can agree on what it means or even if it's novel. However, the label is useful to describe the radically new ways that the world interacts with information, for which the public, policymakers and even data geeks are unprepared.
13:30-13:55 (25m) Business & Industry
Using Data in Retail Supply Chains – Case Studies from Tesco
Tom Hebbert (Tesco)
Tesco is best known for using data in customer segmentation with Clubcard. But it's also important in optimising a supply chain which moves 32,000,000 cases of food each week. Tom will talk about real-life applications of data science – from managing the impact of weather on sales to optimising truck loads. He will also share experiences of translating data science into real business change.
14:00-14:25 (25m) Business & Industry
Big Data for the TV Industry: Tune into the Backchannel
Sébastien Lefebvre (Mesagraph) et al
In the TV industry, audience is king. Until now, viewers were hidden behind their TV set. Today, Twitter is rapidly changing the game. People share their emotions publicly, in real-time. Capturing these conversations allows advertisers, marketers and TV Channels to discover their audience at a granular level never seen before.
14:30-14:55 (25m) Business & Industry
Situation Normal Everything Must Change
Simon Wardley (Leading Edge Forum (CSC))
This session explores these concepts by first laying out the fundamentals of change and how all industry evolves through a commonly recurring pattern. Using this, we will examine why one size never fits all in management, the explosion of change from big data and cloud computing, and why new forms of organisation are emerging that differ from traditional companies.
16:00-16:25 (25m) Data That Matters
Health & Wealth - The Potential and Challenge of Healthcare Data
Louise Marston (Nesta) et al
Healthcare could be transformed by data. It is the subject of intense debate by clinicians, regulators, and politicians; too often patients are left out. Louise Marston and Laura Bunt from Nesta, the UK innovation agency, will discuss how the challenges in healthcare data exemplify the problems of personal data, and describe a vision of patients using data for their healthcare decisions in future.
16:30-16:55 (25m) Data That Matters
The New Genomics
Matt Wood (Amazon Web Services)
With the arrival of low-cost DNA sequencing, genomics is moving ever closer to clinical practice. This presentation discusses how bioinformaticians are meeting the challenges of harnessing the value of an ever-growing collection of valuable genomic data.
17:00-17:25 (25m) Data That Matters
Self-Hacking: Self-Knowledge & Data Literacy
Adriana Lukas (London QS group)
Quantifying one's self, a growing trend, is about self-awareness, pattern spotting & behaviour change. What is missing is "data literacy", i.e. data expertise at individual level, not just for businesses and institutions. Uncovering hidden cause and effect in one's behaviour increases an individual's autonomy, and for that we need to have access to analytical tools and raw data. How & where to get them?
10:30-11:00 (30m)
Break: Morning Break - Sponsored by Hortonworks
12:15-13:30 (1h 15m)
Monday Lunchtime BoF Tables
Birds of a Feather (BoF) sessions are informal roundtable discussions happening during lunch on both days of the conference. You can join any BoF table or start your own with a topic of your choice. The BoF sign-up board will be near the Registration area.
14:55-16:00 (1h 5m)
Afternoon Break / Startup Showcase / Hands-on Internet of Things
Startup Showcase and Hands-on Internet of Things both kick off during the afternoon break on Monday, and continue again during the Attendee Reception--all held at the Sponsor Pavilion.
17:30-18:30 (1h)
Attendee Reception / Startup Showcase / Hands-on Internet of Things
Grab a drink, mingle with fellow attendees, and see the latest in big data technologies and products from leading companies at the Attendee Reception - happening Monday evening immediately following afternoon sessions. We'll also continue hosting Startup Showcase and Hands-on Internet of Things during the reception.
13:30-13:55 (25m) Sponsored
Combining the Power of Hadoop MapReduce with Object-based Dispersed Storage
Russ Kennedy (Cleversafe)
Massive analytics has emerged as an offshoot of Big Data with tremendous upside potential for businesses that can figure out how to manage that data. CIOs must reduce the TCO to support massive data computation while enhancing analysis workflows. This session explores new capabilities for combined storage and computation with Hadoop MapReduce to solve today’s Big Data challenges.
14:00-14:25 (25m) Sponsored
Powering Next-Generation Data Architectures with Apache Hadoop
Shaun Connolly (Hortonworks)
In this talk Shaun Connolly, VP Corporate Strategy for Hortonworks, will look at Hadoop's opportunity and the value it can unlock. Along the way he will discuss the kind of efforts required from the community, the solution ecosystem, and the enterprise in order to solidify Hadoop's place within the enterprise.
14:30-14:55 (25m) Sponsored
Big Data is the New Oil? Oil is the Old Big Data…
Duncan Irving (Teradata Corporation)
Oil exploration provides insight into the world of big data: huge data volumes have driven production for decades, and subsurface machine sensor data is being assimilated at ever-increasing rates. This example shows how a big data analytical ecosystem integrates relational decision support with the wild world of big data – here including seismic imaging and reservoir modeling – for exploitation.
16:00-16:25 (25m) Sponsored
Getting Real Time Value from Your Data
Eddie Satterly (Splunk)
With all of the data that is now available, how do you put it at the fingertips of the analysts and derive real business value? In this session we will look at a blueprint for how to do just that and change the way your business works and thinks.
16:30-16:55 (25m) Business & Industry, Visualization & Interface
Telling Great Stories with Data Online
Andy Cotgreave (Tableau)
This session will focus on going from a good visualization to a great visualization by focusing on organization, user interface, and formatting.
17:00-17:25 (25m) Sponsored
Big Data Analytics Providing Market Insight
Brendan Moran (EMC UK&I)
This session is a must for Innovators, Business Sponsors and Data Scientists where you will hear about Greenplum’s Unified Analytics Platform – the only analytics focused collaboration environment in the world today. Learn how you can bring the best of your ideas to market sooner and truly transform your business.

Sponsorship Opportunities

For information on exhibition and sponsorship opportunities, contact Susan Stewart at sstewart@oreilly.com or +1 (707) 827-7148

Media Partner Opportunities

For information on trade opportunities contact Kathy Yu at mediapartners@oreilly.com

Press and Media

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com
