Schedule: Data Science sessions

Location: Room 1-6
Jason Davis (Etsy)
Average rating: ***..
(3.67, 3 ratings)
Late last summer, Etsy made a seemingly innocuous change to its search engine that had far-reaching impact. The change was coordinated with three major data-driven product launches, from search to advertising to analytics. Big data can cause big changes, and this talk focuses on big data from an end-to-end product view, ranging from the underlying technology to understanding longer-term impacts.
Location: Room 1-6
Ryan Boyd (Google), Siddartha Naidu (Google)
Average rating: ***..
(3.25, 4 ratings)
Google’s Dremel is a scalable, interactive ad-hoc query system capable of running SQL-like queries over trillion-row tables in seconds. BigQuery is the externalization of this technology as a REST API and web app. This session will discuss the capabilities of Dremel and dive into the design challenges necessary to make this technology accessible and performant for developers and business users.
Location: Room 1-6
Klaas Bosteels (Massive Media)
Average rating: ****.
(4.00, 12 ratings)
When I left Last.fm to join Massive Media, I basically moved from a data science forerunner to a newcomer. I had to evaluate everything I learned and start over completely with a clean slate, which resulted in a pretty clear perspective on how to find good data scientists, what they should be doing, what tools they should be using, and how to organize them to work together efficiently as a team.
Location: Room 1-6
Jason McFall (Causata)
Average rating: ****.
(4.00, 5 ratings)
Establishing cause and effect from observational data is extremely difficult. However, by introducing randomization, or better still, controlled experiments, it becomes possible to establish true causality. This talk will survey the difficulties and pitfalls of establishing cause and effect from observed data, and explain ways to introduce experimentation.
Location: Room 1-6
Heather Stark (Kinran Limited)
Average rating: ****.
(4.20, 5 ratings)
Social games are the poster children of metrics-driven design. The way that analytics is used to optimise design for games has lessons which are transferable to other domains. But even poster children have problems. We look at the landscape of analytical tools designed to support game design refinement, identify the main pitfalls involved in practice, and suggest workarounds.
Location: Room 1-6
Benjamin Fields (Goldsmiths University of London/Fun & Plausible Solutions)
Average rating: ***..
(3.25, 4 ratings)
When constructing a music recommender system, which is more important: a musicological understanding of the catalog of music in a system or the number of times two particular songs were played one after the other and were ‘liked’? Even better, if a system knows the latter, does the former even matter? Do machines that predict behavior need to learn to listen? Or is observing behavior enough?
Location: Room 1-6
Isabel Drost-Fromm (Apache Software Foundation/ Nokia Gate 5 GmbH), Hannes Kruppa (Nokia Maps)
Average rating: **...
(2.83, 6 ratings)
Failing at software projects is already easier than we'd love to admit. When dealing with big data, a topic hyped quite a bit, the chance of projects failing miserably is even higher. This talk highlights some of the most prominent anti-patterns when dealing with data analysis, scaling and data science.
Location: Room 1-6
Average rating: ***..
(3.00, 2 ratings)
A guide to how real-world companies have architected their big data ecosystems, incorporating Hadoop, NoSQL and data warehouse technologies.
Location: Room 1-6
Ted Dunning (MapR)
Average rating: ****.
(4.75, 4 ratings)
Nearest neighbor (k-nn for short) models are conceptually just about the simplest kind of behavioral model possible but are generally considered infeasible for production. This talk will describe the knn project and how it can reduce thousand-year computations to a few hours or make real-time use of k-nn models practical. Practical results will be shown and implementation methods described.
Location: Bleinheim Room (Sponsored)
Ian Robinson (Neo Technology)
Today's complex data is not only big, but also semi-structured and densely connected. In this session we'll look at how size, structure and connectedness have converged to transform the data landscape.
Location: Room 1-6
Alasdair Allan (Babilim Light Industries), Zena Wood (University of Exeter)
Average rating: ****.
(4.33, 3 ratings)
Observing how other humans interact is so interesting that we do it recreationally; we call it "people watching". Evolution has equipped us both with a desire to people watch, and with the tools we need to do it, but it's hard to describe what it is we're doing. If we could, we could make our machines people watch for us, potentially yielding novel insights into our own social interactions.
Location: Room 1-6
Thomas Levine (csv soundsystem)
Average rating: ***..
(3.00, 1 rating)
Masters at web scraping and data journalism from ScraperWiki tell tales and give practical advice from years of cleaning data. What are common gotchas when fixing up data before you make it do something, and how do you get round them? Illustrated with real examples from the world of journalism and business.
Location: Buckingham Room
Arfon Smith (University of Oxford)
Web-scale citizen science such as Zooniverse (www.zooniverse.org) has provided a temporary solution to the flood of data that confronts researchers of the 21st century; however, the solution is a short-term one. In this presentation I will outline a potential strategy for combining a large web community and significant compute resources to create a scalable, intelligent classification engine.
