Realtime Analytics at Twitter

Kevin Weil (Twitter, Inc.)
Practitioner
Location: Mission City M
Average rating: 3.82 (17 ratings)

Most analytics systems rely on large offline computations, which means results arrive hours or days late. Twitter is a realtime system, and it’s critical that our analytics be realtime as well. But with over 160 million users producing over 90 million tweets per day, we needed infrastructure that scaled horizontally in addition to being realtime. In this talk we’ll discuss the high-volume realtime analytics system we have built, as well as the benefits and challenges of this model compared with standard offline models. We’ll look at some products that we are building on top of this infrastructure, and discuss where we’re hoping to take this system in the future.

Kevin Weil

Twitter, Inc.

Kevin Weil specializes in the technology behind distributed systems, parallel processing, and analytics, especially in the context of large datasets. He currently leads the analytics team at Twitter, using Hadoop and other big data analytics tools to crunch Twitter’s massive dataset and apply what the team learns to improve the product. Prior to Twitter, he was the first employee at next-generation web media startup Cooliris, backed and incubated by Kleiner Perkins. At Cooliris, he worked on analytics for user growth and advertising on a server cluster running Hadoop, Pig, and Hive, open-source implementations of the Google technology stack that are central to companies like Facebook, Yahoo, and A9. Mr. Weil has also worked at municipal wireless network provider Tropos Networks, where he optimized the performance of citywide wireless mesh networks, as well as at Microsoft Research and at SLAC.

Comments

Lyndsay Noble
02/04/2011 7:58am PST

extremely useful information that was well presented, thanks!

Deon Griessel
02/03/2011 2:22pm PST

Not enough seating in room. Poor AV. Good content.
