Schedule: Sponsored Sessions

Location: Murray Hill Suite B
Jonathan Ellis (DataStax (formerly Riptano))

The Apache Cassandra database has added many new enterprise features this year based on the real-world needs of companies like Twitter, Netflix, Openwave, and others building massively scalable systems.

Apache Cassandra addresses a wide variety of real-time big data needs. Capable of tracking transactions in financial markets or the actions of millions of users in massively multiplayer games, Cassandra handles the demands of large-volume applications and data streams. Whether it’s storing billions of emails or backing up terabytes of files, Cassandra can store large amounts of data and scale near-infinitely. In today’s information age, Cassandra excels at storing and serving massive amounts of data at low latency, from geolocation data to server performance metrics and more.

This talk will cover the motivation and use cases behind features such as secondary indexes, Hadoop integration, SQL support, bulk loading, and more.

Introduction

  • The shift to real-time, data-driven applications and what that means
  • Why Cassandra is ideal for today’s enterprise data applications

Recap: Cassandra through 2010

  • Bulletproof reliability
  • Best-in-class support for multiple datacenters
  • High-performance storage engine based on Bigtable

New in Cassandra 1.0

  • Dynamic column indexes
  • Distributed counters for realtime analytics
  • CQL/SQL and JDBC support
  • Bulk loading
  • Off-heap allocation for GC performance
  • Hadoop support and Brisk

This session is sponsored by DataStax

Location: Murray Hill Suite B
Tasso Argyros (Teradata Aster)

MapReduce, Hadoop, and other “NoSQL” big data approaches open opportunities for data scientists in every industry to develop new data-driven applications for digital marketing optimization and social network analysis through the power of iterative, big data analysis. But what about the business user or analyst? How can they unlock insights through standard business intelligence (BI) tools or SQL access? The challenge with emerging big data technologies is finding staff with the specialized skill sets of the data scientist to implement and use these solutions. Business leaders and enterprise architects struggle to understand, implement, and integrate these big data technologies with their existing business processes and IT investments and provide value to the business.

This session will explore a new class of analytic platforms and technologies such as SQL-MapReduce® which bring the science of data to the art of business. By fusing standard business intelligence and analytics with next-generation data processing techniques such as MapReduce, big data analysis is no longer just in the hands of the few data science or MapReduce specialists in an organization! You’ll learn how business users can easily access, explore, and iterate their analysis of big data to unlock deeper insights. See example applications with digital marketing optimization, fraud detection and prevention, social network and relationship analysis, and more.

This session is sponsored by Aster Data

Location: Murray Hill Suite B
Steve Jackson (Thomson Reuters)

How do you efficiently and effectively search the world’s leading collection of legal content (2.2 billion documents), then quickly zero in on exactly what you need, all in a matter of seconds? Thomson Reuters Professional built a Big Data information management architecture to do just this for their clients. WestlawNext gives legal professionals comprehensive, specialized content plus unique search technologies and tools that help them find, understand and apply the law and legal concepts in the service of their clients. Learn how Thomson Reuters manages and processes a variety of very large and diverse data sources to quickly publish timely, trusted, and relevant information to their clients.

This session is sponsored by Informatica

Location: Murray Hill Suite B
Lee Feinberg (DecisionViz)

Sophisticated data analytics is a great thing. But great analytics is only valuable if people use it. The worst thing is a great analysis filled with answers sitting on the shelf going unused. In this session you will learn how to present and show analytics in highly compelling ways. You’ll learn how to use it as a cultural change-agent—and how you must shift to a “data marketing mindset” to make it all happen.

This session is sponsored by Tableau Software

Location: Murray Hill Suite B
Amr Awadallah (Cloudera, Inc.)

The introduction of Apache Hadoop is changing the business intelligence data stack. In this presentation, Dr. Amr Awadallah, chief technology officer at Cloudera, will discuss how the architecture is evolving and the advanced capabilities it lends to solving key business challenges. Awadallah will illustrate how enterprises can leverage Hadoop to derive complete value from both unstructured and structured data, gaining the ability to ask and get answers to previously un-addressable big questions. He will also explain how Hadoop and relational databases complement each other, enabling organizations to access the latent information in all their data under a variety of operational and economic constraints.

This session is sponsored by Cloudera

Location: Murray Hill Suite B
Jim Falgout (Pervasive Software)

Telecom network switches, network servers and other equipment generate and store large amounts of data every day. The data is mainly used for billing and network operations. If utilized fully, this data can have an enormous impact on network operations and overall profitability.

Many communications service providers (CSPs) do not have the tools to mine this data quickly and deeply enough to realize its value. The tools in use are usually homegrown and not scalable. Valuable information is being lost: information that could be used to predict network issues rather than respond to them after the fact. The alternative of a full analytic database can be cost-prohibitive.

By applying big data tools and predictive analytics upstream of the database, CSPs can move from reactive to proactive use of the data. Network quality problems can be identified in minutes rather than days. By analyzing all the data, analytics tools can pinpoint root cause and suggest corrective actions. Finding and fixing these issues more quickly leads to higher call quality, more profitable service and increased customer satisfaction.

This session is sponsored by Pervasive

Location: Murray Hill Suite B
Ron Avnur (MarkLogic), Mark Rodgers (LexisNexis)

Ron Avnur, SVP Engineering, MarkLogic, and Mark Rodgers, Sr. Director of Product Engineering, LexisNexis, will reveal how LexisNexis is rebuilding its business platform to handle Big Data in real time. LexisNexis is renowned for the technical solutions it has been building for 40+ years. It is well aware of the challenges of Big Data as it has gathered a huge amount of content. Avnur will explain how Big Data and unstructured information are slowly overtaking organizations. Rodgers will discuss the challenges LexisNexis faced as a global organization that was building new products to remain on the cutting edge of Big Data. Together, Avnur and Rodgers will give a brief overview of the technical implementation that enabled LexisNexis to address those challenges. Finally, Rodgers will detail the business benefits LexisNexis is experiencing as a result of its new Big Data business platform.

This session is sponsored by MarkLogic

Location: Murray Hill Suite B
Ted Dunning (MapR Technologies)

Map-reduce and Hadoop provide new scaling opportunities for analyzing data. As a result, organizations are beginning to analyze and derive business value from large amounts of data that, in many cases, were previously simply being discarded. In some cases, such as on-line advertising, the ability to analyze these previously impenetrable volumes of data has disrupted entire industries.

Such greenfield opportunities are rare, however, and few companies can afford to build an entirely new analytics pipeline. Yet integrating big data analytics systems like Apache Hadoop into existing analytics systems can be very difficult, because there are huge differences in the fundamental approaches being taken to the basic problems of how data should be accessed and analyzed.

These differences are exactly what makes these new technologies hugely effective, but they are also what makes integration between conventional and new approaches so difficult.

This talk will provide detailed descriptions of how to use new technologies to:

  • Get data into and out of the Hadoop cluster as quickly as possible
  • Allow real-time components to easily access cluster data
  • Use well-known and understood standard tools to access cluster data
  • Make Hadoop easier to use and operate
  • Capitalize on existing code in map-reduce settings
  • Integrate map-reduce systems into existing analytic systems

These descriptions will be taken from real-life customer situations. Each will describe the problems faced and the solutions that resolved them.

This session is sponsored by MapR Technologies

Location: Murray Hill Suite B
Steven Hillion (EMC DCD)

Do you use all the information you should when you make your most important decisions? Is your organization prepared to go beyond BI to enable breakthrough insights and decisions that transform the way you do business?

Increasingly, organizations realize that data-intensive predictive analytics is a necessary tool for a company to compete and succeed – even if the organization has already deployed a full-blown BI and DW stack. Armed with advanced analytics insights, business users can make well-informed decisions to support their organizations’ tactical and strategic goals – and create competitive advantage.

Steven Hillion, VP of EMC Greenplum’s Data Analytics Lab, lends insight into emerging technologies that take advantage of the big data opportunity and into how big data challenges today’s BI architectures and approaches to data management.

This session is sponsored by EMC Greenplum

Location: Murray Hill Suite B
Moderated by:
Roger Magoulas (O'Reilly Media)
Panelists:
Anthony Goldbloom (Kaggle), Trajan Bayly (GE), Nuala O'Connor Kelly (GE), Abdul Shaikh (National Cancer Institute)

Panel Discussion on Assembling Data to Fight Breast Cancer

This session is sponsored by GE

Location: Murray Hill Suite B
Vineet Tyagi (Impetus)

Businesses today are moving beyond the buzz and experimentation with the batch processing options of Hadoop and MapReduce, stretching the limits for cutting-edge performance and scalability. This session will cover emerging trends in a new generation of NoHadoop (Not Only Hadoop) architectures for future-proof big data scalability and prepare you for life beyond the elephant ride!

This session is sponsored by Impetus Technologies, Inc.

Sponsors

  • Aster Data
  • EMC Greenplum
  • GE
  • LexisNexis
  • MarkLogic
  • Tableau Software
  • Cloudera
  • DataStax
  • Informatica
  • DataSift
  • Splunk
  • Amazon Web Services
  • Datameer
  • Impetus
  • Karmasphere
  • MapR Technologies
  • Pervasive
  • Platform Computing
  • Revolution Analytics
  • Sybase
  • Xeround
  • Media-Science
  • Platfora

Sponsorship Opportunities

For information on sponsorship opportunities at the conference, contact Susan Young at syoung@oreilly.com

Press & Media

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com
