Schedule: Sponsored sessions

Track Hosts

Lisa Green is the Director at the Common Crawl Foundation where she oversees the foundation’s mission of building, maintaining and openly disseminating a comprehensive crawl of the web. Prior to joining Common Crawl, she was the Chief of Staff at Creative Commons. Lisa holds a PhD in physical chemistry from the University of California Berkeley, lives in San Francisco and is passionate about open systems.

Coco Krumme is a PhD student at MIT, where she's partnered with a major financial institution to study transaction data and human behavior.

Dan Roesch is the Managing Director of Roesch & Associates with over 25 years of experience in strategy development for large companies and startups.

Mark Madsen designs and builds analysis and decision support systems, as well as data management and access infrastructure. His research focus these days is on analysis techniques; emerging technology and practices in analytics, BI, and information management; and user experience for data access & delivery applications. Mark speaks at a lot of conferences on anything data, with a bunch of history of science and technology mixed in.

Tony Tran is the founder and co-organizer of the San Francisco Bay Area Machine Learning Meetup Group. He is currently a Machine Learning consultant helping companies tackle big data problems. Previously, Tony worked as a Data Engineer at Bizo where he built large scale search services and analyzed advertising data to help guide strategic business decisions. He holds an MS in Computer Science from UC Irvine specializing in Machine Learning and Computer Vision and a BS in Computer Science from UC San Diego.

Sharon Wragg is Sr. Analytics Engineer at Dictionary.com, where she is busy staking and pruning branches of the analytics data architecture while falling in love with language. She previously spent 20 years in process engineering and the semiconductor manufacturing industry, and has only recently pivoted to the field of data science and software development. Sharon is still a passionate fan of Deming, the total quality movement, lean thinking, SPC, and all the great disciplines for achieving excellence that have come out of the world of manufacturing. She holds a BS in Mechanical Engineering from UC Berkeley, with an emphasis in heat transfer.

Mission City
Amr Awadallah (Cloudera, Inc.)
Average rating: **...
(2.95, 19 ratings)
In this talk Dr. Amr Awadallah will present the Enterprise Data Hub (EDH) as the new foundation for the modern information architecture. Built with Apache Hadoop at the core, the EDH is an extremely scalable, flexible, and fault-tolerant data processing system designed to put data at the center of your business. Read more.
Mission City
John Schitka (SAP)
Average rating: **...
(2.35, 20 ratings)
Crowdsourcing can be an effective way to collect massive amounts of data to enable deeper analysis in many situations. Explore the foundational steps that can lead to successfully crowdsourcing data through the lenses of the International Barcode of Life and Technical University Munich (TUM) ProteomicsDB projects. SAP is proud to be involved with driving the success of both these projects. Read more.
Mission City
John Schroeder (MapR Technologies)
Average rating: ***..
(3.22, 23 ratings)
This five-minute keynote will provide a quick overview of some of the more surprising things Hadoop is capable of in 5 minutes or less. Read more.
Mission City
Quentin Clark (Microsoft)
Average rating: **...
(2.53, 19 ratings)
How does the world change when big data reaches a billion people? What happens when anyone, from farmers to criminal investigators, gains the power to quickly derive meaningful insights from vast and varied data sources? Join Quentin Clark, Microsoft Corporate Vice President, who will highlight how simple, familiar tools and cutting-edge cloud technologies are bringing big data to all. Read more.
Ballroom H
Mike Gualtieri (Forrester Research)
Average rating: ***..
(3.67, 3 ratings)
Mike Gualtieri, principal analyst at Forrester Research, Inc., will facilitate a panel of production Hadoop users – including Cisco, The Climate Corporation, The Rubicon Project, and Solutionary – to discuss the challenges and best practices for deploying Hadoop in production. Join us for an engaging conversation on tips and tricks in deploying Hadoop in production. Read more.
Ballroom G
Eli Collins (Cloudera)
Average rating: *....
(1.50, 2 ratings)
In this talk, we'll explore how Apache Hadoop has rapidly evolved to become the new foundation for enterprise analytics - the enterprise data hub - and learn about the state-of-the-art in deploying a modern data warehouse on top of the Hadoop stack. Read more.
Ballroom F
Bryan Hurd (Microsoft Cybercrime Center), Herain Oberoi (Microsoft)
Average rating: **...
(2.50, 2 ratings)
BotNets and cybercrime are by their very nature Big Data problems. The Microsoft Cybercrime Center is working in conjunction with law enforcement, public sector, commercial and academic partners to investigate, disable and prosecute cyber criminals... Read more.
Ballroom F
Yuvaraj Athur Raghuvir (SAP Labs LLC.)
To seize the future, data must be harnessed in actionable time. Based on a real deployment, see how to achieve instant results with infinite storage: filter large amounts of cold data in Hadoop, analyze in real time with SAP HANA, and visualize using SAP Lumira. Learn how solutions from SAP and our Hadoop partners can help your organization seize the future and gain unprecedented insight from Big Data. Read more.
Ballroom H
Ben Werther (Platfora), Sanjay Mathur (Silicon Valley Data Science)
Average rating: ***..
(3.67, 3 ratings)
Join us as we discuss the real-world applications of big data, examine what's working and what isn't, and discuss why you don't need to boil the big data ocean with Hadoop. Read more.
Ballroom G
Bill Franks (Teradata Corporation)
Average rating: ***..
(3.50, 6 ratings)
Attend this session to learn how you can take advantage of the new economics of data. This session will present examples of how leading organizations are evolving their enterprise data architectures to bring together the Data Warehouse, Hadoop & Data Discovery Platforms so All Users can benefit from ALL Analytics on ALL Data. Read more.
Ballroom F
Harold Hannon (SoftLayer)
The cloud provides an easy onramp to building and deploying Big Data solutions, particularly the latest technologies that favor scale-out architectures. Transitioning from initial deployment to a large-scale, highly performant operation without breaking the bank may not be easy. Read more.
Ballroom G
Justin Makeig (MarkLogic)
Average rating: ***..
(3.00, 1 rating)
Most data centers are filled with rigid data servers that are tightly linked to specific applications, leading to data duplication, lengthy development cycles, and unnecessary costs. Learn how you can use the MarkLogic Enterprise NoSQL database platform to help create a flexible, agile data fabric that will allow you to iterate your application development, optimize your data, and reduce costs. Read more.
Ballroom H
Damon Cool (Evernote), John Santaferraro (Actian Corporation)
Average rating: **...
(2.33, 9 ratings)
In 2012, Evernote took proactive steps to prepare for a rapidly expanding customer base by making the transition from 18-hour queries on a MySQL server to ad hoc analytics for 200 million daily events, all while on a budget. This session explains how Evernote is scaling to hundreds of terabytes and analyzing 200 million events per day using a two-tier architecture that includes Hadoop and an analytic platform. Read more.
Ballroom H
Bruno Aziza (Alpine Data Labs)
Average rating: *....
(1.00, 2 ratings)
In this panel discussion, we’ll hear from entertainment, healthcare, and media industry leaders as they discuss their strategy to demystify analytics end to end. We’ll have a question and answer session moderated by Alpine Data Labs. Read more.
Ballroom G
Ben Redman (Citus Data)
Average rating: *****
(5.00, 2 ratings)
PostgreSQL is an advanced open source database known for its reliability. It also features a rich extension ecosystem that enables features like semi-structured data types, new SQL operators, and a columnar data store. This talk examines extensions available to PostgreSQL users and how CitusDB turns PostgreSQL into a scalable data platform for addressing real world analytics problems. Read more.
Ballroom F
Average rating: ****.
(4.00, 3 ratings)
The real promise of big data isn’t about merely doing analytics cost-effectively and at scale; it’s about discovery. Data discovery means uncovering hidden patterns from disparate sources without needing to know which questions to ask or the data relationships in advance... Read more.
Ballroom H
Sanjay Goil (Autonomy IDOL)
Average rating: *****
(5.00, 1 rating)
Forget the 140 characters: Twitter is Big Data. Every day sees around 100 TB of data ingested and tens of thousands of Hadoop jobs. Join us to hear how Twitter is using HP’s HAVEn platform to run their Big Data analytics. Learn why they’ve integrated HP Vertica with their Hadoop infrastructure to deliver the scale and speed needed for their analytics. Read more.
Ballroom G
Mohit Sati (Ask.com)
Average rating: ****.
(4.00, 1 rating)
Search Engine Marketing is an important revenue opportunity for Ask.com, planned to nearly double in 2014. Fueled by growth and acquisitions such as About.com and Investopedia, the keyword portfolio will grow by 90x through 2014. SEM Analytics at Ask.com involves tens of millions of cost metrics stored daily, hundreds of millions of portfolio keywords, and billions of historical costs. Read more.
Ballroom F
Mike Wendt (Accenture Technology Labs)
Average rating: ****.
(4.50, 2 ratings)
In this session, we will share the results of our study, a price-performance comparison of a bare-metal Hadoop cluster and cloud-based Hadoop clusters. Read more.
Ballroom F
Joe Hellerstein (Trifacta and UC Berkeley)
Average rating: ****.
(4.00, 1 rating)
Join Trifacta's founders and their customers to learn how Data Transformation is changing the way people work with data. By increasing data analyst productivity and giving business analysts direct access to Big Data for the first time, Trifacta's technology increases the breadth of data they work with, significantly shortens "time to insight", and enables better business decisions. Read more.
Ballroom H
Milan Vaclavik (CenturyLink Technology Solutions)
We will discuss the strategic significance of infrastructure core services (compute, storage, network, and comprehensive security) required for robust big data solutions. We will also cover the strategic significance of Hadoop 2.0, Hadoop/NoSQL convergence, and the critical need for effective modeling, query formulation, and data analysis capabilities as Hadoop becomes an enterprise platform for big data. Read more.
Ballroom G
Average rating: ***..
(3.00, 2 ratings)
This presentation discusses how we used complex event processing (CEP) and MapReduce-based technologies to track and process data from a soccer match as part of the annual DEBS event processing challenge, while achieving throughput in excess of 100,000 events/sec. Read more.
Mission City
Kaushik Das (Pivotal)
Average rating: **...
(2.47, 19 ratings)
The emerging Internet of Things (IoT) enables us to build smart systems. We already have the sensory and motor parts of these systems available, but we don't have the brain. This is where data science comes into the picture! I will talk about how we are using big data technologies in conjunction with data science here at Pivotal to build the digital brain that makes a system smart. Read more.
Mission City
Boyd Davis (Intel)
Average rating: ***..
(3.06, 17 ratings)
At Intel, we envision a future in which every organization in the world can use new sources of data to enhance its operational intelligence, fostering discoveries and innovation in science, industry, and medicine. Read more.
Mission City
Average rating: *....
(1.65, 17 ratings)
Big Data without analytics is just data, but how do you perform the analytics? In this session, learn how In-Hadoop analytics is changing the game for the possibilities of Hadoop. Read more.
Ballroom F
The Inflection Point - Hadoop and Big Data Analytics Read more.
Ballroom H
Vin Sharma (Intel)
In this session, I will illustrate these architectures with real-world examples of city governments, retail banks, food manufacturers, pharmaceutical companies, and Intel itself applying intelligence wherever data lives. Read more.
Ballroom G
Moderated by:
Jeffrey Kelly (The Wikibon Project)
Panelists:
Average rating: ***..
(3.60, 5 ratings)
Organizations are now moving beyond rigid, high-latency data warehouse environments to more flexible and cost-effective "Data Lake(s)": centrally managed repositories that use low-cost technologies such as Hadoop, SQL, In-Memory, and others to land any and all data that might potentially be valuable for analysis and for operationalizing that insight. Read more.
Ballroom H
Patrick Shumate (Comcast Cable)
Average rating: ***..
(3.40, 5 ratings)
How Comcast Turns Big Data into Real-Time Operational Insights Read more.
Ballroom G
Rob Rosen (Pentaho), Tim Garnto (edo)
edo Interactive shares how they drive agile, improved decision-making by complementing native Hadoop technologies with analytical databases and ETL optimization and data visualization solutions from vendors such as Pentaho. Read more.
Ballroom F
Anand Venugopal (Impetus Technologies Inc.), Pranay Tonpay (Impetus)
This session will address the exciting possibilities of bringing dramatic improvements to various industry verticals using big data analytics, especially real-time analytics over high-volume data in motion. Read more.
Ballroom H
Eric Frenkiel (MemSQL)
Average rating: *****
(5.00, 1 rating)
In this session, MemSQL CEO Eric Frenkiel will discuss the benefits for companies that augment their existing information architecture with a versatile real-time database platform to handle high volume and velocity transactional and analytical workloads. Read more.
Ballroom G
Wayne Thompson (SAS), Paul Kent (SAS)
In a world of ever-growing data volumes, how do you extract insight, trends and meaning from all that data in Hadoop? Getting relevant information in seconds (instead of hours or days) from big data requires a different approach. Join Paul Kent and Wayne Thompson from SAS as they share how to reveal insights in your big data and redefine how your organization solves complex problems. Read more.
Ballroom F
Owen O'Malley (Hortonworks), Alan Gates (Hortonworks)
Average rating: ****.
(4.00, 2 ratings)
Apache Hive is the de facto standard for SQL-in-Hadoop today, with more enterprises relying on this open source project than on any alternative. Enterprises have asked for Hive to become more real-time and interactive, and the Hive community has responded. Read more.
Ballroom H
Peter Sirota (Amazon Web Services)
Average rating: ***..
(3.00, 4 ratings)
Learn from the Amazon Elastic MapReduce team's recent experience with streaming services such as Amazon Kinesis and low-latency query engines like Impala and Phoenix. We'll clarify many of the implementation details of our Hadoop InputFormat for Amazon Kinesis and demonstrate the power and flexibility of applying existing Hadoop ecosystem technologies to the real-time data paradigm. Read more.
Ballroom F
Jagane Sundar (WANdisco)
Average rating: ****.
(4.50, 2 ratings)
Application of the Paxos Protocol Towards Building a Continuously Available HBase Read more.
Ballroom G
Nenshad Bardoliwalla (Paxata, Inc.)
Join Paxata’s Nenshad Bardoliwalla for a look at the new breed of data preparation tools that use semantic algorithms to detect data types, apply machine learning to find hidden patterns, and link related columns of data automatically. Read more.
Ballroom H
Matt Quinn (TIBCO Software, Inc.)
Big Data is really a small data mindset. At the enterprise-level, where the potential for data collection is greatest, companies are still stuck compartmentalizing data. TIBCO CTO Matt Quinn will share how the world’s leading sports teams, airlines, banks and retailers are those that change their Big Data mindset to an All Data one. Read more.
Ballroom G
J.R. Arredondo (Rackspace)
We will discuss Rackspace’s vision for Data-as-a-Service, and provide a few key questions that could help you complement your technical analysis when choosing a database service. Along the way, we will also discuss parts of the portfolio of data services available at Rackspace, including SQL, MongoDB, Redis and Hadoop-based solutions. Read more.
Ballroom F
Raj Bains (Clustrix, Inc.)
NewSQL has followed quickly on the heels of NoSQL, providing the scale-out of NoSQL along with SQL and ACID guarantees. We'll discuss NewSQL with customer examples and contrast it with SQL on Hadoop implementations. Read more.