
Office Hour

Table E
Moderated by:
Jack Norris (MapR Technologies)
Average rating: 1.00 (1 rating)
After the keynotes end, stop by the 3rd floor O'Reilly area if you want to learn more about Hadoop's state in the industry, where it's headed, or just want to chat about MapR and the enterprise-ready M7 product with Jack Norris.
Table A
Claudia Perlich (Dstillery)
We are going to cover a number of challenges, opportunities and pitfalls appearing in predictive modeling projects.
Table B
I'll happily consult on and make suggestions about anything having to do with visualization, interface, interaction design, and design practice:
• all facets of visualization design
• information architecture & UX design
• design process and thinking
Table C
John Akred (Silicon Valley Data Science)
Are you trying to navigate the changing technology landscape for your organization? Are you confused by the platforms and frameworks available? Don't know where to start with data strategy for your organization? Struggling to find the business value in big data technology investment? I'd be happy to help!
Table D
Bahman Bahmani (Stanford University)
During the office hour, depending on the attendees’ interests, we can go into more detail on any of the following topics:
* Tradeoffs governing MapReduce algorithms
* Techniques and patterns for MapReduce algorithm design
* How to use the introduced techniques for attendees' particular applications
Table A
Antonio Piccolboni (Per data LLC)
RHadoop is a collection of packages for the R developer working with Hadoop. rmr2, of which I am the main developer, is downloaded 600 times a month. I would like to connect with current and potential users to discuss problems, use cases, and future directions, and in general to gather feedback and provide information or help.
Table B
Eric Sammer (Cloudera)
During this hour, I'll answer questions about developing data-oriented applications on top of Hadoop and its ecosystem.
Table C
Philip Zeyliger (Cloudera)
* Debugging
* Hadoop War Stories
* Distributed Systems
* Logs
* Cheerful commentary
Table D
Alan Gates (Hortonworks), Owen O'Malley (Hortonworks), Arun Murthy (Hortonworks)
* Making Hive perform interactive queries: Tez, vectorization, and optimizer improvements
* How ORC enables higher performance and more efficient storage
* Growing Hive's SQL to be standard-compliant: adding subqueries, data types, updates, deletes, and ACID
Table E
Leah Hanson (Google)
* The Julia programming language
* Julia's introspection and metaprogramming capabilities
Table A
Dr. Vijay Srinivas Agneeswaran (Impetus Technologies), Pranay Tonpay (Impetus)
Average rating: 4.00 (1 rating)
Discussion of the talk "Driving Data Decisions with Real-time Analytics". An opportunity for questions and further thoughts on the topic.
Table B
Matthew Russell (Digital Reasoning)
* Practical advice for implementing minimum viable products and data science experiments fueled by popular social web APIs such as Twitter, Facebook, and LinkedIn
* Extended discussion on any topic encountered during the "Mining Social Web APIs with IPython Notebook" tutorial
* Clarification/discussion on any topic pertaining to data mining and social media
Table C
Jayant Shekhar (Cloudera Inc)
Building a Unified Data Management Platform with Hadoop using Impala, Solr, HBase, OpenTSDB and Flume. The Platform would handle:
* Streaming and NRT Analytics
* Intelligently querying data across different stores with a rich User Interface and Query Language
* Monitoring, Alerts and Time Series
* Building core features on top of it, including Recommendations, Spam Detection, etc.
Table D
Robert Grossman (Open Data Group)
Please stop by to talk about:
* Adversarial analytics
* Analytic models over big data
* Open source analytic ecosystems, such as the Python-based "Augustus" environment and the Predictive Model Markup Language standard
* Open data and platforms supporting open data, such as the Open Science Data Cloud
Table A
Matt Harrison (FusionIO)
* Getting started with Python
* When to use Python
* Python tools for data analysis
Table B
Baron Schwartz (VividCortex Inc)
I'll be available for follow-ups to my presentation, riffing on similar ideas, wild and zany topics, and whatever seems germane. My co-founder at VividCortex, Kyle Redinger, will also be available.
Table C
Robert Johnson (Interana)
* real-time data analytics
* scaling real-time systems
* scaling and performance of distributed systems
Table D
Deborah Estrin (Cornell NYC Tech)
Small data: How can we access, process, fuse, synthesize and activate personal digital traces to fuel new apps and services for the individual in their many roles (consumer, patient, student, parent, commuter, etc.)?
Table E
M. C. Srivas (MapR Technologies, Inc)
In this Office Hour session, M.C. Srivas will further discuss how to double performance for MapReduce jobs, achieve high-speed data ingestion, and execute HBase apps ten times faster while maintaining consistently low latency.
Table A
Ron Bodkin (Think Big Analytics)
* Essential things to consider when building a big data architecture
* Brainstorming datasets with the Big Data Doctor
* What's the right first project for your company?
* How to get executive sponsors
* How to let the business benefit from Hadoop data
* How to analyze customer data in Hadoop
Table B
Greg Rahn (Cloudera)
* General questions about Impala
* General questions about Parquet's columnar file format
* Best practices and performance considerations for Impala
* Architectural decisions for Impala
Table C
Sumeet Singh (Yahoo!)
* What it is like to work on cloud platforms and Hadoop at Yahoo
* How cloud services intersect with big data platforms
* Notable use cases and the technology stack that put us at the frontier of Hadoop scale
Table D
Stephen O'Sullivan (Silicon Valley Data Science)
Are you trying to navigate the changing technology landscape for your organization? Are you confused by the platforms and frameworks available? Don't know where to start with data strategy for your organization? Struggling to find the business value in big data technology investment? I'd be happy to help!
Table E
Barry Livingston (Riot Games)
Building and growing a viable and scalable Big Data function at a global company like Riot, leveraging technologies like Hadoop, Platfora, Honu, Tableau, Hive and Vertica to produce a customized pipeline that serves the needs of diverse internal teams.
Table A
The session will show some graphical mistakes. At the office hour that follows:
• You bring some graphs that you’ve drawn or want comments on.
• I’ll point out some graphical mistakes, suggest changes, or maybe even compliment the figure.
Table B
Haoyuan Li (UC Berkeley)
We will answer questions about various aspects of the Berkeley Data Analytics Stack (BDAS), including:
* Spark: an open source cluster computing system that aims to make data analytics fast, both fast to run and fast to program.
* Shark: a fast SQL query engine built on top of Spark that is compatible with Hive.
* Tachyon: a high-throughput, distributed in-memory storage system.
Table C
Adam Fuchs (Sqrrl)
Adam will be ready to answer your questions about Apache Accumulo, architecting real-time big data apps, cell-level security for big data, and scaling your big data solutions.
Table E
Fangjin Yang (Metamarkets), Nelson Ray (Metamarkets)
In our talk, we discussed strategies and algorithms that enable fast, approximate queries over large quantities of data. This is accomplished via substantial pre-computation at ETL, sketching data structures, and distributed computation with an analytic database (Druid, in our case). We'd love to engage with our audience on any of those topics.
Table A
Joseph Rickert (Revolution Analytics)
Let's spend some time discussing the big picture or delving into the details of running Revolution Analytics' Revolution R Enterprise on Hadoop. Discussion topics:
* Using R with Hadoop
* Running the statistical functions in Revolution Analytics' RevoScaleR package directly on Hadoop
* A statistician's view of Hadoop
Table B
Paul Kent (SAS)
Modernizing your Analytics Platform? Let's talk about SAS on your Cluster… Q+A: Happy to discuss your Point of View and/or answer your questions as you modernize your analytics platforms.
Table C
Wes McKinney (DataPad Inc.)
* Data analysis and visualization tools
* Analytics and Data Science workflows/applications
* Python, R, and other programming languages for working with data
Table D
Stephanie McReynolds (ClearStory Data)
* Speeding data-driven decisions across distributed teams
* Examples of creating data context in your analysis by converging data from internal sources with external data from public, web or premium sources
* Companies that are incorporating data recommendations and data harmonization to drive business users to use more data and uncover more insights
Table A
My focus is on Text Analytics at Scale. Please bring any questions you may have on this topic and I'll do my best to help answer them or guide you to additional information.
Table B
* Overview and high-level architecture of Apache Sentry (incubating)
* How Sentry can be used to enforce fine-grained role-based authorization for data and metadata accessed through Hive and Impala
* Future direction for Sentry, i.e., support for other components of the Hadoop ecosystem
Table D
Aaron Myers (Cloudera, Inc.)
* HDFS
* Hadoop security
* Hadoop development
Table A
Jim Kaskade (Infochimps)
* The Future of Big Data (Hint: Cloud)
* Real-Time Analytics in the Enterprise
* Open-Source Standards in Big Data Stacks
Table B
Stephen McDaniel (Freakalytics)
* The 3 eras of analytics
* Supporting the accidental analyst in your organization
* Experience using new data intelligence solutions like ClearStory Data
Table C
Foster Provost (NYU | Stern)
Is bigger data really better? Predictive analytics with massive fine-grained data on consumer behavior. The best-selling O'Reilly book: Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking.
Table D
Richard Brath (Oculus), David Jonker (Oculus)
Getting the right visualization for big data requires balancing user tasks, data constraints, visual representations and interactions. For visualizing graphs and social networks, consider:
* When the standard graph visualization is *not* a good solution
* Alternative graph visualizations and best practices
* Effective visualization of big data
Table A
Ravi Devireddy (Visa Inc)
Ravi Devireddy will discuss the opportunities that Hadoop and big data offer in enabling the data-driven enterprise, and present some use cases and applications.
Table B
Giorgia Lupi (Accurat)
* Combining different kinds of datasets to compose visual narratives that maintain the informative richness of the data while making that richness more accessible and understandable; publishing compound and complex stories told through data visualizations.
Table C
Carlos Guestrin (GraphLab Inc.)
* Machine Learning
* Graph Analytics and Databases
* GraphLab
Table D
Srisatish Ambati (0xdata Inc)
* What recommendation engines are most useful for highly unbalanced datasets?
* What do you do with missing features or N/As in your dataset?
* What do you do with unbalanced datasets?
* What graph and network algorithms are best for outlier detection?
Table E
Prakash Nanduri (Paxata)
Most organizations have to converge data from many different sources in order to get the "answer set" they need to start doing ad-hoc analysis.
Table A
James Stewart (Government Digital Service), James Abley (Government Digital Service)
Covering the UK Government's Digital Transformation and the role of data processes, tools and companies within that.
Table B
Jim Englert (Gilt)
I'm happy to discuss Gilt's use of Riak to help solve a variety of issues including disaster recovery and data availability across data centers. We can also talk about Scala, the cake pattern, Gilt's take on a service-oriented architecture, personalization infrastructure, or real-time event processing.
Table C
Eddie Satterly (Splunk)
* Splunk offerings and use cases
* Big Data Ecosystem
* Apache Cassandra
* Splunk Partnerships and value plays
Table D
Ulrich Rueckert (Datameer)
I will be available for questions about machine learning and data mining techniques on top of Hadoop. I'll be prepared to talk about the following subjects:
* Semi-supervised learning (the topic of my talk)
* Predictive and descriptive data mining with big data
* Analytics with Datameer
Table E
Ravi Hubbly (Lockheed Martin)
* Migration of Batch jobs from Mainframe to Hadoop
* Enterprise Data Repository using Hadoop
* Integrating Enterprise Data Warehouse with Hadoop
