All confirmed speakers for Strata 2014 are listed below. New speakers are added continuously, so please check back to see the latest updates to the program.
An engineering leader, entrepreneur and investor, Mike led the building of innovative, high-performance applications and services at Twitter, Palm and Microsoft before he joined KPCB. Mike is also an expert in “big data” businesses, having been the founder of Composite Software (acquired by Cisco). Formerly the vice president of engineering at Twitter, Mike led the team to rebuild and solidify Twitter’s infrastructure,...
Brian’s a statistician, journalist, and hacker. He lives in New York and serves as a 2013 Mozilla-Knight OpenNews fellow at the New York Times. Before that he was a data scientist at the Harmony Institute. He recently graduated with an MA in Applied Statistics from Columbia University where he focused on quantitative and computational approaches to social science. On the side, he edited a book for a prominent political scientist, participated in hackathons, and worked on investigative news stories. Brian has managed development projects in sub-Saharan Africa, reached number one on the Hype Machine, and shared a stage with Spoon and Bob Dylan.
Soam Acharya is Head of Application Architecture at Altiscale, Inc, a company focused on building the world’s best Hadoop clusters. Previously, he was Chief Scientist and Architect of the Limelight Video Platform where he focussed on video analytics, hybrid cloud architectures and big data issues. Prior to acquisition by Limelight, he led Delve Network’s initiatives into semantic video, video search, analytics and cloud computing in AWS. In addition, Soam has also worked on web and enterprise search at Inktomi, LookSmart and Yahoo.
Joseph Adler has many years of experience in data mining and data analysis at companies including DoubleClick, American Express, and VeriSign. He graduated from MIT with a B.Sc. and M.Eng. in Computer Science and Electrical Engineering. He is the inventor of several patents for computer security and cryptography, and the author of “Baseball Hacks” and “R in a Nutshell”. Currently, he is a senior data scientist at LinkedIn.
Sameer is a PhD student in the AMPLab at UC Berkeley. He actively collaborated with Microsoft Researchers on RoPE, an optimizer for parallel executions that has been successfully deployed on production clusters at Microsoft Bing. He completed his undergraduate education in the Department of Computer Science and Engineering at the Indian Institute of Technology, Guwahati in 2009 and was awarded the prestigious President of India Gold Medal.
With over 15 years in advanced analytical applications and architecture, John is dedicated to helping organizations become more data-driven. He combines deep expertise in analytics and data science with business acumen and dynamic engineering leadership.
Eva Andreasson has been working with Java virtual machine technologies, SOA, Cloud, and other enterprise middleware solutions for the past 10 years. She joined the startup Appeal Virtual Machines in 2001 as a developer of the JRockit JVM, which was later acquired by BEA Systems. Eva has been awarded two patents on Garbage Collection heuristics and algorithms. She also pioneered Deterministic Garbage Collection, which was later productized through JRockit Real Time. Eva has worked closely with Sun and Intel on many technical partnerships, as well as various integration projects of JRockit Product Group, WebLogic, and Coherence (post the Oracle acquisition in 2008). After two years as the product manager for Zing, the world’s most pauseless JVM, at Azul Systems, she joined Cloudera...
David Andrzejewski is Lead Data Sciences Engineer at Sumo Logic, which he joined in 2011 after a postdoctoral research position working on knowledge discovery at Lawrence Livermore National Laboratory (LLNL). He is interested in developing tools that combine the power of machine learning with human insights, and has published work applying these ideas to problems in biomedical text mining, information retrieval and software behavior. David completed his PhD in Computer Sciences at the University of Wisconsin-Madison in 2010, where he had also previously received an M.S. in Computer Sciences and a B.S. in Computer Engineering, Mathematics and Computer Sciences.
J.R. joined Rackspace in 2012 and currently drives Product Marketing for Data and Platform Services. J.R. came to Rackspace from Microsoft where he helped drive Product Management and Engineering for the app developer platforms and data access services in Office and SharePoint from 2006 to 2012. Before Microsoft, he was Director of Corporate Development at BMC Software driving Strategy and M&A initiatives in the areas of Application and Database Performance Management. Prior to BMC Software he was a lead software architect and engineering manager for eCommerce at Compaq.
Mat is currently a Ph.D. student at Princeton University under the supervision of Michael J. Freedman. Previously, Mat completed his undergraduate degree in general engineering at The Cooper Union in 2005. His research interests are in building scalable distributed systems for analyzing high-velocity data, programmability, security, and networking.
Yuvaraj Athur Raghuvir is a Senior Director with SAP HANA Platform for Big Data leading the GTM and Platform Evolution portfolios. Yuvaraj has over 15 years of experience in Enterprise Software and Enterprise Architecture spanning Business Analytic Solutions, CRM and Enterprise Applications.
Amr is Co-Founder and CTO of Cloudera where he is responsible for all engineering efforts from product development to release, for both the open source projects and Cloudera’s proprietary software. Prior to Cloudera Amr served as Vice President of Engineering at Yahoo!, and led a team that used Hadoop extensively for data analysis and business intelligence across the Yahoo! online services. Amr holds Bachelor’s and Master’s degrees in Electrical Engineering from Cairo University, Egypt, and a Doctorate in Electrical Engineering from Stanford University.
Bruno Aziza is a Big Data entrepreneur and author. He’s led Marketing at multiple start-ups and has worked at Microsoft, Apple and BusinessObjects/SAP. One of his startups sold to Symantec in 2008 and two of them have raised tens of millions and experienced triple digit growth. Bruno is currently Chief Marketing Officer at Alpine Data Labs, loves soccer and has lived in France, Germany and the U.K. You can contact him @email@example.com
Raj Bains is Director of Products at Clustrix. He works on Product Strategy and Database Engineering. He has insight into database trends and the underlying technology. Before Clustrix, he led the team and co-architected Financial Contract Definition Language, VM, Compiler and Web Services at RMS. His background includes High Performance Computing, Languages and Compilers at NVIDIA and Microsoft.
Magdalena Balazinska is an Associate Professor in the department of Computer Science and Engineering at the University of Washington. Magdalena’s research interests are in the field of database management systems. Her current research focuses on big data management, sensor and scientific data management, and cloud computing. Magdalena holds a Ph.D. from the Massachusetts Institute of Technology (2006). She is a Microsoft Research New Faculty Fellow (2007), received an NSF CAREER Award (2009), a 10-year most influential paper award (2010), an HP Labs Research Innovation Award (2009 and 2010), a Rogel Faculty Support Award (2006), a Microsoft Research Graduate Fellowship (2003-2005), and multiple best-paper awards.
An executive and thought leader with a proven track record of success leading product strategy, product management, and development in business analytics. Bardoliwalla co-founded Tidemark Systems, Inc. where he drove the market, product, and technology efforts for their next-generation analytic applications built for the cloud. He formerly served as VP for product management, product development, and technology at SAP where he helped to craft the business analytics vision, strategy, and roadmap leading to the acquisitions of Pilot Software, OutlookSoft, and Business Objects. Prior to SAP, he helped launch Hyperion System 9 while at Hyperion Solutions. Nenshad began his career at Siebel Systems working on Siebel Analytics. Nenshad is also the lead author of Driven to Perform: Risk-Aware Performance Management From Strategy Through Execution....
Brian Behlendorf is Managing Director at Mithril Capital Management in San Francisco. His career has been a mix of technology start-up, public policy, and non-profit tech leadership. Brian serves on the Boards of the Mozilla Foundation, the Electronic Frontier Foundation, and Benetech, three organizations using technology to fight for civil liberties, open technologies, and social impact in the digital domain. Prior to Mithril, Brian was Chief Technology Officer at the World Economic Forum. He also served for two years at the White House as advisor to the Open Government project within the Office of Science and Technology Policy, and then later as advisor to Health and Human Services on open software approaches to health information sharing. Before that he founded two tech companies (CollabNet...
Anjul Bhambhri is the Vice President of Big Data Products at IBM. She was previously the Director of IBM Optim application and data life cycle management tools. She is a seasoned professional with over twenty-two years in the database industry. Over this time, Anjul has held various engineering and management positions at IBM, Informix and Sybase. Prior to her assignment in tools, Anjul spearheaded the development of XML capabilities in IBM’s DB2 database server. She is a recipient of the YWCA of Silicon Valley’s “Tribute to Women in Technology” award for 2009. Anjul holds a degree in Electrical Engineering.
Lukas Biewald is the founder and CEO of CrowdFlower. Founded in 2007, CrowdFlower provides Labor-on-Demand to help companies outsource high-volume, repetitive tasks to a massively-distributed global workforce.
Before founding CrowdFlower, Lukas was a senior scientist and manager within the Ranking and Management Team at Powerset, Inc., acquired by Microsoft in 2008. He led the Search Relevance Team for Yahoo! Japan after graduating from Stanford University with a B.S. in Mathematics and an M.S. in Computer Science. Recently, Lukas won the Netexplorateur Award for GiveWork – a collaboration with Samasource that brings digital work to refugees worldwide. Lukas is also an expert level Go player.
Beth Blauer is the Director of GovStat for Socrata. A well-known proponent of open government, data transparency and utilization, Blauer developed and implemented a statewide performance management program called StateStat for Maryland in the office of Governor Martin O’Malley that is credited with improving across-the-board outcomes in all areas, especially education, health, public safety, and the environment. Blauer earned a bachelor’s degree from the University of Maryland College Park and her J.D. at New York Law School, where she was a Public Interest Fellow and served as the President of the Legal Association for Women.
Joshua Bloom is CEO and co-founder of wise.io. He is an astronomy professor at the University of California, Berkeley where he teaches astrophysics and Python for data science. He has been a Sloan Fellow, Junior Fellow at the Harvard Society, and Hertz Foundation Fellow. In 2010, he was awarded the Pierce Prize from the American Astronomical Society. He has published over 250 refereed academic articles. Josh holds a PhD from Caltech and degrees from Harvard and Cambridge. He serves on the Berkeley Startup Cluster Advisory Committee.
Farrah created The Difference Engine based on the belief that deep understanding of customer needs is essential to growing businesses through great products and services.
She has honed her customer-centric insights as an advisor to some of the world’s best-respected brands, including Apple, Microsoft, Disney, Samsung and UPS. She began her career as a creative, and then went on to be a strategist at leading agencies, including Wieden + Kennedy, TBWA\Chiat\Day, Mad Dogs & Englishmen and Digitas, where she was Group Planning Director and mobile strategy lead. She also ran innovation as a partner at Hall & Partners, and developed digital tools for online qualitative research as SVP, Consumer Immersion at OTX.
Oscar Boykin is a staff data scientist at Twitter, co-author of Algebird, Scalding and Summingbird.
Henrik Brink is CTO and co-founder of wise.io. He has worked in industry and academia designing and implementing large-scale distributed software projects used in production around the globe. He has previously founded and managed a consultancy business focused on scalable backend systems and modern web technologies. Henrik has a Physics degree from the University of Copenhagen and has worked as a researcher and data scientist at UC Berkeley. He has published papers at the intersection of software development and machine learning on real-world messy data, and has a good handle on the latest developments in software development, machine learning and distributed architectures.
Kurt leads the Data Platform team at Netflix. His group architects and manages the technical infrastructure that enables Netflix’s data-centric decision making. The Netflix data platform includes both traditional BI tools (e.g. Teradata and MicroStrategy) and various Big Data technologies (e.g. Hadoop, Hive, and Pig).
James Burke has been called “One of the most intriguing minds in the Western world” (Washington Post). His audience is global. His influence in the field of the public understanding of science and technology is acknowledged in citations by such authoritative sources as the Smithsonian and Microsoft CEO Bill Gates. His work is on the curriculum of universities and schools across the United States.
In 1965 James Burke began work with BBC-TV on Tomorrow’s World and went on to become the BBC’s chief reporter on the Apollo Moon missions. For over forty years he has produced, directed, written and presented award-winning television series on the BBC, PBS, Discovery Channel and The Learning Channel. These include historical series, such as Connections (aired...
David Chaiken is a member of the technical staff at Altiscale, a start-up in Palo Alto that runs Hadoop as a service for other companies.
Before Altiscale, David served as the Chief Architect of Yahoo!, where he led teams building consumer advertising and media systems with Hadoop at their core. Over his career, David has also built voice search products for consumers, mobile enterprise applications, network management systems, project management software, a large-scale multiprocessor architecture, a tablet computer and several other information appliances.
David is an experienced speaker at industry conferences including keynoting the Saturn Conference and presenting at the ACM Chennai Chapter Lectures.
David’s favorite technologies include the RSA encryption algorithm, the C programming language, the ARM instruction set architecture, the...
Diane Chang is a Senior Data Scientist at Intuit. During her tenure at Intuit, Diane has worked with the Consumer Group to perform in-depth behavioral analysis of the TurboTax online customers, and has explored the business impact of both online advertising spend and utilization of customer care services. More recently she has been involved in an embedding pilot where she joined the QuickBooks Financing team to use data to improve the likelihood of a QuickBooks small business obtaining financing.
Diane has a PhD in Operations Research from Stanford and has worked for a small “mathematical” consulting firm, and a start-up in the online advertising space. Prior to joining Intuit Diane was a stay-at-home mom for 6 years.
Scott Chastain, Engineering Manager, Information Management and Delivery, SAS
Scott empowers the SAS Americas sales and technical groups with the architecture, strategy and implementation of SAS’ business analytics infrastructure. He has direct responsibilities for information management, visualization and business intelligence.
To be added later
Avery has a PhD from Northwestern University in the area of parallel computing. He worked at Yahoo! Search for four years on the web map analytics platform, large-scale ad hoc serving infrastructure, and cluster management. During the past year and a half, he has been working at Facebook in the general area of big data computational frameworks (Corona – scalable MapReduce and Giraph – scalable graph processing).
As corporate vice president of program management for the Microsoft Data Platform Group, Quentin Clark oversees the design and delivery of the entire family of SQL Server products as well as the Azure Data Platform services. The Azure Data Platform is a complete end-to-end platform serving data management and processing capability, data integration and refinement, and business analytics as Microsoft Azure services and Microsoft Office and Office 365 offerings. Leading a team of technical engineers, his responsibilities include product direction and definition through program management, user experience and design, and customer engagement programs. This spans SQL Server’s work in all workloads – databases, integration and business intelligence, as well as the release forms of the product – software, appliances and the cloud services....
Adrian Cockcroft has had a long career working at the leading edge of technology. He’s always been fascinated by what comes next, and he writes and speaks extensively on a range of subjects. At Battery, he advises the firm and its portfolio companies about technology issues and also assists with deal sourcing and due diligence.
Before joining Battery, Adrian helped lead Netflix’s migration to a large scale, highly available public-cloud architecture and the open sourcing of the cloud-native NetflixOSS platform. Prior to that at Netflix he managed a team working on personalization algorithms and service-oriented refactoring.
Adrian was a founding member of eBay Research Labs, developing advanced mobile applications and even building his own homebrew phone, years before iPhone and Android launched. As a distinguished...
Eli is a lead on Cloudera’s Platform team, an active contributor to Apache Hadoop and member of its project management committee (PMC) at the Apache Software Foundation. Prior to Cloudera he worked on the ESX Hypervisor at VMware. Eli holds Bachelor’s and Master’s degrees in Computer Science from New York University and the University of Wisconsin-Madison, respectively.
A Senior Data Scientist at LinkedIn, Michael Conover develops machine learning infrastructure that leverages the relationships and behavior of hundreds of millions of individuals. His academic research on propaganda campaigns and political polarization has been featured in The Wall Street Journal, Science, the MIT Technology Review, and on National Public Radio.
Drew Conway is an expert in the application of computational methods to social and behavioral problems at large scale. Drew has been writing and speaking about the role of data — and the discipline of data science — in industry, government, and academia for several years. Drew has advised and consulted companies across many industries, ranging from fledgling start-ups to Fortune 100 companies, as well as academic institutions and federal agencies. Drew is a co-founder of DataKind (a non-profit connecting social organizations with data scientists), the author of Machine Learning for Hackers (O’Reilly Media, 2012), a co-chair of the DataGotham conference, and is currently serving as the Scientist-in-Residence at IA Ventures. Drew is also completing his doctoral work in the Department of Politics at New York University....
Damon Cool, director of analytics at Evernote, leads the team that is responsible for building the infrastructure for storage and presentation of corporate data. He works with product managers and the executive team to develop reporting, visualization, and analytic solutions. He has nearly 20 years in reporting, data warehousing, and analytics. Prior to joining Evernote, Damon was a Business Intelligence Manager in consumer e-commerce at Symantec.
Alistair has been an entrepreneur, author, and public speaker for nearly 20 years. He’s worked on a variety of topics, from web performance, to big data, to cloud computing, to startups, in that time. In 2001, he co-founded web performance startup Coradiant (acquired by BMC in 2011), and since that time has also launched Rednod, CloudOps, Bitcurrent, Year One Labs, the Bitnorth conference, the International Startup Festival and several other early-stage companies.
Alistair is a chair for Strata + Hadoop World conferences; Techweb’s Cloud Connect; and the International Startup Festival. He’s written four books on analytics, technology, and entrepreneurship, including the best-selling Lean Analytics which is being translated into eight languages. He lives in Montreal, Canada and tries to mitigate chronic ADD...
Beau co-founded two startups based on probabilistic inference, the second of which was acquired by Salesforce in 2012. He now works there as a product manager. He received his PhD in computational neuroscience from MIT in 2008.
I’m a creative strategist and product designer who believes that progress takes more than technology alone.
I’ve worked on the front lines of brand development with some of the world’s most forward-thinking creative agencies and have pursued research around identity politics and decision making in the mediated environment. I’m currently leading the experience design effort for ClipCard, which involves rethinking the way people and information connect in a big data world.
Tathagata Das is a third-year Ph.D. student in the AMP Lab at UC Berkeley, working with Scott Shenker and Ion Stoica. He leads the development of the Spark Streaming project. His research interests include datacenter networks and frameworks for large scale data processing. Before graduate school, he worked as an Assistant Researcher in Microsoft Research Lab India.
Kaushik Das is an expert at applying mathematical models to solve business problems. He has more than 10 years of experience designing and deploying analytical software, working for enterprise software companies such as Rapt, Demandtec and M-Factor as well as in business consulting with McKinsey. After joining EMC’s Data Computing Division as Director of Analytics, Kaushik has led several projects to provide actionable insights from the analysis of big data, for customers in a variety of sectors ranging from Utility, Oil & Gas, to Banking, and Digital Media. Kaushik is leveraging his Geophysics PhD-level academic work to lead our efforts to build out an analytics practice for the Energy sector, where Big Data holds great promise.
Prior to joining Amplify as a General Partner, Mike spent over six years at Battery Ventures, where he most recently led early stage Enterprise investments on the West Coast. Most recently, Mike sat on the Boards of Continuuity, Duetto, Interana and Platfora. Mike also led Battery’s investment in a stealth security company which is also in Amplify’s portfolio. Mike previously invested in Splunk and RelateIQ, which was recently acquired by Salesforce. He was named to Forbes’ Midas Brink List earlier this year. Mike is a frequent speaker at conferences and is on the Advisory Board of both the O’Reilly Strata Conference as well as SXSW. Mike began his career as a hardware engineer at a start-up and later held product, business development, and sales...
Renee DiResta is a Principal at O’Reilly AlphaTech Ventures (OATV), where she evaluates seed-stage investments. Prior to joining OATV in June of 2011, Renee spent seven years as a trader at Jane Street Capital, a quantitative proprietary trading firm in New York City. She is interested in meeting interesting startups, data science, and improving liquidity and transparency in private markets.
Edd Dumbill is a technologist, writer and programmer based in California. He’s helping drive businesses with data as VP Strategy for Silicon Valley Data Science.
He was the founder and creator of the Expectnation conference management system, and a co-founder of the Pharmalicensing.com online intellectual property exchange.
A veteran of open source, Edd has contributed to various projects, such as Debian and GNOME, and created the DOAP Vocabulary for describing software projects.
Ted Dunning has been involved with a number of startups with the latest being MapR Technologies where he is Chief Application Architect working on advanced Hadoop-related technologies. He is also a PMC member for the Apache Zookeeper and Mahout projects. Opinionated about software and data-mining and passionate about open source, he is an active participant of Hadoop and related communities and loves helping projects get going with new technologies.
Emil is the founder of Neo4j, the most widely deployed graph database on the planet, CEO of its commercial sponsor Neo Technology and a co-author of the O’Reilly book Graph Databases. He plans to save the world with graphs and own Larry’s yacht by the end of the decade. He tweets at @emileifrem.
Elena Eneva is a Data Scientist at Accenture (before it was fashionable to be one) using and developing Data Science methods with a focus on Healthcare. Prior to Accenture, Elena worked at Yahoo! on Machine Learning for fraud detection and marketing. She did her graduate studies in Machine Learning at Carnegie Mellon University and got her B.A. in Computer Science from University of the South: Sewanee.
Last summer, Elena took a sabbatical to be a mentor at the Data Science for Social Good Summer Fellowship at the University of Chicago. She led several teams of fellows, working on projects in healthcare (with Northshore Hospital) and disaster relief (with Ushahidi), and developed open source solutions for predicting cardiac arrests and analyzing crowdsourced data during disasters and...
Are stars like Usain Bolt, Michael Phelps, and Serena Williams genetic freaks put on Earth to dominate their respective sports? Or are they simply normal people who overcame their biological limits through sheer force of will and obsessive training? In the decade since the sequencing of the human genome, researchers have slowly begun to uncover how the relationship between biological endowments and a competitor’s training environment affects
In this controversial and engaging exploration of athletic success, based on his best-selling “The Sports Gene: Inside the Science of Extraordinary Athletic Performance”, David Epstein tackles the great nature vs. nurture debate and traces how far science has come in solving this timeless riddle. He investigates...
Susan Etlinger is an industry analyst at Altimeter Group, where she works with global companies to develop social data intelligence strategies that support their business objectives. Susan has a diverse background in marketing and strategic planning within both corporations and agencies. She’s a frequent speaker on social media and analytics and has been extensively quoted in outlets including Fast Company, BBC, New York Times and The Wall Street Journal. Find her on Twitter at @setlinger and at her blog, Thought Experiments, at susanetlinger.com.
Shelley Evenson is currently Fjord’s Executive Director of Organizational Evolution, leading Fjord’s initiatives to grow talent and advance innovative knowledge sharing practices across the company. She is also the co-founder of the International Service Design Network and works with Fjord’s global teams to deepen their skills and expertise by bringing groundbreaking service design practices to client projects. Previously, Shelley was a Research Manager in Design and User Experience at Facebook and a Principal User Experience Designer and Manager for Microsoft. She was also an Associate Professor in the School of Design at Carnegie Mellon University.
A contributor to several books and articles on service, interaction and design strategy, Evenson is a frequent speaker at events including the AIGA’s Design Conference, IA Summit, IIT’s Design Research...
Bob Filbin is Chief Data Scientist at Crisis Text Line, the first large-scale 24/7 national crisis line for teens on the medium they use and trust most: texting. Bob specializes in the application of behavioral psychology to questions of data collection, analysis, and reporting, to make sure data leads to good behavioral change. Bob has given lectures on using data to drive behavioral change at places including MIT, the University of Pennsylvania and the North American International Auto Show, and has authored several articles in the Harvard Business Review on data. He runs in Prospect Park.
Jake joined Accel in 2012. He has over a decade of experience building innovative software products. He focuses on early stage investments in next generation infrastructure and data-driven services and is part of the team responsible for Accel’s Big Data Fund. Jake currently sits on the board of Origami Logic, provider of a visual Big Data analytics platform for marketers, Trifacta, creator of radical productivity software for data preparation and analysis, and Sumo Logic, a cloud-based log management and analytics solution. Prior to Accel, Jake was Director of Product Management at Splunk where he was responsible for Splunk’s user interface and Big Data strategy. Previously, he worked at Cloudera where he helped the founding team tackle a broad array of sales, marketing, and product issues....
Neal Ford is Software Architect and Meme Wrangler at ThoughtWorks, a global IT consultancy with an exclusive focus on end-to-end software development and delivery. He is also the designer and developer of applications, instructional materials, magazine articles, courseware, video/DVD presentations, and author and/or editor of 6 books spanning a variety of technologies, including the most recent The Productive Programmer. He focuses on designing and building large-scale enterprise applications. He is also an internationally acclaimed speaker, speaking at over 100 developer conferences worldwide, delivering more than 600 talks. Check out his web site at www.nealford.com. He welcomes feedback and can be reached at firstname.lastname@example.org.
John Foreman is the Chief Data Scientist for MailChimp.com where he leads MailChimp’s data product development effort called the Email Genome Project. He also runs the Data Science for Managers course at Analytics Made Skeezy.
John holds a graduate degree in Operations Research from MIT and has worked as an analytics consultant for the Department of Defense, Coca-Cola, Royal Caribbean International, and Intercontinental Hotels Group. His expertise is in optimization modeling, revenue management, and predictive modeling.
Bill Franks is Chief Analytics Officer for Teradata, providing insight on trends in the analytics & big data space and helping clients understand how Teradata and its analytic partners can support their efforts.
In addition, Bill is a faculty member of the International Institute for Analytics and the author of the book Taming The Big Data Tidal Wave (John Wiley & Sons, Inc., April, 2012). He is also an active speaker and blogger.
Bill’s focus has always been to help translate complex analytics into terms that business users can understand and to then help an organization implement the results effectively within their processes. His work has spanned clients in a variety of industries for companies ranging in size from Fortune 100 companies to small non-profit...
Eric Frenkiel co-founded MemSQL and has served as CEO since inception. Before MemSQL, Eric worked at Facebook on partnership development. He has worked in various engineering and sales engineering capacities at both consumer and enterprise startups. Eric is a graduate of Stanford University’s School of Engineering. In 2011 and 2012, Eric was named to Forbes’ 30 under 30 list of technology innovators.
Ben Fry received his doctoral degree from the Aesthetics + Computation Group at the MIT Media Laboratory, where his research focused on combining fields such as computer science, statistics, graphic design, and data visualization as a means for understanding information. After completing his dissertation in 2004, he spent time developing tools for visualization of genetic data as a postdoc with Eric Lander at the Eli & Edythe L. Broad Institute of MIT & Harvard. During the 2006-2007 school year, Ben was the Nierenberg Chair of Design for the Carnegie Mellon School of Design.
He is the author of Visualizing Data (O’Reilly, 2007) and the co-author, with Casey Reas, of Processing: A Programming Handbook for Visual Designers and Artists (MIT Press, 2007) and... Read More.
To be added later
Before joining Canaan, Ross was a partner at seed-stage technology investment firm Kapor Capital, where he led investments across consumer, enterprise, and health technology. He currently serves as an advisor to Kapor Capital, Palantir Technologies, Facebook Causes, and other early stage technology companies.
Previously, Ross was a successful entrepreneur who co-founded and grew CubeTree, a Gartner Visionary enterprise social collaboration company which is used by the Fortune 100 including SAP, Intuit, and Houghton Mifflin Harcourt. CubeTree was acquired by SuccessFactors (NASDAQ:SFSF) in 2010 where he then served as a vice president.
Prior to that, Ross was... Read More.
Adam Fuchs is the Chief Technology Officer and co-founder of Sqrrl. Previously at the National Security Agency, Adam was an innovator and technical director for several database projects, handling some of the world’s largest and most diverse data sets. He is a co-founder of the Apache Accumulo project. Adam has a BS in Computer Science from the University of Washington and has completed extensive graduate-level course work at the University of Maryland. In his spare time, Adam enjoys racing sailboats, trail running, and getting lost in the woods.
Tim Garnto serves as Senior Vice President of Product Engineering at edo. Garnto is responsible for partnership integrations, product engineering, code promotion, and business intelligence platforms. Tim oversaw edo’s transition of its BI platform from a traditional relational database to using Cloudera Hadoop and Pentaho to coordinate data movement and jobs as well as to provide reporting. The edo product engineering team migrated all BI and analytic operations into Hadoop, realizing a drastic increase in scale/capacity while cutting processing times significantly.
Prior to joining edo, Tim set strategic direction for the development and delivery of marketing database products, first as Chief Technology Officer for SmartDM Holdings, Inc. and then as Director of Product Management for Acxiom Corporation. Tim also brings experience in business planning and product strategy from... Read More.
Co-founder of Prior Knowledge, now working on predictive analytics at Salesforce.
Front-end developer and HTML5 fan. Currently working as a Developer Programs Engineer with Knowledge team at Google. Deeply involved in Structured Data, Google Knowledge Graph and Custom Search.
Alan is a co-founder at Hortonworks and an original member of the engineering team that took Pig from a Yahoo! Labs research project to a successful Apache open source project. Alan also designed HCatalog and guided its adoption as an Apache Incubator project. Alan has a BS in Mathematics from Oregon State University and an MA in Theology from Fuller Theological Seminary. He is also the author of Programming Pig, a book from O’Reilly Press.
Matt Gee has 8 years of experience leading analytics teams in the social sector. At the US Treasury he led the team tasked with rebuilding the analytics platform for mortgage lending by big banks in the wake of the 2008 crisis. He was the founding director of the Center for Impact Measurement, and is a founding organizer of the Eric and Wendy Schmidt Data Science for Social Good Summer Fellowship. He has organized and led analytics projects with government and nonprofit partners, including the Environmental Defense Fund, the Federal Trade Commission, the State of Illinois, the Cities of Chicago, San Francisco, and Memphis, and the Department of Energy. His work developing analytics training for nonprofits was recently awarded a grant from the MacArthur Foundation. He... Read More.
A software/systems engineer with a lot of experience building big, real-world systems.
Rayid Ghani is a Research Director and Senior Fellow at the Computation Institute and the Harris School of Public Policy at the University of Chicago. He is also the co-founder of Edgeflip, an analytics and social media startup that is focused on helping non-profits, advocacy groups, and charities do better fundraising, volunteer recruiting, outreach and advocacy. Previously, Rayid was the Chief Scientist for the Obama 2012 Election Campaign focusing on analytics, data, and technology.
Rayid is currently focused on using data, analytics (and other related buzzwords) for social causes, both with Edgeflip and the University of Chicago. Rayid created and runs the Eric & Wendy Schmidt “Data Science for Social Good” Summer Fellowship which brings together aspiring data scientists to work on data science... Read More.
Ali Ghodsi is an Assistant Professor at KTH/Royal Institute of Technology in Sweden and a Visiting Researcher at UC Berkeley since 2009. His general interests are in the broader areas of distributed systems and networking. He received his PhD in 2006 from KTH/Royal Institute of Technology in the area of Distributed Computing.
Sanjay Goil heads product management for the HAVEn big data platform at HP. He also manages product strategy for the big data IDOL platform for text and rich media search-based analysis. Prior to HP, Sanjay built products in big data and high performance computing at Intel, Sun, and Bell Labs. Sanjay has an MBA from Berkeley and a PhD from Northwestern.
Abe Gong is a data scientist at Jawbone, where he builds data products to support the UP fitness tracker.
Joseph is currently a postdoc in the AMPLab at UC Berkeley and co-founder of GraphLab Inc. Joseph received his PhD from the Machine Learning Department at Carnegie Mellon University where he worked with Carlos Guestrin on parallel algorithms and abstractions for scalable probabilistic machine learning. He is a recipient of the AT&T Labs Graduate Fellowship and the NSF Graduate Research Fellowship.
Parag leads the Data Platform strategy at General Electric – Software Center and is based in the San Francisco Bay area. Formerly he was leading Big Data, Remote Services (RM&D) and technology strategy at GE Power & Water. He has over 17 years of experience in the software & manufacturing industries. Over this time Parag has held various SW engineering, IT and management positions driving integrated solutions, design & architecture, Big Data & Cloud strategy, Software Development, Agile, Digital Media & Mobile technologies and NPI. Parag holds a US patent on a wearable personal audio/video device. In 2012, his Remote Services (RM&D) platform received CIO magazine’s top 100 award. He is accredited with CDAC – Center for Development of Advanced Computing. He holds an... Read More.
Vitaly Gordon is a senior data scientist on the LinkedIn Product Data Science team where he develops data products that most of you use every day. Prior to LinkedIn, Vitaly founded the data science team at LivePerson and worked in the elite 8200 unit, leading a team of researchers in developing algorithms to fight terrorism. His contributions have been recognized through a number of awards including the “Life Source” award, given each year for the work deemed most high-impact in saving lives. Vitaly holds a B.Sc in Computer Science and an MBA from the Israeli Institute of Technology.
Brian Granger is an Assistant Professor of Physics at Cal Poly State University in San Luis Obispo, CA. He has a background in theoretical atomic, molecular and optical physics, with a Ph.D from the University of Colorado. His current research interests include quantum computing, parallel and distributed computing and interactive computing environments for scientific and technical computing. He is a core developer of the IPython project and is an active contributor to a number of other open source projects focused on scientific computing in Python. He is @ellisonbg on Twitter and GitHub.
Dr. Gray obtained degrees in Applied Mathematics and Computer Science from Berkeley and a PhD in Computer Science from Carnegie Mellon, and is an Associate Professor at Georgia Tech and CTO of Skytree, Inc. His research focuses on scaling up all of the major practical methods of machine learning (ML) to massive datasets. He began working on this problem at NASA in 1993 (long before the current fashionable talk of “big data”). His large-scale algorithms helped enable the Top Scientific Breakthrough of 2003, and have won a number of research awards. He served on the National Academy of Sciences Committee on the Analysis of Massive Data and frequently gives invited tutorial lectures on massive-scale ML at top research conferences and agencies.
Mike is a Senior Analyst at Forrester Research. He researches blazing fast Web architectures (best practices, caching, browsers, Java, .NET, and platforms); distributed caching technologies; complex event processing (CEP) platforms; business rules platforms; user experience (UX) design; and the future of application development. Mike is also a leading expert on the intersection of user experience, application development, and architecture.
Carlos Guestrin is the Amazon Professor of Machine Learning at the Computer Science & Engineering Department of the University of Washington. He is also a co-founder and CEO of GraphLab Inc., focusing on large-scale machine learning and graph analytics. His previous positions include the Finmeccanica Associate Professor at Carnegie Mellon University and senior researcher at the Intel Research Lab in Berkeley. Carlos received his PhD and Master’s from Stanford University, and a Mechatronics Engineer degree from the University of Sao Paulo, Brazil. Carlos’ work has been recognized by awards at a number of conferences and two journals: KDD 2007 and 2010, IPSN 2005 and 2006, VLDB 2004, NIPS 2003 and 2007,... Read More.
Dr. Gustafson leads the Knowledge Discovery Lab at the General Electric Global Research Center in Niskayuna, New York. The Knowledge Discovery Lab is focused on large-scale data, semantics, ontologies and text mining, and pattern search and discovery.
As a former member of the Machine Learning Lab and Computational Intelligence Lab, he develops and applies advanced AI and machine learning algorithms for complex problem solving.
He received his PhD in computer science from the University of Nottingham, UK, where he was a research fellow in the Automated Scheduling, Optimisation and Planning Research Group. He received his BS and MS in computer science from Kansas State University, where he was a research assistant in the Knowledge Discovery in Databases Laboratory.
Dr. Gustafson is a member of several... Read More.
Rachel Haines is the Data Governance Practice Lead at EMC Corporation. Her recent efforts have focused on the creation of EMC’s proprietary Data Governance delivery models and methodologies for the mentoring of clients in the design and implementation of enterprise-wide Data Governance Programs across various industries. Ms. Haines has more than 28 years of experience in end-to-end software development lifecycle and systems design and implementation in roles ranging from project manager and business analyst to software engineer.
Ben Hamner is the Director of Engineering at Kaggle. He has worked with machine learning problems in a variety of different domains, including natural language processing, computer vision, web classification, and neuroscience. Prior to joining Kaggle, he applied machine learning to improve brain-computer interfaces as a Whitaker Fellow at the École Polytechnique Fédérale de Lausanne in Lausanne, Switzerland. He graduated with a BSE in Biomedical Engineering, Electrical Engineering, and Math from Duke University.
Mr. Hannon has worked in the field of software development as both an architect and developer for more than 15 years, with a focus on workflow, integration and distributed systems. He is currently a senior software architect at SoftLayer on the product innovation team. He has a passion for leveraging open source solutions to bring real value to the Enterprise space, and has implemented open source solutions with many companies across the globe. Mr. Hannon is also active in mobile application development, with multiple published applications.
Edith is CEO & co-founder of LaunchDark.ly, a lean product development tool. Edith was a Product Director at TripIt and was a founding team member of TripIt for Teams. Edith has more than 10 years of experience with both consumer and enterprise startups. She is co-moderator of the lean startup list. From her work on deployment, she has two patents. Edith received a BS in Engineering from Harvey Mudd College and a degree in Economics from Pomona College.
Chris Harland is a Data Scientist at Microsoft working on problems in Bing search, Windows, and MSN. He holds a PhD in Physics from the University of Oregon and has worked in a wide variety of fields spanning elementary science education, cutting edge biophysical research, and recommendation/personalization engines.
Chris came to Microsoft and data science by way of the University of Chicago where, after completing a post-doc, he founded a data science consulting start-up where he gained a large array of data science skills. Chris enjoys all things data science from blogging and tutorials to operational models that impact product users every day.
Jeffrey Heer is a co-founder and CXO (Chief Experience Officer) at Trifacta Inc. Jeff spent the last many years as a professor of Computer Science at Stanford University, where he led the Stanford Visualization Group. His group has created a number of popular tools, including D3.js (Data-Driven Documents) and Data Wrangler. In Fall 2013, Jeff joined the faculty of Computer Science & Engineering at the University of Washington. In 2009 Jeff was named to MIT Technology Review’s TR35; in 2012 he was named a Sloan Foundation Research Fellow. He holds BS, MS and PhD degrees in Computer Science from the University of California, Berkeley.
Joseph M. Hellerstein is co-founder and CEO of Trifacta, and a Chancellor’s Professor of Computer Science at the University of California, Berkeley. His work focuses on data-centric systems and the way they drive computing. He is an ACM Fellow, an Alfred P. Sloan Research Fellow and the recipient of three ACM-SIGMOD “Test of Time” awards for his research. In 2010, Fortune Magazine included him in their list of 50 smartest people in technology, and MIT’s Technology Review magazine included his Bloom language for cloud computing on their TR10 list of the 10 technologies “most likely to change our world”.
Felienne is a professor and an entrepreneur in the field of spreadsheets. Her PhD thesis centers around techniques to extract information from spreadsheets and present it in a visual way, to support users in improving and understanding them. In 2010 Felienne founded Infotron, a startup that uses the algorithms developed during the PhD project to analyze spreadsheet quality for large companies. In her spare time, Felienne volunteers as a judge for the First Lego League, a worldwide technology competition for kids.
Combining advanced analytics with business acumen, Cameran leads the Socialcast by VMware Data Science team. Cameran is responsible for leading all data efforts on the Socialcast product, using big data to inform and create new products as well as mining customers’ community data to help them optimize their investment. A veteran of the business intelligence community, Cameran was previously with Disney Parks and Resorts where she drove significant revenue growth through pricing and product optimization. She holds a degree in Economics/Mathematics from UC Santa Barbara and graduated at the top of her MBA class at UC Irvine.
Steven Hillion is the Chief Product Officer of Alpine Data Labs. He has been leading large engineering and analytics projects for fifteen years. Before joining Alpine, he founded the analytics group at Greenplum, leading a team of Data Scientists and also designing and developing new open-source and enterprise analytics software. Before that, he was vice president of engineering at M-Factor, Inc. (acquired by DemandTec) where he built analytical applications that became a global standard for demand modeling at companies like Coca-Cola. Earlier, at Kana Communications, Mr. Hillion led the engineering group during the two largest releases of its flagship product, which remains the engine for email support at companies like eBay and Staples. At Scopus Technology (later Siebel Systems) he co-founded development groups for Finance,... Read More.
Ben Hindman is one of the creators of Apache Mesos, a platform for building and running resource-efficient distributed systems at scale. Ben started working on Mesos as a PhD student at Berkeley before he brought it to Twitter where it runs on thousands of machines. An academic at heart, his research in programming languages and distributed systems has been published in leading academic conferences.
Felipe Hoffa joined Google in 2011 as a Software Engineer. As a member of the Google Cloud Platform team, he works with external developers to build applications on Google’s big data platforms.
Hacker & Inventor.
Mr. Hunt currently serves as the Chief Technology Officer for the Chief Information Officer in CIA. In this capacity he is responsible for setting the strategic technology direction to enable CIA’s missions, actively engaging across the IC to share and communicate IT solutions, and driving solutions for the rapid insertion and adoption of new capabilities to keep pace with technology change in the commercial sector.
Previously, Mr. Hunt served as the Director of Applications Services for CIA. In this role, he was responsible for building IT systems to support and enable CIA’s mission to effectively conduct their business and to set the vision and direction for applications development in CIA. He drove the investment and development process to build core and common... Read More.
Bryan Hurd is the Director of Advanced Analytics for the Microsoft Cybercrime Center in Redmond, Washington. He heads a team of the world’s leading experts in Cyber Forensics, Cyber Threat Intelligence, Online Piracy and other efforts to fight global scale cybercrime. Working with Microsoft partners in public sector and private industry, the team uses massive data and advanced technologies to focus investigations and operations around the world.
Bryan Hurd has been building and leading teams in both the public and private sectors, focused on large data problems related to counter-terrorism and cybercrime. He is considered a national thought leader in advanced analytics and innovation in data visualization. He has supported countless local, national and international computer crime investigations and has led cybercrime teams, digital... Read More.
Ian is a data scientist for Pivotal and works on a wide range of customer projects from fraud detection to transport and logistics.
Ian has a background in numerical analysis and simulation and his expertise includes high performance computing for scientific applications, perturbative analysis of large systems of differential equations and the differential geometry underlying relativistic physics.
He completed a PhD in theoretical cosmology at Queen Mary, University of London and received an MSc from Imperial College London in theoretical physics. Ian’s work has been published in leading international physics journals and he has released the Python numerical package used in his research to the community.
Nandu Jayakumar has been working with Big Data for over a decade now. At Yahoo, he is currently building data applications to improve user engagement, and is designing large scale advertising systems. He is also an active contributor to Shark.
His background is in databases and distributed systems. As a senior leader of Yahoo’s well regarded data team he has built key pieces of Yahoo’s data processing tools and platforms through their several iterations.
Nandu holds a Bachelor’s degree in Electronics Engineering from Bangalore University and a Master’s degree in Computer Science from Stanford University.
Alexander Kagoshima received an M.S. in Economics and Engineering from TU Berlin in 2012. In graduate school his focus was on machine learning and statistics. In his bachelor thesis he worked on applying Gaussian Processes to currency exchange rates. For his master thesis, Alex developed and evaluated a change-point detection algorithm that operates on wind data, to enable a new kind of intelligent wind-turbine control system.
He gained practical experience in the application of machine learning methods as a working student at Volkswagen, where his task was to analyze data from a test fleet of fuel-cell cars. Since December 2012 he has worked as a Data Scientist at Greenplum (now Pivotal) as the first Data Scientist in the EMEA team. In his spare time, he tries... Read More.
Co-founder and Chief Scientist of Marinexplore leading engineering and research efforts to create the ocean’s big data platform.
Previously founded the Data Research Team of Skype, applying data analytics, predictive methods and system optimizations to improve the products. At Skype, he built 3 teams, one of which was responsible for initiating and delivering a number of engineering toolset improvements spanning the entire organization.
Prior to Skype, he founded & built up a Quality Assurance consultancy and developed Electronic Ticketing Services of Tallinn. He was also involved in developing the early versions of Eionet systems for European Environmental Agency.
Direct product analytics function at AutoTrader.com, overseeing testing of new products and measurement of consumer activity on the core website and mobile properties. Lead team of 6 analysts skilled in web analytics, data mining and predictive modeling. Work closely with IT and Business Intelligence to put in place solutions to enable robust analytics. Built web analytics function at Delta Air Lines and led analytics team in support of Direct Marketing and Customer Relationship Management (CRM) programs.
Specialties: SAS/SQL programming, data visualization, web analytics, project management, marketing measurement, writing/editing
MBA Georgia State University; BA Williams College;
Statistics Advisory Board member, Kennesaw State University
Jeffrey F. Kelly is a Principal Research Contributor at The Wikibon Project, an open source research and advisory firm based in Boston. His research focus is the business impact of Big Data and the emerging Data Economy. Mr. Kelly’s research has been quoted and referenced by the Wall Street Journal, the Financial Times, Forbes, CIO.com, IDG News, TechTarget and more. Reach him by email at email@example.com or Twitter at @jeffreyfkelly.
Paul Kent is Vice President of Big Data initiatives at SAS. He spends his time between Customers, Partners and the Research & Development teams discussing, evangelizing and developing software at the confluence of big data and high performance computing.
Nick is the Director of Data Science at Rackspace. He leads their data visualization and machine learning teams focused on data products. In previous dimensions, he was the lead data scientist at one of the world’s largest global management consulting firms, where a portion of his time was spent helping design and structure the company’s strategy around building data science departments; the other portion was spent starting and incubating data science inside Fortune 500 companies. He got his start designing hardware devices for voice controlled medical beds and then became more interested in intelligent non-living things. Later, he designed and implemented scalable backend systems for predictive modeling products as well as a few production recommender systems for large retailers. Nick holds a BS in Statistics... Read More.
Andy Konwinski is a postdoc in the AMPLab at UC Berkeley focused on large scale distributed computing and cluster scheduling. He co-created and is a committer on the Apache Mesos project that has been adopted by Twitter as their private cloud platform. He also worked with systems engineers and researchers at Google on Omega, their next generation cluster scheduling system. More recently, he led the AMP Camp Big Data Bootcamp and has been contributing to the Spark project.
Marcel Kornacker is a tech lead at Cloudera for new product development and creator of the Cloudera Impala project. Following his graduation in 2000 with a PhD in databases from UC Berkeley, he held engineering positions at several database-related start-up companies. Marcel joined Google in 2003 where he worked on several ads serving and storage infrastructure projects, then became tech lead for the distributed query engine component of Google’s F1 project.
Justin is CEO of Zoomdata, Inc. Prior to Zoomdata, Justin was the co-founder of Clarabridge and the inventor of Clarabridge’s award-winning, patented, text analytics software.
Prior to Clarabridge, Justin co-founded and was CTO of Claraview, a BI strategy and technology consultancy, which was sold to Teradata in 2008. Before founding Claraview, Justin served as founder and CTO of Strategy.com, a real-time data analysis and alerting subsidiary of MicroStrategy.
Prior to launching Strategy.com, he was a technology program manager and consultant at MicroStrategy, designing the second-generation web-based BI tool for MicroStrategy, and working with large customers on their BI deployments.
In the early 1990s, Justin was active in the BBS community, and he authored and marketed the EIS-PC BBS system.... Read More.
Scott Lee is an Information Strategist at EMC Corporation. Working in the intersecting fields of data asset management and enterprise information management (EIM), Mr. Lee works with clients worldwide on their most challenging data monetization, advanced analytics, and information value chain projects. His career spans 17+ years of consulting experience, with a focus in ensuring information value, availability, and quality in challenging climates such as post-merger integration and global IT / process change programs.
Flo was an early engineer at Twitter where he helped build critical infrastructure for doing analytics and search. He was primarily responsible for Twitter’s user search product. After a few years at Twitter, Flo joined Airbnb and built out the data infrastructure team. Flo is an early leader in the Apache Hadoop community and has extensive experience using it at scale. He has published a number of patents and papers in the area of information retrieval and large-scale distributed systems. Flo was also the driver and main author of Chronos, a fault tolerant job dependency scheduler and job orchestration framework built on top of Mesos.
Playing with computers since he was young, Tom eventually developed back and wrist pain, so he started studying ergonomics and conducting quantitative ergonomics research. At some point, people started calling him a data scientist. And his back and wrists now hurt less. He has recently been playing music (http://csvsoundsystem.com) and studying open data (http://thomaslevine.com/open-data).
Haoyuan Li is a Computer Science Ph.D. candidate in AMPLab at UC Berkeley, and he works with Prof. Scott Shenker and Prof. Ion Stoica on big data and cloud computing. He leads Tachyon, an open source memory-centric distributed file system enabling reliable file sharing at memory-speed across cluster frameworks. He is a founding committer of Apache Spark and a co-creator of Spark Streaming. Before Berkeley, he worked at Conviva and Google, where he co-created the PFPGrowth algorithm, which is included in Apache Mahout. Haoyuan has an M.S. from Cornell University and a B.S. from Peking University, both in Computer Science.
Ben Lorica is the Chief Data Scientist at O’Reilly Media, Inc. He has applied Business Intelligence, Data Mining, Machine Learning and Statistical Analysis in a variety of settings including Direct Marketing, Consumer and Market Research, Targeted Advertising, Text Mining, and Financial Engineering. His background includes stints with an investment management company, internet startups, and financial services.
Roger Magoulas is the research director at O’Reilly Media and chair of Strata + Hadoop World conferences. Roger and his team build the analysis infrastructure, and provide analytic services and insights on technology adoption trends to business decision-makers at O’Reilly and beyond. They find what excites key innovators and use those insights to gather and analyze faint signals from various sources to make sense of what others may adopt and why.
Justin Makeig is a Director of Product Management at MarkLogic where he oversees the suite of applications, tools, and APIs built around MarkLogic Server. He manages the company’s Hadoop strategy along with front-end application development and administration tools. Justin has over 10 years of experience designing, developing, and bringing to market data-driven applications for start-ups and large organizations using web and Big Data technologies. He holds an MBA from the University of California, Berkeley.
Dean Malmgren is co-founder and managing partner of Datascope Analytics. As an author of several peer-reviewed publications on big data analytics and visualization, Dean is excited about bringing cutting-edge techniques out of research and into practice. When not teasing himself or others, Dean can be found swimming, cycling, or running for silly long distances. Dean received a BS in math and chemical engineering from the University of Michigan and a PhD in chemical engineering from Northwestern University.
Adam is Locu’s Director of Data. He recently completed his Ph.D. in Computer Science at MIT. His dissertation is on database systems and human computation. He is a recipient of the NSF and NDSEG fellowships, and has previously worked at ITA, Google, IBM, and FactSet. In his free time, he builds course content to get people excited about data and programming.
Mr. Mathur is the Chief Executive Officer of Silicon Valley Data Science, where he has brought together a team of world-class data scientists and engineers to help companies become more data driven. The SVDS team consists of experts in analytics, big data and data platforms. Previously, Mr. Mathur was SVP of Product Management for LiveOps. He was part of the executive staff and was responsible for LiveOps’ overall product strategy and roadmap. Mr. Mathur and his team designed and deployed social, mobile, multichannel and analytic applications into the LiveOps Cloud Platform. Prior to LiveOps, Mr. Mathur was a Partner in Accenture’s R&D organization, Accenture Technology Labs. He led a global team that delivered market-ready business solutions built using emerging technologies to Accenture’s clients. He... Read More.
LP Maurice is the CEO & co-founder of Busbud, a venture-backed e-commerce startup focused on intercity bus travel. Previously, LP worked in Silicon Valley at Yahoo, LinkedIn and Radar Networks (acquired) in product management, marketing and business development. LP holds an MBA from Harvard University, a Master’s degree in Corporate Law from the University of Montreal and a Bachelor’s degree in Finance from HEC Montreal. LP is a 2014 NYC Venture Fellow. LP is also co-founder of entrepreneur group Entrepreneurs Anonymous, managing director of seed fund Interaction Ventures, founder of non-profit project GeoDonation and co-organizer of the Startup Open House Montreal. LP mentors at Founder Fuel, Startup Weekend and The Next 36. He recently spoke at TEDx and serves as VP... Read More.
Christy has 14 years of experience with technology marketing and 12 years at IBM. She was one of the founding members of IBM’s Institute for Business Value, which provides leading edge thought leadership and practical insights for business executives.
Prior to joining IBM, Ms. Maver worked as an insurance industry consultant at a small dot-com that was eventually acquired by IBM. She holds a BA in Economics from Princeton University. Ms. Maver is also a certified Bikram Yoga instructor who currently teaches at the Downtown Los Angeles studio.
Pat McDonough has been an avid proponent and user of open source since the days when the average corporation still thought it was a crazy idea to use it for legitimate business. For the last half decade, he’s been working at companies built around open source, helping their customers build platform solutions for applications, integration, and big data. Pat works at Databricks with the core team from the Apache Spark project, working every day to grow the Spark community and user base.
Patrick McFadin is regarded as one of the foremost experts on Apache Cassandra and data modeling techniques. As the Chief Evangelist for Apache Cassandra and consultant for DataStax, he has helped build some of the largest deployments in the world. Prior to DataStax, he was Chief Architect at Hobsons, an education services company. There, he spoke often on web application design and performance.
Innovating analytics and data visualization tools. Author of “Python for Data Analysis” from O’Reilly Media. Author of pandas library, contributor to statsmodels. Founder and CEO of DataPad.
David McRaney is a journalist who created the blog You Are Not So Smart where he began writing regularly about the psychology behind common biases, delusions, heuristics, and fallacies in 2009. That blog became an internationally bestselling book published by Penguin/Gotham in 2011, now available in 14 languages. His second book, You Are Now Less Dumb, was released in July of 2013, also published by Penguin/Gotham.
David currently hosts a podcast that is part of the Boing Boing podcasting family soon to be available on Sirius/XM radio and on Virgin Airlines, and he travels around the planet giving lectures on the topics he covers in his books, blog, and podcast.
David graduated with a degree in journalism from The University of Southern Mississippi where he... Read More.
As a senior software engineer at Metamarkets, Gian is responsible for the infrastructure behind its data ingestion pipelines. He comes to Metamarkets from Yahoo!, where he was responsible for its worldwide server deployment and configuration management platform. He holds a BS in Computer Science from the California Institute of Technology.
Geoffrey Moore is an author, speaker and business advisor to many of the leading companies in the high-tech sector, including Cisco, Cognizant, Compuware, HP, Microsoft, SAP, and Yahoo!.
Geoffrey divides his time between consulting on strategy and transformation challenges with senior executives and speaking internationally on those same topics. His latest book, Escape Velocity: Free Your Company’s Future from the Pull of the Past, keeps this intent in mind and is the result of his years of experience working with large enterprises.
Escape Velocity is Moore’s sixth book for business leaders in the high-tech sector. His first book, Crossing the Chasm, which addresses the challenges of gaining initial adoption for disruptive innovations, continues to be a best seller and required reading in business schools... Read More.
Narendra Mulani is the senior managing director—Accenture Analytics. In this role, he is responsible for driving Accenture’s strategic agenda for growth across business analytics – from issue to outcome, including data, analytics, insights and actions – and accountable for ensuring sales and revenue growth across all of the industries, business functions and geographies in which Accenture operates. Narendra is also the co-lead of The Accenture and MIT Alliance in Business Analytics, a unique partnership that combines Accenture’s industry and analytics expertise with MIT’s scientific and technological leadership to address clients’ business problems.
Narendra has held a series of leadership roles within Accenture since joining in 1997. Most recently, he was the managing director—Products North America where he was responsible for creating value for... Read More.
Rodney Mullen is widely considered the most influential skateboarder in the history of skateboarding. The majority of the ollie and flip tricks he invented throughout the 1980s, including the flatground ollie, the Kickflip, the Heelflip, and the 360 flip, are regularly done in modern vertical and street skateboarding.
Despite Alan Gelfand’s justifiable fame for inventing the ollie air (Gelfand’s maneuver being primarily a vert or pool oriented trick), Mullen is responsible for the invention and development of the street ollie. The ability to pop the board off of the ground and land back on the board while moving has quite likely been the most significant development in modern skateboarding. This invention alone would rank Mullen the most important skateboarder of all time.
John Rodney... Read More.
Scott Murray is a code artist who writes software to create data visualizations and other interactive phenomena. His work incorporates elements of interaction design, systems design, and generative art. Scott is an Assistant Professor of Design at the University of San Francisco, where he teaches data visualization and interaction design. He is a contributor to Processing, and is author of the forthcoming O’Reilly title “Interactive Data Visualization for the Web”. Scott earned an A.B. from Vassar College and an M.F.A. from the Dynamic Media Institute at the Massachusetts College of Art and Design. His work can be seen at alignedleft.com.
Sambavi Muthukrishnan is an Engineering Manager in the Analytics Infrastructure group at Facebook. She leads the development of batch analytic engines for one of the largest data warehouses in the world. Her team works on the core of Hive and Hadoop (Corona) – evolving the query engine, compute framework and storage to operate at scale; as well as on Giraph (graph analytics). Previously, she was an Engineering Manager at Microsoft in the SQL Server group where she worked on query processing.
Paco Nathan is the Chief Scientist at Mesosphere in SF, and a “player/coach” who’s led innovative Data teams building large-scale apps for the past decade. He is a recognized expert in distributed systems, machine learning, predictive modeling, and cloud computing. He received his BS Math Sciences and MS Computer Science degrees from Stanford, and has 25+ years experience in the tech industry ranging from Bell Labs to early-stage start-ups.
Paco is an evangelist for the Mesos and Cascading open source projects, and is also an O’Reilly author for “Enterprise Data Workflows with Cascading”.
Owen has been contributing to Apache Hadoop since before it was first called Hadoop. He was the first committer added to the project and has provided technical leadership on MapReduce and security. Using Hadoop, he set the world record in 2008 for sorting a terabyte of data in 3.5 minutes, and in 2009 he sorted a petabyte in 16.25 hours. In 2011, Owen co-founded Hortonworks, which commercially supports and trains users of the Hadoop ecosystem. Prior to Hortonworks, Owen worked on Yahoo! Search’s WebMap project, which built a graph of the known web. Once ported to Apache Hadoop, it became the single largest known Hadoop application.
Herain is the Director of product marketing for Microsoft’s Business Intelligence, Data Warehousing and Big Data solutions. In his prior roles at Microsoft Herain has led product marketing and product planning for SQL Server’s Business Intelligence and developer initiatives.
Prior to Microsoft, Herain held product marketing and solutions architect roles at BEA Systems (now Oracle), where he played a key role in launching BEA’s WebLogic Platform. Prior to BEA, Herain was an early employee of ‘The Theory Center’, a Boston-based e-commerce solutions startup that was later acquired by BEA.
Herain has a B.S. in Computer Engineering from Carnegie Mellon University and an M.B.A from Cornell University. He lives in Redmond where he enjoys spending time with his... Read More.
Matt Ocko has three decades of experience as a technology entrepreneur and VC, in the US and globally. His prior investments include Cotendo (AKAM), Zynga (ZNGA), Facebook (FB), XenSource (CTRX), UltraDNS (NSR), FlashSoft (SDNK), Fortinet (FTNT), Aggregate Knowledge (NSR), Virtuata (CSCO), DataMirror (IBM), Couchbase, Ayasdi, Kenshoo, D-Wave Systems, MetaMarkets, Uber, AngelList, and many others, including multiple additional acquisitions by Google, Facebook, Netapp, and other Fortune 1000 tech companies. Matt founded Da Vinci Systems, a pioneering e-mail software vendor with over 1 million users world-wide prior to its acquisition. He is an inventor on over 40 granted or in-process patents in areas as diverse as enterprise hardware and social games. He holds a degree in Physics from... Read More.
A leading expert on big data architecture and Hadoop, Stephen brings over 20 years of experience creating scalable, high-availability, data and applications solutions. A veteran of WalmartLabs, Sun and Yahoo!, Stephen leads data architecture and infrastructure.
Fernand is the in-house Data Scientist at Change.org, where he is responsible for designing scalable algorithms, data backends and experiments to solve business problems and improve key metrics.
Balaji is a founding member of the CloudPhysics team and currently runs its Engineering organization. Prior to this, Balaji was with VMware for 5 years, leading the platform management SDK team. Balaji is a regular speaker at VMworld, Technology Exchange, and various VMware User Group conferences.
Rahul Pathak runs the Amazon EMR and AWS Data Pipeline businesses for AWS. Amazon EMR is a web service for running frameworks like Hadoop, Spark, and Presto on managed clusters in the cloud. AWS Data Pipeline is a web service for orchestrating data flows between services and data stores. During his time at AWS, Rahul has focused on managed data and analytic services. Prior to EMR and Data Pipeline, he was the Principal Product Manager for Amazon Redshift, a fast, fully managed, petabyte-scale data warehouse service in the cloud. He has also worked on Amazon ElastiCache, Amazon RDS, and Amazon RDS Provisioned IOPS. Rahul has over fifteen years of experience in technology and has co-founded two... Read More.
Pamela Peele, Ph.D., is the Chief Analytics Officer of the UPMC Insurance Services Division. Dr. Peele brings 13 years of patient care experience along with 12 years of academic research experience to her position as the leader of health care analytics at the Health Plan. She is responsible for data analytic activities, economic modeling, predictive modeling, statistical analysis, and machine learning. Her work focuses on the application of economic and statistical models to improve the health and welfare of populations. Prior to joining the Health Plan in 2006, Dr. Peele was the Vice Chair of the Department of Health Policy & Management at the University Of Pittsburgh Graduate School Of Public Health. She currently holds faculty appointments in Health Policy & Management and in... Read More.
Dr. Srinath Perera is a Director of Research at WSO2 Inc., where he oversees the overall WSO2 platform architecture with the CTO. He is a co-founder of Apache Axis2, a member of the Apache Software Foundation, and a member of the Apache Web Service Project Management Committee. Srinath also serves as a research scientist at the Lanka Software Foundation and teaches as visiting faculty at the Department of Computer Science and Engineering, University of Moratuwa. He is a frequent technical speaker and author of many academic and technical publications. He has authored two books, “Hadoop Cookbook” and “Instant MapReduce Patterns.”
Fernando Pérez is a research scientist at UC Berkeley, working at the intersection of brain imaging and open tools for scientific computing. He created IPython while a PhD student in Physics at the University of Colorado in Boulder. Today, with all the hard work done by a talented team, he continues to lead IPython’s development as the interface between the humans at the keyboard and the bits in the machine.
He is a founding member of NumFOCUS, a PSF member, and received the 2012 Award for the Advancement of Free Software for IPython and contributions to
Ramona is the CEO and co-founder of Declara, a technology company focused on adult learning.
At the age of 22, Ramona Pierson was hit by a drunk driver while running and spent 18 months in a coma and 11 years blind. She relearned everything from breathing and seeing to walking and smiling. From this experience arose the drive to build companies where personalized learning and relearning were the center of all products.
She founded SynapticMash, a developer of educational solutions. It was acquired by Promethean World and Ramona became the Chief Science Officer leading its global expansion. In 2005, she created The Source, one of the first online social learning solutions for educators, students and parents. It was adopted throughout the Seattle Public School... Read More.
My personal Bio can be seen at
Jake Porway is a machine learning and technology enthusiast who loves nothing more than seeing good values in data. He is the founder and executive director of DataKind, an organization that brings together leading data scientists with high impact social organizations to better collect, analyze, and visualize data in the service of humanity. Jake was most recently the data scientist in the New York Times R&D lab and remains an active member of the data science community, bringing his technical experience from his past work with groups like NASA, DARPA, Google, and Bell Labs to bear on the social sector. Jake’s work has been featured in leading academic journals and conferences (PAMI, ICCV), the Guardian, the Stanford Social Innovation Review, and... Read More.
Rachel is a Data Scientist for Silicon Valley Data Science with a focus in Statistics and Communication. She has worked for American Express and TiVo and held the roles of Sr. Statistical Analyst and Analytics Product Manager. In her career, she has applied statistical techniques such as Statistical Process Control, Hierarchical and K-means Clustering, Principal Components, Logistic and Linear Regression, and Survival Analysis. At American Express, she used generalized additive models to optimize ROI for varying marketing campaigns. While at TiVo, she worked on projects benchmarking models for Ad Click Thru Rates, improving their Recommender system, analyzing patterns of user behavior, and building a monitoring system for measuring software responsiveness and detecting regressions. Rachel has a Master’s degree in Statistics, and a Bachelor’s in... Read More.
As the director of research at the Human Rights Data Analysis Group, Megan Price designs strategies and methods for statistical analysis of human rights data for projects in a variety of locations including Guatemala, Colombia, and Syria. Her work in Guatemala includes serving as the lead statistician, since 2009, on a project in which she analyzes documents from the National Police Archive; she has also contributed analyses submitted as evidence in two court cases in Guatemala. Her work in Syria includes serving as the lead statistician and author on two recent reports, commissioned by the Office of the United Nations High Commissioner of Human Rights (OHCHR), on documented deaths in that country.
Megan is a Research Fellow at the Carnegie Mellon University Center for... Read More.
Fascinated by the “craft” of software development, Eric Pugh has been heavily involved in the open source world as a developer, committer, and user for the past 5 years. He is an emeritus member of the Apache Software Foundation and lately has been mulling over how we move from the read/write web to the data web. In biotech, financial services and defense IT, he has helped European and American companies develop coherent strategies for embracing open source software. Eric became involved in Solr when he submitted the patch SOLR-284 for Parsing Rich Document types such as PDF and MS Office formats that became the single most popular patch as measured by votes! He co-authored Solr Enterprise Search Server.
Matt Quinn has been with TIBCO for 14 years. During this time he has had several worldwide roles. As CTO, Mr. Quinn works with all product groups to create a common, corporate-wide vision for all of TIBCO’s products and technologies; ensures interoperability between TIBCO’s various product families, as well as consistent architectural approaches across all groups; and provides overall leadership and coordination of TIBCO’s product plans and technology direction. Up until his new role as CTO, Mr. Quinn was responsible for the Composite Application Group (CAG). This group encompasses TIBCO’s SOA, BPM, Infrastructure, Monitoring and Management, Governance and User Experience technologies. This group is responsible end-to-end for the engineering, quality, delivery of product, product vision, and customer enablement.... Read More.
Dr. Radinsky is the CTO and cofounder of SalesPredict, a sales technology company, where she is pioneering artificial intelligence based, predictive analytics solutions that transform the way companies do business.
Dr. Kira Radinsky is one of the up-and-coming voices in the data science community, pioneering the field of Web Dynamics and Temporal Information Retrieval. Her work focuses on the intersection of predictive data mining and the construction of algorithms that leverage web-found information and external dynamics to predict future events. A graduate of the Technion-Israel Institute of Technology, Dr. Radinsky gained international recognition for her work there and at Microsoft Research, where she developed predictive algorithms that recognized the early warning signs of globally impactful events (e.g., riots or diseases).
In 2013, Kira... Read More.
Krishna Raj Raja is currently a member of the CloudPhysics founding team. Prior to this, he was with VMware for over 10 years, focusing specifically on virtualization core platform performance. Krishna has given several popular talks at VMware VMworld conferences.
Yann is the technical lead of the Twitter Observability group, where he is responsible for guiding the growth and scale of the Observability service for all of Twitter. Yann is a software engineer with over 10 years of experience building large-scale distributed monitoring systems, ranging from thousand-node wireless sensor network and control systems to large-scale software and service monitoring stacks.
Rich Raposa, Sr. Curriculum Developer at Hortonworks, has been an author and trainer for over 15 years, having published several programming books and travelled the country teaching software development at companies of all sizes. He joined Hortonworks in July of 2012 and has created their Hadoop 2.0 developer curriculum and certification exams.
Christopher (Chris) Re is an assistant professor in the Department of Computer Science at Stanford University. The goal of his work is to enable users and developers to build applications that more deeply understand and exploit data. Chris received his PhD from the University of Washington in Seattle under the supervision of Dan Suciu. For his PhD work in probabilistic data management, Chris received the SIGMOD 2010 Jim Gray Dissertation Award. Chris’s papers have received four best-paper or best-of-conference citations, including best paper in PODS 2012, best-of-conference in PODS 2010 twice, and one best-of-conference in ICDE 2009. Chris received an NSF CAREER Award in 2011 and an Alfred P. Sloan fellowship in 2013.
Ben is a software developer at Citus Data. Prior to working at Citus Data, Ben spent over 8 years at Amazon working on diverse areas such as personalization, search relevance, and distributed website rendering. His most recent 4 years at Amazon were spent working on AWS, where he was an early developer on CloudFront and a founding member of the Route 53 team. Ben holds a BMath in Computer Science from the University of Waterloo and has come a long way from his humble upbringing on the mean streets of Canada.
Data Scientist focused on international development and emerging markets. Has worked with foundations, governments, non-profits, and private firms large and small around the world to help them monitor, evaluate, and communicate their impact. Recognized DataKind “Data Hero.”
Jesse Robbins is a technology founder & innovator with a unique background as a firefighter and emergency manager. Robbins is widely recognized for transforming the way companies like Facebook & Yahoo manage complex internet systems, and for his work helping governments and humanitarian organizations embrace new technologies.
Prior to OnBeep, Robbins founded Chef, which makes software used by companies like Facebook, Google, and Yahoo to automate servers & infrastructure. He also founded the Velocity Conference and is credited with helping to start the DevOps movement.
Before founding Chef, Robbins served as Amazon’s “Master of Disaster” where he was responsible for website... Read More.
Monica is a data scientist with a passion for turning data into products, actionable insights, and meaningful stories. As the VP of Data for Jawbone, she focuses on developing data-driven products that promote a healthier lifestyle and on finding stories in the UP wristband data.
Prior to Jawbone, Monica was one of the early members of the LinkedIn data science team, where she developed and improved some of LinkedIn’s key data products for matching jobs to passive candidates, discovering people you may know, and recommending groups you may like.
Monica’s compelling data stories are often picked up by the mainstream press, including the Wall Street Journal, The Economist, NPR and CNN. Monica holds a Ph.D. in Computer Science from CMU, where... Read More.
Rob Rosen leads Big Data Go-to-Market for industry-leading analytics and data integration software supplier Pentaho. He has led Big Data initiatives and field technical teams for a number of software and infrastructure vendors, most recently Hadoop distributor MapR Technologies. Prior to MapR, Rob led a variety of pre- and post-sales teams, most notably for storage infrastructure leader NetApp, enterprise security solutions provider Check Point Software Technologies and Unix pioneer Sun Microsystems. He holds a B.S. in Electrical Engineering and Computer Science from the University of California, Berkeley.
Perry J. Samson is Arthur F. Thurnau Professor in the Department of Atmospheric, Oceanic and Space Sciences and in the Center for Entrepreneurship at the University of Michigan. Perry is the recipient of the College of Engineering Excellence in Teaching Award, 2009 Teaching Innovation Award and the 2010 Distinguished Professor of the Year in the State of Michigan. Professor Samson is an entrepreneur as co-founder of LectureTools (http://www.lecturetools.com) and the Weather Underground (http://www.wunderground.com).
Sriram Sankar is a Principal Staff Engineer at LinkedIn, where he is leading the development of our next-generation search infrastructure. Before that, he led Facebook’s search quality and ranking efforts for Graph Search. He previously worked at Google on search quality and ads infrastructure and held senior technical roles at VMware, WebGain, and Sun. He was a key contributor to Unicorn, the index powering Facebook’s Graph Search, and developed JavaCC, the leading parser generator for Java. He is a graduate of the Indian Institute of Technology in Kanpur.
John Santaferraro is the Vice President of Marketing for Actian Analytics Platform. Prior to joining Actian, Santaferraro was an independent industry analyst in the business intelligence and analytics market. Before that he developed and executed a vertical market strategy for Hewlett Packard’s business intelligence group, focusing on energy, communications, retail, healthcare and financial services. At Hewlett Packard, he was also instrumental in helping establish the new business intelligence business group with a combination of solutions, products, and consulting. In 2000, John founded a marketing and sales consulting company, Ferraro Consulting, providing business acceleration strategy for technology companies. Along with business intelligence executive positions in Compaq Computers and Tandem Computers, Santaferraro co-founded a venture-backed, data warehouse startup company, Virtual Integration Technology, that was later sold to... Read More.
Brett has over 20 years of industry experience including leadership roles at DuPont, Lockheed Martin, Exelon Nuclear and General Electric, where he was Global Sales Leader for the GE Energy T&D Products division. He is Six Sigma Black Belt certified and a graduate of the Naval Nuclear Power School. Brett holds a B.S. in Electrical Engineering from Widener University, an M.S. in Nuclear Engineering from Rensselaer Polytechnic Institute and an MBA in International Business from Georgia State University.
LumaSense Technologies is an innovator in sensor technology and has worked with the electrical industry for over 30 years, including the world’s leading power producers and energy transmitters. Customers include Southern Company, China State Grid, PG&E, MSETCL (India) and National Grid (U.K.).
LumaSense helps customers... Read More.
Mohit Sati is a Senior Manager, Data and Analytics at Ask.com with the Search Engine Marketing (SEM) group. Mohit is responsible for building solutions using InfiniDB and the Apache Hadoop ecosystem, including HDFS, Hive and Pig. Mohit has extensive experience in both SQL and NoSQL databases. In addition, Mohit’s team also provides solutions using MySQL and Redis. He holds an MBA in Marketing from Santa Clara University.
John Schitka is a Solution Marketing Manager on the SAP Big Data Solution Marketing team. His focus in the SAP Big Data arena is largely on Hadoop and SAP HANA smart data access capabilities. A graduate of McMaster University, he holds an MBA from the University of Windsor. He has worked in product marketing and product management in the high tech arena for a number of years, taught at a private college and has co-authored a number of published text books. He has a true love of technology and all that it has to offer the world.
Krista Schnell is a Thought Leadership Content Producer in Accenture’s Technology Labs and a co-author of the Accenture Technology Vision, which influences Accenture’s own technology investments and helps direct technology investment strategies at top-tier clients over a 3-5 year horizon.
Prior to joining the Technology Vision team, Krista worked with the Data Insights R&D group at the Accenture Technology Labs, where she specialized in researching and developing data visualization solutions. Krista has a BS in mechanical engineering from Stanford University.
John has served as MapR’s Chief Executive Officer and Chairman of the Board since founding the company in 2009. Prior to founding MapR, John held executive positions in a number of enterprise software companies with a focus on data, storage and business intelligence at both private and public companies including: CEO of Calista Technologies (now Microsoft), CEO of Rainfinity (now EMC), SVP of Products and Marketing at Brio Technologies (BRYO) and General Manager at Compuware (CPWR).
Baron is founder and CEO of VividCortex, the best way to see what your production MySQL servers are doing. He is the lead author of High Performance MySQL and a variety of open-source software.
Chris Selland is vice president of marketing for Vertica, an HP company. In this role, he leads global marketing for the HP Vertica Analytics platform.
Selland has more than 20 years of experience in online, search and inbound marketing programs. He has also led strategic alliance and corporate development initiatives for entrepreneurial, high-growth companies.
Selland is an established thought leader, speaker and author on customer strategy-related topics, including social media analytics and marketing, CRM, customer analytics, metrics and loyalty.
Earlier in his career, Selland was vice president of CRM and Internet Research at the Yankee Group, and later he founded Reservoir Partners, a customer strategy research firm that merged with Aberdeen Group. He is also a founding member of the Enterprise Irregulars.
He... Read More.
Before coming to Qubole, Shrikanth worked at Oracle for 11+ years. His last position at Oracle was as Director of Development in the BI team, where among other things he was one of the main contributors to the Oracle Exalytics effort and helped drive the product from conception to release. Before that he worked in the Database team in the SQL/DSS group, where he made significant contributions to many different portions of the Oracle stack, ranging from Partitioning, SQL Optimization and SQL/Parallel Execution all the way to the Indexing and Data layers.
Vin Sharma is responsible for ecosystem enablement and strategic marketing for Apache Hadoop at Intel. In his previous role, Vin helped enable enterprise adoption of Linux, KVM, and OpenStack on Intel Architecture, and represented Intel on the boards of the Open Virtualization Alliance and the OpenStack Foundation. Before joining Intel in 2011, Vin held various engineering and product management roles at HP for 15 years, helping develop and market enterprise software products based on Linux, Java, XML, and other open source software.
Loves open source software
Max Shron is a New York-based data strategist. He provides expertise and mentorship to organizations across a wide array of sizes and industry verticals. He spearheads complex projects and provides a critical look at how organizations use data. Max was previously the lead data scientist at OkCupid, where he did data work for the widely read OkTrends blog. His personal work has appeared worldwide, including in the New York Times, Chicago Tribune, Huffington Post and on WNYC. He holds a degree in Mathematics from the University of Chicago.
Patrick Shumate is currently busy developing and deploying the next generation of content delivery networks for Comcast Cable. Prior to Comcast, he provided security consulting to US Government Agencies and was the Senior Architect for the RSA Consumer Solutions division, bringing you such hits as Go ID (federated two-factor authentication) and the RSA eFraudNetwork.
Shawn works in developer relations at Google as part of the Knowledge team.
Noelle Sio has a background in mathematics, statistics, and data mining with an emphasis on digital media. She is currently a Senior Data Scientist at Pivotal. Her work has mainly focused on helping companies extend their analytical capabilities by exploring and modeling digital data, from enabling a digital media agency to hypertarget their online campaigns to discovering new insights into online conversion drivers for a large retail bank. Previously, she worked as a researcher at eHarmony and Fox Interactive Media, where she leveraged massive datasets up to the petabyte level for marketing optimization, fraud detection, and ad monetization products. Noelle holds an A.B. from Washington University in St. Louis in Applied Mathematics and Physical Anthropology and an M.S. in Applied Mathematics from Cal Poly Pomona.... Read More.
Peter Sirota is the General Manager of Amazon Elastic MapReduce, a managed Hadoop web service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data. Before starting Amazon Elastic MapReduce, Peter led AWS Platform teams responsible for billing, authentication, portal, and Amazon DevPay services. Peter holds a bachelor’s degree in computer science from Northeastern University.
Laurie Skelly is a Data Scientist at Chicago-based data consulting firm Datascope, and a curriculum developer and instructor for the Data Science bootcamp at Metis, Kaplan’s school for new economy skills training. Connecting a real-world problem to its ideal technical solution is an art and a science. At Datascope, Laurie has led and contributed to projects for clients ranging from international Fortune 50 giants to regional nonprofits, representing a broad collection of business sectors and verticals. For her PhD in social neuroscience from the University of Chicago, she employed machine learning algorithms to model the neural circuitry of emotion in incarcerated psychopathic offenders.
Dr. Marc A. Smith
Chief Social Scientist
Connected Action Consulting Group
Marc Smith is a sociologist specializing in the social organization of online communities and computer mediated interaction. He founded and managed the Community Technologies Group at Microsoft Research in Redmond, Washington and led the development of social media reporting and analysis tools for Telligent Systems. Smith leads the Connected Action consulting group and lives and works in Silicon Valley, California.
Smith is the co-editor with Peter Kollock of Communities in Cyberspace (Routledge), a collection of essays exploring the ways identity, interaction and social order develop in online groups. He is co-author, along with Ben Shneiderman and Derek Hansen, of a book Analyzing Social Media Networks... Read More.
Shannon Spanhake is the Deputy Innovation Officer for the City & County of San Francisco in the Office of Mayor Edwin M. Lee. She aims to drive economic development with innovation, achieve diversity and inclusion in tech and make government more responsive and efficient. Prior to this role, she was at a startup founded with her patented civic technology, for which she was recognized in “100 Women Innovating Science and Technology” by the Grace Hopper Foundation and was a semifinalist in the Buckminster Fuller Inventor competition. Additionally, she has worked in India, Peru, Mexico and other emerging economies to unleash the transformative power of innovation to solve complex problems.
Evan Sparks is a PhD Student in the Computer Science Division at UC Berkeley. His research focuses on the design and implementation of distributed systems for large scale data analysis and machine learning. Prior to Berkeley he spent several years in industry tackling large scale data problems as a Quantitative Financial Analyst at MDT Advisers and as a Product Engineer at Recorded Future. He holds a bachelor’s degree from Dartmouth College.
I do big data, social systems, and gameplay. I like wicked problems. Everything’s people with me: cognition, culture, and the intelligence processes of the global brain.
I used to be a field anthropologist. I’ve lived in five developing countries. Lately I’ve been working in game analytics.
Right now, I’m thinking a lot about the adoption of big data, and how to make that process effective.
Julie thinks in metaphors and finds beauty in the clear communication of ideas. She is particularly drawn to visual media as a way to understand and transmit information, and is co-author of Beautiful Visualization (O’Reilly 2010) and Designing Data Visualizations (O’Reilly 2012).
Jim Stogdill heads up O’Reilly’s Radar and Strata businesses. A lifelong technology practitioner, he’s finding this media thing ridiculously fun. In a previous life he traveled the world with the U.S. Navy. Unfortunately, from his vantage point it all looked like the inside of a submarine. He spends his free time hacking silver halides with decidedly low-tech gear. @jstogdill.
Ion Stoica is a Professor of Computer Science at UC Berkeley, where he does research on cloud computing and networked computer systems. Past work includes the Dynamic Packet State (DPS), Chord DHT, Internet Indirection Infrastructure (i3), declarative networks, replay-debugging, and multi-layer tracing in distributed systems. His current research includes resource management and scheduling for data centers, cluster computing frameworks, and network architectures. He is the recipient of a SIGCOMM Test of Time Award, the CoNEXT Rising Star Award, the PECASE Award, and the ACM doctoral dissertation award. Ion also co-founded Conviva, a startup to commercialize technologies for large scale video distribution.
Ronan Stokes is a Solutions Architect at Cloudera where he architects, designs, and builds Hadoop based big data solutions for Cloudera’s customers.
Albert Strasheim is a Systems Engineer at CloudFlare. His passion is big data processing and he’s excited to be working on log processing and analytics at CloudFlare. He has been using Go since its initial public release, but has also done a lot of work with Java, Python and C++. As a new resident of San Francisco, he’s looking forward to surfing up and down the California coast, which is really the reason he came to San Francisco in the first place.
Prior to CloudFlare, Albert studied at the University of Stellenbosch in South Africa, where he focused on Computer Science and Electronic Engineering. He did post-graduate research on speaker recognition systems and worked on them for Agnitio in Madrid, Spain. In South Africa, Albert... Read More.
Mike Stringer is co-founder and managing partner of Datascope Analytics, a consulting and design firm, where he has led or contributed to projects across a variety of industries for clients including Procter & Gamble, Thomson Reuters, and other leading companies. Mike is passionate about realizing the potential for data to be used as a resource to make a positive impact on business and society.
He also enjoys decidedly non data-oriented activities, including exploring the amazing food in Chicago, playing and listening to music, and generally making things from scratch. Mike received a BS in Engineering Physics from the University of Colorado and a PhD in physics from Northwestern University.
Drew Sullivan is a veteran journalist who has worked for almost a decade in Eastern Europe and Eurasia. He founded the Center for Investigative Reporting in Bosnia-Herzegovina in 2004 and served as director and editor. He co-founded and served as the first director of the Organized Crime and Corruption Reporting Program, a regional consortium of investigative centers, where he now serves as editor. He founded the Journalism Development Network, an innovative media development organization that uses technology to change investigative reporting worldwide. As a journalist, he led a team of reporters looking at corruption by the Bosnian prime minister, which led to his eventual indictment and resignation. His work has been awarded the Daniel Pearl Award (twice), the Online Journalism Award for investigative reporting (twice),... Read More.
Jagane Sundar has extensive big data, cloud, virtualization, and networking experience and joined WANdisco through its acquisition of AltoStor, a Hadoop-as-a-Service platform company. Before AltoStor, Jagane was founder and CEO of AltoScale, a Hadoop and HBase-as-a-Platform company acquired by VertiCloud. His experience with Hadoop began as Director of Hadoop Performance and Operability at Yahoo! Jagane has such accomplishments to his credit as the creation of Livebackup, an open source project for KVM VM backup, the development of a user mode TCP Stack for Precision I/O, the development of the NFS and PPP clients and parts of the TCP stack for JavaOS for Sun Microsystems, and the creation and sale of a 32-bit VxD based TCP Stack for Windows... Read More.
I received my PhD in 2012 from the University of Toronto working with Geoffrey Hinton. After a brief postdoc with Andrew Ng, I cofounded DNNResearch with Geoffrey Hinton and Alex Krizhevsky, which was acquired by Google from stealth mode. I am interested in all aspects of neural networks and their applications.
Ameet Talwalkar is an NSF post-doctoral fellow in the Computer Science Division at UC Berkeley. His work focuses on devising scalable machine learning algorithms, and more recently, on interdisciplinary approaches for connecting advances in machine learning to large-scale problems in science and technology. He obtained a bachelor’s degree from Yale University and a Ph.D. from the Courant Institute at New York University.
Tutti leads Trifacta’s user experience team, bringing her years of experience in consumer design to revolutionize the enterprise software space. She previously held creative leadership positions at various studios including Method and The Designory Interactive. Her human-centered design work spans the range of product vision / strategy through detailed execution for brands such as Google, Samsung, Disney, JP Morgan Chase, and Oracle.
Eddie serves as Vice-Chair in the City of Oakland’s Public Ethics Commission and is the co-founder and Director of Technology for OpenOakland, a non-profit that works closely with Oakland’s City Hall to promote civic innovation and open government.
Eddie is the CEO/Co-Founder of Civic Insight, a civic tech startup focused on making municipal property information available to the public in real-time. Eddie has 10 years of experience bringing innovative thinking to civic institutions. Previously, Eddie was a 2012 Code for America fellow and co-creator of BlightStatus for the City of New Orleans.
Before Code for America, Eddie built Digress.it, an online community and open-source project that allows for paragraph-by-paragraph commenting on complex texts. Digress.it is now used by universities, governments and libraries across the country.... Read More.
Wayne Thompson is the Chief Data Scientist at SAS. He is described as one of the early pioneers of business predictive analytics and is a globally renowned presenter, teacher, practitioner and innovator in the field of predictive analytics technology.
He has worked alongside the world’s biggest and most challenging organizations to help them harness analytics to build high performing organizations. Over the course of his 20-year tenure at SAS he has been credited with bringing to market landmark SAS analytics technologies (SAS Text Miner, SAS Credit Scoring for Enterprise Miner, SAS Model Manager, SAS Rapid Predictive Modeler, SAS Scoring Accelerator for Teradata, and SAS Analytics Accelerator for Teradata).
Current focus initiatives include easy to use self-service data... Read More.
Sebastian is co-founder and CEO of Udacity. He is also a Research Professor at Stanford University and a Google Fellow, as well as the inventor of the autonomous car and project lead on Google Glass. Sebastian has been named the 5th Most Creative Person in Business (Fast Company), among the 50 Smartest People in Tech (Fortune), and highlighted in 50 Best Inventions of 2010 (Time).
Pranay is a Senior Solutions Architect at Impetus Technologies. He has been a key part of the Impetus R&D labs working on several Big Data initiatives for the last two and a half years. He has fourteen years of experience in the software industry spanning different domains including Telecom, Networking and Big Data. He has been instrumental in providing solutions to complex problems for customers especially in the Big Data and data analytics space.
Kai Trepte is the lead software engineer for the Harvard Clean Energy Project. Kai was instrumental in translating the raw data, over 400TB of data on 2.3 million compounds, into an online data-store open to the world. Kai obtained a Masters in Logistics from MIT and was co-founder of John Galt Solutions, Inc., a supply chain management software provider with over 5,000 customers throughout the world. As the lead engineer on the Clean Energy Project Kai is applying his data warehousing and analytic skills to big-data in science. Kai will outline best practices and lessons learned from this big data project that will benefit mankind aiding the quest for clean energy solutions, bringing electricity to billions around the world, and improving their quality of... Read More.
Mark Troyer is Senior Operations Mechanic at Box, Inc. and has been focusing on analytics and visualization for the Technical Operations team. He holds a Bachelor of Science in Photography from Grand Valley State University but accidentally found a career in server operations working for internet providers and software companies, previously working for ANS, UUNet, Verizon and LeanLogistics.
Tim Tully is Distinguished Architect at Yahoo! and is an experienced big data expert. At Yahoo!, he has designed the Yahoo! Data technology platform, including data warehousing, aggregation, visualization, instrumentation, ETL and anything else involving analytics. Currently, he leads the architecture of multi-petabyte solutions at Yahoo! on Hadoop and other big data ecosystems, and is responsible for bringing Spark and Shark to Yahoo!. He is also a winner of the prestigious Yahoo! Individual Superstar award for 2011.
Daniel Tunkelang leads LinkedIn’s efforts around query understanding. Before that, he led LinkedIn’s product data science team. He previously led a local search quality team at Google and was a founding employee of Endeca (acquired by Oracle in 2011). He has written a textbook on faceted search, and is a recognized advocate of human-computer interaction and information retrieval (HCIR). He has spoken at three previous Strata conferences, and is on the editorial board of the Journal of Big Data. He has a PhD in Computer Science from CMU, as well as BS and MS degrees from MIT.
Milan Vaclavik is Sr. Director and Solution Lead for CenturyLink Technology Solutions’ big data solutions. For more than 20 years, he has been bringing innovative software solutions to market in a variety of industries including enterprise messaging and collaboration, digital rights management, document automation, supply chain management, and physical security. He has held senior product management, marketing and business development positions with startup software firms, as well as larger organizations such as Lotus Development/IBM, GE and LexisNexis. Milan holds a bachelor’s degree in Regional Science from the University of Pennsylvania and an MBA in Finance and Management of the Organization from Columbia Business School.
Jen van der Meer is the Founder and CEO of Reason Street, where she creates business models for social impact. A former Wall Street analyst and economist, Jen is a data doyen who masters the emerging edge of technological change. Throughout her career, Jen has practiced an approach that is equal parts data-driven and creative to understand and apply the opportunities for technology to transform the economy, society, and culture. Jen has held executive management roles at Organic, Frog Design, Dachis Group, and Luminary Labs. She is actively engaged in the local startup community in New York City, and is a vocal supporter of the open data movement. She is an Adjunct Professor at NYU ITP, and SVA’s Products of Design. Jen has... Read More.
Vinod Kumar Vavilapalli is the Hadoop YARN and MapReduce guy at Hortonworks. He is a long-term Hadoop contributor at Apache, a Hadoop committer and a member of the Apache Hadoop PMC. He has a bachelor’s degree in Computer Science and Engineering from the Indian Institute of Technology Roorkee.
He has been working on Hadoop for more than 5 years and he still has fun doing it. Straight out of college, he joined the Hadoop team at Yahoo! Bangalore where he worked on HadoopOnDemand, Hadoop-0.20, CapacityScheduler, and Hadoop security, before Hortonworks happened.
He is now neck deep in taking Hadoop’s computing platform to the next level and working on the next-generation platform YARN, which lets computing frameworks including and besides MapReduce all to be... Read More.
Shivaram Venkataraman is a second year PhD student at the University of California, Berkeley and works with Mike Franklin and Ion Stoica at the AMP Lab. His research interests are in the design of storage systems and analytics platforms for big-data applications. Before coming to Berkeley, he completed his M.S. at the University of Illinois, Urbana-Champaign.
Anand Venugopal has been instrumental in building the Big Data Analytics consulting services practice at Impetus Technologies over the last three years. With a diverse 19-year-long techno-business background in Telecom, Interactive entertainment and Hi-Tech industry verticals, AV, with the rest of the Impetus team, has been helping IT and line-of-business executives in large enterprises understand and extract the enormous value embedded in their static and “in-motion” Big-Data assets.
Dr. Ben Waber is the president and CEO of Sociometric Solutions, a management services firm that uses social-sensing technology to drive innovative transformation services. He is also a visiting scientist at the MIT Media Lab, and he was previously a senior researcher at Harvard Business School. He received his Ph.D. from MIT for his work with Alex “Sandy” Pentland in the Human Dynamics group at the Media Lab. Waber’s work has been featured in major media outlets such as Wired, The Economist, and NPR. He has consulted for industry leaders such as LG, McKinsey & Company, and Gartner on technology trends, social networks, and organizational design. His book People Analytics was released by the Financial Times Press in 2013.
Co-founder and President of Continuum Analytics. Interested in data analysis, scientific computing, and data visualization with Python. Author of the Bokeh web visualization library, and the Chaco interactive visualization toolkit. Extensive experience developing flexible, performant analysis apps and environments across multiple engineering and scientific domains, including finance and high-frequency trading.
Patrick Wendell is a Ph.D. student working in the U.C. Berkeley AMPLab. His research focus is on large scale data-intensive computing and his adviser is Ion Stoica. Before working on the BDAS stack at Berkeley, he contributed to several Hadoop projects, mostly while working at Cloudera. He holds a B.S. in Computer Science from Princeton University.
Michael Wendt is an R&D Associate Manager at Accenture Technology Labs in San Jose, CA. Since joining Accenture Technology Labs, Michael has worked with Hadoop, Cassandra, Storm and other Big Data technologies. His research work includes benchmarking bare-metal and cloud-based Hadoop clusters, comparing their price-performance ratio. In addition to his research work on Hadoop, he has advised and helped clients to deploy Hadoop systems and contributed to the design and development of a real-time stream processing platform consisting of Storm and Cassandra. Michael has a BS in Computer Engineering from University of Maryland: College Park.
Ben Werther is the Founder & CEO of Platfora. He founded the company in 2011 to realize his vision of how Hadoop and Big Data Analytics will transform the way every business user uses data and move beyond the fiction, feeling and faith that underlie most business decisions.
Under Werther’s direction, Platfora has grown from an idea sketched on a napkin to one of the hottest enterprise startups in Silicon Valley and a leader of the Big Data Analytics category. Platfora’s mission is to empower customers to leverage Big Data Analytics to transform their businesses into Fact-Based Enterprises. Designed for business users, the company’s product is the first visual self-service platform for interactively and iteratively interrogating enormous amounts of data, and masking the complexity... Read More.
Dr. Chris White joined DARPA as a program manager in August 2011. His focus is on developing the enabling technology required for efficiently processing, analyzing and visualizing large volumes of data in a military, mission-oriented context.
Dr. White previously served DARPA as its country lead for Afghanistan and in-theater member of the Senior Executive Service supporting the commander of the NATO International Security Assistance Force, the Combined Joint Staff branch for Intelligence, the Afghan Threat Finance Cell and the regional military commands.
Prior to joining DARPA as government staff, Dr. White was a researcher in DARPA’s Information Innovation Office where he created techniques to better understand, measure and model social media and large networks of information.
Dr. White was a... Read More.
Hadley Wickham is Chief Scientist at RStudio. He is an active member of the R community, has written and contributed to over 30 R packages, and won the John Chambers Award for Statistical Computing for his work developing tools for data reshaping and visualisation. His research focusses on how to make data analysis better, faster and easier, with a particular emphasis on the use of visualisation to better understand data and models.
Leland Wilkinson is currently adjunct professor of computer science at the University of Illinois at Chicago. Previously, he was adjunct professor of statistics at Northwestern University and President of SYSTAT Inc., a statistical software company he founded in 1984. Wilkinson is a fellow of the American Statistical Association, an elected member of the International Statistical Institute, and a fellow of the American Association for the Advancement of Science. He was vice-chair of the board of the National Institute of Statistical Sciences and a member of the Committee on Applied and Theoretical Statistics at the National Academy of Sciences. Wilkinson recently served on the NAS Panel on Developing Science, Technology, and Innovation Indicators for the Future. His projects have included books, journal articles, the... Read More.
Richard has been at the cutting edge of big data since its inception, leading multiple efforts to build multi-petabyte Hadoop platforms, maximizing business value by combining data science with big data. He has extensive experience creating advanced analytic systems using data warehousing and data mining technologies.
Ted Willke is a Principal Engineer with Intel and the General Manager of the Graph Analytics Operation in Intel Labs. Before joining Intel Labs in 2010, Ted spent 12 years working on server I/O technologies and standards within Intel’s product and pathfinding organizations. He holds a Doctorate in electrical engineering from Columbia University.
Prior to joining Altiscale, Charles was Site Reliability Engineer at LinkedIn in their web operations team, managing their Apache Traffic Server and HAProxy. Prior to LinkedIn, he was a Principal Service Engineer at Yahoo!, where he ran tens of thousands of nodes in some of the world’s largest Hadoop clusters. He has experience in all aspects of large-scale technical operations—from network to storage to distributed systems. Charles has spoken at Hadoop Summits and Big Data Camps on the subject of Hadoop operations.
Reynold Xin is a PhD student in the AMP Lab at UC Berkeley. He leads the research and development of two open source systems: Shark, an analytical SQL engine that is up to 100X faster than Apache Hive; and SparkGraph, a distributed graph computation engine. He is a recipient of Best Demo Award from SIGMOD 2012 and Best Demo Award from VLDB 2011. Before graduate school, he worked on ads infrastructure at Google and distributed databases at IBM.
Fangjin is one of the first developers at Metamarkets and on the Druid project. He mainly works on core infrastructure development. Fangjin comes to Metamarkets from Cisco where he developed diagnostic algorithms for various routers and switches. He holds a BASc in Electrical Engineering and a MASc in Computer Engineering from the University of Waterloo, Canada.
Bin Yu is Chancellor’s Professor in the Departments of Statistics and of Electrical Engineering & Computer Science at the University of California at Berkeley. She was Chair of the Department of Statistics from 2009 to 2012. She has published over 80 scientific papers in premier journals in statistics, machine learning, information theory, signal processing, remote sensing, neuroscience, and networks.
She is a Fellow of the American Academy of Arts and Sciences. She was a Guggenheim Fellow in 2006. She is a Fellow of AAAS, IEEE, IMS, and ASA. She is President of IMS (Institute of Mathematical Statistics) and serving on the Scientific Advisory Board (SAB) of IPAM at UCLA and on the Board of Trustees (BOT) of
Matei Zaharia started the Spark project at UC Berkeley and is currently CTO of Databricks. He serves as Spark’s vice president at Apache. In spring 2015, he is also beginning an assistant professor position at MIT.
Alice is the Director of Data Science at GraphLab, a Seattle-based startup offering a powerful large-scale machine learning and graph analytics platform. She loves playing with data and enabling others to play with data. She is a tool builder and an expert in Machine Learning algorithms. Her work spans software diagnosis, computer network security, and social network analysis. Prior to joining GraphLab, she was a researcher at Microsoft Research, Redmond. She holds a Ph.D. and a B.A. in Computer Science, and a B.A. in Mathematics, all from U.C. Berkeley.