Entities, Relationships, and Semantics: the State of Structured Search

Moderated by:
Daniel Tunkelang (LinkedIn)
Panelists:
Andrew Hogue (Foursquare), Breck Baldwin (Alias-i), Evan Sandhaus (New York Times), Wlodek Zadrozny (IBM)
Data
Location: Sutton South

Structured search improves the search experience by identifying entities and their relationships in documents and queries. This panel will explore the current state of structured and semi-structured search, as well as the open problems in an area that promises to revolutionize information seeking.

The four panelists below work on some of the world’s largest structured search problems, from offering users structured search on Google’s web corpus to building a computing system that defeated Jeopardy! champions in an extreme test of natural language understanding. They work on the data, tools, and research that are driving this field. They are all excellent researchers and presenters, and they promise an informative and engaging panel discussion, for which I will act as moderator.

Panelists:

  • Andrew Hogue is a Senior Staff Engineer and Engineering Manager in the Search Quality group at Google New York. He has worked on a wide array of projects including question answering, Google Squared, sentiment analysis, local and product search, and Google Goggles. He is interested in the areas of structured data, information extraction, and machine learning, and their applications to search and search interfaces. Prior to Google, he earned an M.Eng. and B.S. in Computer Science from MIT.
  • Breck Baldwin is the President of Alias-i, creators of the popular LingPipe computational linguistics toolkit. He received his Ph.D. in computer science in 1995 from the University of Pennsylvania. In the time between his thesis on coreference resolution and evaluation and founding Alias-i in 1999, Breck worked on DARPA-funded projects through the University of Pennsylvania.
  • Evan Sandhaus works as the Semantic Technologist in The New York Times Research and Development Labs. He is spearheading The New York Times Linked Open Data Strategy and overseeing the release of 1.8 million documents to the computer science research community. Previously, Evan helped to put The New York Times on Google Earth, collaborated with New York University to explore new directions in News Search, and worked to bring The New York Times to Facebook.
  • Wlodek Zadrozny is an IBM Researcher working on natural language applications. Most recently he worked on text sources for Watson (IBM’s Jeopardy! champion) and on applying related DeepQA technology to business problems. His previous work ranged from language processing research to product development and technical planning; in particular, he led the development of interaction systems that used speech, natural language, and focused search. Wlodek Zadrozny received a Ph.D. in Mathematics from the Polish Academy of Sciences.

Moderator:

Daniel Tunkelang oversees the data science team at LinkedIn, which analyzes terabytes of data to produce products and insights that serve LinkedIn’s members. Prior to LinkedIn, Daniel led a local search quality team at Google. Daniel was a founding employee and Chief Scientist of Endeca, a leader in enterprise search and business intelligence that pioneered the use of guided navigation in search applications. He has authored eight patents, written a textbook on faceted search, created the annual workshop on human-computer interaction and information retrieval (HCIR), and participated in the premier research conferences on information retrieval, knowledge management, databases, and data mining (SIGIR, CIKM, SIGMOD, SIAM Data Mining). Daniel holds a PhD in Computer Science from CMU, as well as BS and MS degrees from MIT.

Daniel Tunkelang

LinkedIn

I am a Principal Data Scientist at LinkedIn. I’ve also worked at Google on local search quality, and I was a founding employee of faceted search pioneer Endeca, where I spent ten years as Chief Scientist.

I am a leading industry advocate of human-computer information retrieval (HCIR). I established the HCIR workshop, which has taken place annually since 2007. More generally, I’ve worked to bring together researchers and practitioners in this area, organizing academic and industry events related to search and social media. I wrote the first book on faceted search as part of the Morgan & Claypool Synthesis Lectures.

I hold degrees in math and computer science from MIT and a PhD in computer science from CMU, where I worked on network visualization. I blog at The Noisy Channel.
