Data Science of Love

Vaclav Petricek (eHarmony)
Data Science Beekman Parlor - Sutton North

Matchmaking is an age-old concept that has been revolutionized by the advent of the Internet. Suddenly, the pool of potential partners that one can plausibly consider has exploded. Because courtship has moved online, we have collected unprecedented amounts of data on romantic interactions.

If you are looking for love, you may want to take advantage of this accumulated knowledge to give yourself a leg up. However, making causal inferences (a.k.a. “dating advice”) can be problematic due to various sample biases. I will instead show how this data can be leveraged to build a matchmaking system that reduces data overload and improves your chances of a happy marriage.

In this presentation I will focus specifically on solving three problems with data:

  1. Compatibility: matching for the long term based on psychological traits
  2. Affinity: modeling the immediate attraction
  3. Distribution: whom to introduce to whom, and when

I will show how Hadoop, Vowpal Wabbit, GBMs, and graph optimization can be used together to solve the matchmaking problem as well as related problems in advertising and constrained recommendation. The presentation will also highlight the architecture of eHarmony’s matchmaking system that every night needs to choose the best set of introductions from about 21012 possibilities.
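To make the "distribution" problem concrete, here is a minimal sketch of choosing introductions under a per-person limit. It is not eHarmony's method: the function name `greedy_introductions`, the `capacity` parameter, the user names, and the scores are all hypothetical, and a simple greedy heuristic stands in for the graph optimization mentioned above.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def greedy_introductions(scores: Dict[Pair, float], capacity: int) -> List[Pair]:
    """Pick high-scoring pairs so no user gets more than `capacity` introductions.

    Hypothetical illustration: a greedy heuristic, not an optimal solver.
    """
    # Every user starts with the same number of available introduction slots.
    remaining: Dict[str, int] = {}
    for a, b in scores:
        remaining[a] = capacity
        remaining[b] = capacity

    chosen: List[Pair] = []
    # Visit pairs from highest to lowest score; ties are broken by pair name.
    for (a, b), _score in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])):
        if remaining[a] > 0 and remaining[b] > 0:
            chosen.append((a, b))
            remaining[a] -= 1
            remaining[b] -= 1
    return chosen


# Made-up predicted match scores for four candidate pairs.
scores = {
    ("ann", "bob"): 0.9,
    ("ann", "carl"): 0.8,
    ("dina", "bob"): 0.7,
    ("dina", "carl"): 0.4,
}
introductions = greedy_introductions(scores, capacity=1)
print(introductions)
```

With one introduction per person, the greedy pass selects ann/bob and dina/carl for a total score of 1.3, while pairing ann/carl and dina/bob would total 1.5. That gap is one reason to treat the choice of introductions as a global optimization over the whole graph rather than a pair-by-pair ranking.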

Vaclav Petricek

eHarmony

Vaclav Petricek is a Principal Data Scientist at Santa Monica-based eHarmony, where he is responsible for optimization and machine learning applications for eHarmony's core matchmaking algorithms. He also runs a series of invited ML talks at eHarmony, part of the Los Angeles Machine Learning Meetup. Prior to eHarmony, Vaclav was a visiting researcher at University College London, where his research spanned recommender systems, social networks, web structure, and online auctions. Before that, he worked at several Czech startups as a developer and sysadmin. He earned his PhD in Computer Science, as well as his master's degree in Distributed Systems, from Charles University in Prague.

http://www.linkedin.com/in/petricek

Contact Us

View a complete list of Strata + Hadoop World 2013 contacts