Collaborative filtering is a method of making predictions about a user’s interests based on the preferences of many other users. It’s used to make recommendations on many Internet sites, including LinkedIn. For instance, the “Viewers of this profile also viewed” module on a user’s profile shows other pages that were covisited with it. This “wisdom of the crowd” recommendation platform, built atop Hadoop, exists across many entities on LinkedIn, including jobs, companies, etc., and is a significant driver of engagement.
During this talk, I will build a complete, scalable item-to-item collaborative filtering MapReduce flow in front of the audience. We’ll then get into some performance optimizations, model improvements, and practical considerations: a few simple tweaks can yield an order-of-magnitude performance improvement and a substantial increase in clickthroughs over the naive approach. This simple covisitation method gets us more than 80% of the way to the more sophisticated algorithms we have tried.
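To make the covisitation idea concrete, here is a minimal sketch of the naive approach in plain Python. It is an in-memory stand-in for a MapReduce flow, not the flow built in the talk or LinkedIn's implementation; the function names (`map_covisits`, `reduce_covisits`, `also_viewed`) and the sample data are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def map_covisits(views):
    """Map step (assumed shape): group items by viewer, emit each covisited pair."""
    items_by_viewer = defaultdict(set)
    for viewer, item in views:
        items_by_viewer[viewer].add(item)
    for items in items_by_viewer.values():
        for a, b in combinations(sorted(items), 2):
            # Emit both directions so each item can look up its neighbours.
            yield (a, b), 1
            yield (b, a), 1


def reduce_covisits(pairs):
    """Reduce step (assumed shape): sum the counts for every (item, other) pair."""
    counts = defaultdict(Counter)
    for (item, other), n in pairs:
        counts[item][other] += n
    return counts


def also_viewed(counts, item, k=3):
    """Return up to k items most often covisited with `item`."""
    return [other for other, _ in counts.get(item, Counter()).most_common(k)]


# Hypothetical (viewer, item) view log.
views = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "A"), ("u3", "C"),
    ("u4", "A"), ("u4", "B"),
]
counts = reduce_covisits(map_covisits(views))
print(also_viewed(counts, "A"))  # ['B', 'C']
```

Ranking by raw covisit counts tends to favor globally popular items, and emitting every pair per viewer grows quadratically with the number of items a viewer has seen; these are the kinds of issues the optimizations and model improvements mentioned above would need to address.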
This is a practical talk that is accessible to all.
Sam Shah is a Principal Engineer in the Search, Network, and Analytics Team at LinkedIn, working on applied data products. He is the principal person behind “People You May Know,” LinkedIn’s people recommendation service, and LinkedIn’s collaborative filtering system. Sam holds a Ph.D. in Computer Science from the University of Michigan.