©2011, O'Reilly Media, Inc.
Economists use a data analysis toolkit and an intuition that can be very helpful to data scientists. In particular, econometric methods are useful for disentangling correlation and causation, a use case not handled well by standard machine learning and statistical techniques. This session will cover examples of econometric methods in action, as well as other economics-related insights. Think of it as a crash course in the basic econometric intuition that one develops during a PhD in Economics (I received my PhD from Stanford in 2008).
Why econometrics? The difference between econometrics and statistics is that statistical modeling is more concerned with fit, while econometric modeling is more concerned with properly estimating the coefficients in a regression. Getting the “right” (consistent and unbiased) estimates means that the analyst can more reliably measure how a change in one variable predicts, or causes, a change in the dependent variable. These techniques can help solve problems in social/web data that previously could be solved only by collecting new data from randomized multivariate experiments.
To do this, the analyst first develops an intuition for whether there is a source of “endogeneity” in the regression. This is largely determined by the relationship between the predictors and the error term: a predictor that is correlated with the error term is endogenous, and its estimated coefficient is biased. Once the source of the endogeneity is understood, econometric techniques like fixed/random effects and instrumental variables can be quite useful. How far these techniques can be pushed depends on the type of data that is collected and available. [I might also go into some other techniques, but these are the most useful.]
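As a rough illustration of endogeneity and of what an instrument buys you, the sketch below simulates data in which an unobserved confounder drives both the predictor and the outcome. Everything here is an assumption made for illustration: the data are synthetic, the true coefficient of 2.0 is chosen arbitrarily, and the names `ols_slope`, `iv_slope`, and `simulate` are hypothetical helpers, not part of the session material. It uses the simplest single-instrument IV estimator, the ratio of covariances.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a simple regression of y on x (intercept included)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def iv_slope(z, x, y):
    """Single-instrument IV estimator: cov(z, y) / cov(z, x)."""
    zc = z - z.mean()
    return float(zc @ (y - y.mean()) / (zc @ (x - x.mean())))


def simulate(n=100_000, beta=2.0, seed=0):
    """Synthetic data (illustrative assumption, not real data)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)          # unobserved confounder, ends up in the error term
    z = rng.normal(size=n)          # instrument: moves x, unrelated to u
    x = z + u + rng.normal(size=n)  # predictor is correlated with u, so it is endogenous
    y = beta * x + 3.0 * u + rng.normal(size=n)
    return z, x, y


z, x, y = simulate()
ols_estimate = ols_slope(x, y)    # biased upward, because x is correlated with the error
iv_estimate = iv_slope(z, x, y)   # close to the true beta of 2.0
print(f"OLS: {ols_estimate:.2f}  IV: {iv_estimate:.2f}")
```

The OLS fit is perfectly good as a predictor here; it is the coefficient that is wrong. The instrument works only because, by construction, it affects the outcome solely through the predictor, which is the assumption that has to be argued for in any real application.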
The methods will be presented so that a non-technical person can understand the basic intuition and a practitioner can apply them in the future. Examples will be provided. For panel data econometrics, we will discuss how to identify actions taken early on by a LinkedIn member that are predictive of their future engagement with the product, a problem made difficult by the confounding of correlation and causation. For instrumental variables techniques, we will discuss how to use random variation in the weather to say cool things about politics, economics, and web usage.
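To give a flavor of the panel data idea, here is a minimal sketch of a fixed-effects ("within") estimator on simulated data. It is not the LinkedIn analysis: the members, the action variable, the engagement outcome, and the true effect of 0.5 are all invented for illustration, and `slope` and `demean_by_group` are hypothetical helper names. The point is that an unobserved, time-invariant member trait can inflate a pooled regression, while demeaning within each member removes it.

```python
import numpy as np


def slope(x, y):
    """Slope from a simple regression of y on x (intercept included)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def demean_by_group(values, groups, n_groups):
    """Subtract each group's mean from its observations (within transformation)."""
    sums = np.bincount(groups, weights=values, minlength=n_groups)
    counts = np.bincount(groups, minlength=n_groups)
    return values - (sums / counts)[groups]


# Synthetic panel (illustrative assumption): 2000 members observed for 5 periods.
rng = np.random.default_rng(1)
n_members, n_periods, beta = 2000, 5, 0.5
member = np.repeat(np.arange(n_members), n_periods)

trait = rng.normal(size=n_members)                        # unobserved, fixed per member
action = trait[member] + rng.normal(size=member.size)     # keener members act more
engagement = beta * action + 2.0 * trait[member] + rng.normal(size=member.size)

pooled_estimate = slope(action, engagement)   # confounded by the member trait
within_estimate = slope(
    demean_by_group(action, member, n_members),
    demean_by_group(engagement, member, n_members),
)                                             # close to the true beta of 0.5
print(f"pooled: {pooled_estimate:.2f}  fixed effects: {within_estimate:.2f}")
```

The within estimator relies on repeated observations of the same member and only removes confounders that do not change over time, which is one reason the type of data collected matters so much.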
In addition to the discussion of applied econometric techniques, there may also be time for economics-related data insights. We are currently developing unemployment rate prediction models using time-series econometrics, as well as indexes that measure changes in the supply of and demand for talent across regions and industries.
Scott Nicholson is Chief Data Scientist at Accretive Health and has a PhD in Economics from Stanford. His team works to find innovative uses of data to help doctors make better decisions about care and to help people be more proactive about their own care. Before moving into the health care industry, Scott was a Team Lead, Senior Data Scientist, and Economist at LinkedIn, where his work focused on using econometric techniques to increase user engagement and identify economically relevant insights from the rich LinkedIn data. Before LinkedIn, Scott built real-time bidding and ad selection algorithms at an online advertising startup.