The importance of selection effects


It was the first class of a data science course in the MIDS program. Our instructor started off with a question: “Do you think this academic program gives you a better career?” We gave a bunch of answers, ignorant folks that we were. Of course it does; is this even a question at this point?! Yes, look at all the new things we’re learning. Yes, the instructors are top-notch. Yes, look at the syllabus and the projects, yada yada … nothing surprising there.

The instructor asked, “Yes, but how do you know that the program is making you better? What if you joined the program because you were motivated enough to apply and work through it, and therefore you’d get better in your career anyway?”

That is such an insightful comment. Think about it: you’re stepping out of the box, looking at the people inside, and wondering, “Hmm, what kind of people have selected themselves to be here with me today?” Put another way: would we get the same kind of results if we picked a number of people at random and dumped them together in our class?
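The instructor’s question can be played out in a small simulation. This is only a sketch under made-up assumptions: a single hidden trait called motivation drives both who enrolls and how careers turn out, and the program’s true effect is a number we set ourselves. The function name, the enrollment threshold and the noise levels are all hypothetical; none of it comes from the class.

```python
import random
import statistics


def simulate(n=20_000, true_effect=0.0, seed=0):
    """Compare a self-selected cohort with a randomly assigned one.

    Every number here is an illustrative assumption, not real data.
    Returns (self_selected_gap, randomized_gap): the difference in mean
    career outcome between people in the program and people outside it.
    """
    rng = random.Random(seed)
    self_selected = {True: [], False: []}
    randomized = {True: [], False: []}

    for _ in range(n):
        motivation = rng.gauss(0, 1)  # hidden trait nobody measures
        luck = rng.gauss(0, 1)        # everything else that shapes a career

        # Self-selection: motivated people are more likely to enroll.
        enrolls = motivation + rng.gauss(0, 1) > 0.5
        # Randomization: a coin flip decides, whatever the motivation.
        assigned = rng.random() < 0.5

        # Career outcome = motivation + program effect (if in it) + luck.
        self_selected[enrolls].append(motivation + true_effect * enrolls + luck)
        randomized[assigned].append(motivation + true_effect * assigned + luck)

    def gap(groups):
        return statistics.fmean(groups[True]) - statistics.fmean(groups[False])

    return gap(self_selected), gap(randomized)


if __name__ == "__main__":
    naive, rct = simulate(true_effect=0.0)
    print(f"self-selected gap: {naive:.2f}")
    print(f"randomized gap:    {rct:.2f}")
```

In this toy world the program does nothing at all (`true_effect=0.0`), yet the self-selected cohort still comes out well ahead, simply because motivated people chose to be in it. The coin-flip cohort shows a gap close to zero, which is the honest answer here.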

Selection Effect

The technical term for this is selection effect. Here’s why it’s interesting to me. First, knowing about selection effects gives me a sense of what I’m missing, because I’ve now stepped out of the box and I’m trying to find patterns in who’s in the box. Second, it lets me think about the biases that could come from that pattern: both from who’s in the box, and from who’s not.

Examples

Perhaps that was too abstract. I should give examples. No problem – selection effects are everywhere:

  • We already had a casual look at top-tier universities, and wondered whether their selection process virtually guarantees success.
  • Now consider the hot topic of the day: immigration and “the immigrant ethos”. This is a massive selection effect: people who migrate are likely to be highly motivated to leave their comfort zone and find a new home. This could also explain why the effect wears off in their children and grandchildren, who never went through that selection themselves.
  • Or consider startups vis-a-vis large, established companies: startups tend to attract younger or more ambitious people. Moreover, at different stages in a company’s life cycle, it appeals to different kinds of people.
  • Think about branding or market segmentation: because of the kind of people you select as your customers, you could have a large profit margin (you cater to the rich folks) or a thin one.
  • Selection effects can mean life or death in medicine. Randomized controlled trials (RCTs) are the gold standard today for any new drug. Notice the “randomized” bit there? Pharma companies cannot afford selection effects in who gets the drug during a trial. If they had them, the trial results would reflect the kind of people who were picked as much as the drug itself, and once the drug was actually released to the market, there could be swathes of people with unexpected side effects.
  • Diversity is another key topic today. If you don’t have enough diversity in a committee, the selection effect will cause many people to be under- or un-represented in any decisions it makes. On the other hand, random selection for a jury makes a lot of sense.
  • Time to go a little meta now: What selected you to come to this blog entry today? What kind of people am I selecting with my writing? What does that say about me, and you?

Conclusion

The selection effect is a powerful idea. It can reveal blind spots in you or your team. Knowing that the ideal is randomization [1], you can also gauge how skewed or biased your group is. Conversely, if there is a peer group you’d like to be part of, you should look for events or organizations that select for such people. In that case, you can use your knowledge of selection effects to your advantage when deciding which places to go to and which to skip.

[1] Besides randomization, there are two other core assumptions in causal statistical experiments: excludability and non-interference. Maybe for another day.
