Abstract:

Causality is a fundamental component of learning and inference. While probability is the clear language of choice for speaking about uncertainty, probability is not about control: there is nothing in the axioms of probability about causes and effects. You may say smoking and lung cancer are associated. But how can one say smoking causes lung cancer? And why does it matter, compared to the question of whether we can predict lung cancer from smoking habits? Surely prediction is all that, say, a health insurance company would be interested in? What about social media? The content in Anna Smith’s blog might be reflected in Bob Loblaw’s blog, among many others. They never cite each other, but the topics are suspiciously similar. And look (aha!) at the time stamps of their posts, adjusted for time zones. Clearly the content in Bob’s blog follows the content in Anna’s. But can “influence” be measured by this association?

Causation is surprisingly hard to define. How do we proceed when we cannot even define a concept? Yet we do care whether smoking causes lung cancer: there is a reason why taxes on cigarettes are what they are. And we would not volunteer our time to be interviewed for Anna’s blog if it turned out that she hardly creates any new content: the secret is that she, Bob et al. typically just modify whatever has been popular on BuzzFeed lately. Shame.

It is indeed the case that machine learning and statistics have our backs covered for many cause-effect questions: randomized controlled trials, reinforcement learning, bandits, A/B testing; you may have heard of them all. What you may not have heard of is a bandit algorithm that tells you to stop an innocent pedestrian in the street and force the unlucky person either to take up smoking or never to get close to a cigarette in their life, on the toss of a coin (and hopefully you understand why this is the case). Also, we typically don’t have the luxury of generating thousands of data points by running a computer game or a robotic action in a lab. What to do? How can machine learning help? And why is it still so hard to hear success stories in this field?