Showing posts with label Causal Thinking.

Wednesday, February 08, 2023

Using Causal Trees

Should always be thinking in this direction: usefully integrating analysis and reality with process.

Understanding Causal Trees    By Matteo Courthoud in Towards Data Science / February 03, 2023

How to use regression trees to estimate heterogeneous treatment effects.

In causal inference, we are usually interested in estimating the causal effect of a treatment (a drug, ad, product, …) on an outcome of interest (a disease, firm revenue, customer satisfaction, …). However, knowing that a treatment works on average is often not sufficient and we would like to know for which subjects (patients, users, customers, …) it works better or worse, i.e. we would like to estimate heterogeneous treatment effects.

Estimating heterogeneous treatment effects allows us to use the treatment selectively and more efficiently through targeting. Knowing which customers are more likely to react to a discount allows a company to spend less money by offering fewer but better-targeted discounts. This also works for negative effects: knowing for which patients a certain drug has side effects allows a pharmaceutical company to warn or exclude them from the treatment. There is also a more subtle advantage of estimating heterogeneous treatment effects: knowing for whom a treatment works allows us to better understand how a treatment works. Knowing that the effect of a discount does not depend on the income of its recipient but rather on their buying habits tells us that maybe it is not a matter of money, but rather a matter of attention or loyalty.

In this article, we will explore the estimation of heterogeneous treatment effects using a modified version of regression trees (and forests). From a machine-learning perspective, there are two fundamental differences between causal trees and predictive trees. First of all, the target is the treatment effect, which is an inherently unobservable object. Second, we are interested in doing inference, which means quantifying the uncertainty of our estimates.

Online Discounts and Targeting

For the rest of the article, we are going to use a toy example for the sake of exposition: suppose we are an online shop interested in understanding whether offering discounts to new customers increases their expenditure. In particular, we would like to know if offering discounts is more effective for some customers than for others, since we would prefer not to give discounts to customers who would spend anyway. Moreover, it could also be that spamming customers with pop-ups deters them from buying, having the opposite effect.  ... ' 
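To make the idea tangible, here is a minimal sketch of the leaf-level logic behind a causal tree, on fully simulated data. Everything below is invented for illustration: the split point is fixed by hand at loyalty = 0.5 rather than learned, whereas a real causal tree would search for the split and use honest sample splitting.

```python
import random
import statistics

random.seed(0)

# Simulated shop data with a made-up heterogeneous effect: the
# discount raises spending by 10 for low-loyalty customers but
# only by 1 for loyal customers, who would buy anyway.
def simulate_customer():
    loyalty = random.random()                  # observed covariate
    treated = random.random() < 0.5            # randomized discount
    effect = 10.0 if loyalty < 0.5 else 1.0    # true treatment effect
    spend = (50.0 + 20.0 * loyalty
             + (effect if treated else 0.0)
             + random.gauss(0, 1))
    return loyalty, treated, spend

data = [simulate_customer() for _ in range(4000)]

def leaf_effect(rows):
    """Difference-in-means treatment-effect estimate inside one leaf."""
    treated = [s for (_, t, s) in rows if t]
    control = [s for (_, t, s) in rows if not t]
    return statistics.mean(treated) - statistics.mean(control)

# A depth-1 "causal tree": split on loyalty, then estimate the
# treatment effect separately in each leaf.
left = [r for r in data if r[0] < 0.5]       # low-loyalty leaf
right = [r for r in data if r[0] >= 0.5]     # high-loyalty leaf
print(round(leaf_effect(left), 1), round(leaf_effect(right), 1))
```

A real causal tree chooses split points to maximize effect heterogeneity and estimates leaf effects on a held-out subsample ("honesty"), which is what makes valid confidence intervals possible.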

Friday, August 26, 2022

Thinking Causal AI

Good thoughts on the topic: 

Use Causal AI to Go Beyond Correlation-Based Prediction

Gartner, By Leinar Ramos | August 10, 2022   Intro below

This is a short introduction to a research note we published recently on Causal AI, which is accessible here: Innovation Insight: Causal AI.

Correlation is not causation

“Correlation is not causation” is often mentioned, but rarely given the importance it deserves in AI. Correlations are how we see variables moving together in the data, but these relationships are not always causal. 

We can only say that A causes B when an intervention that changes A would also change B as a result (whilst keeping everything else constant). For example, forcing a rooster to crow won’t make the sun rise, even if the two events are correlated.

In other words, correlations are the data we see, whereas causal relationships are the underlying cause-and-effect relationships that generate this data (see image below). Crucially, the data we typically work with exists in a complex web of correlations that obscure the causal relationships we care about.

[Image illustrating the distinction between correlations, which are the relationships we directly observe in the data, and causation, which is the underlying set of cause-and-effect relationships that generates the data.]

Despite their notable success, statistical models, including those in advanced deep learning (DL) systems, use surface-level correlations to make predictions. The current DL paradigm doesn’t drive models to uncover underlying cause-and-effect relationships but simply to maximize predictive accuracy.

Now, it is worth asking: what is the problem with using correlations for prediction? After all, in order to predict, we just need enough predictive power in the data, regardless of whether it comes from causal relationships or statistical correlations. For instance, hearing a rooster crow is useful for predicting sunrises.

The core problem lies with the brittleness of the predictions. For correlation-based predictions to remain valid, the process that generated the data needs to remain the same (e.g., the roosters need to keep crowing before sunrise).
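A small simulation makes the rooster point tangible. The variables and numbers below are invented; the pattern is the generic one of a hidden common cause:

```python
import random

random.seed(1)

def corr(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    xs, ys = zip(*pairs)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs) / len(pairs)
    vx = sum((x - mx) ** 2 for x in xs) / len(xs)
    vy = sum((y - my) ** 2 for y in ys) / len(ys)
    return cov / (vx * vy) ** 0.5

n = 10000

# Observational regime: a hidden common cause Z drives both A
# ("rooster crows") and B ("sun rises"), so they correlate strongly
# even though neither causes the other.
observed = []
for _ in range(n):
    z = random.gauss(0, 1)
    observed.append((z + random.gauss(0, 0.1),     # A follows Z
                     z + random.gauss(0, 0.1)))    # B follows Z

# Interventional regime: we set A ourselves (do(A)), cutting the
# arrow from Z to A. B still follows Z, so the correlation vanishes.
intervened = []
for _ in range(n):
    z = random.gauss(0, 1)
    intervened.append((random.gauss(0, 1),         # A set externally
                       z + random.gauss(0, 0.1)))  # B follows Z

print(round(corr(observed), 2), round(corr(intervened), 2))
```

The observed correlation is near 1, while under intervention it is near 0: forcing the rooster to crow does nothing to the sunrise.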

There are two fundamental challenges with this correlation-based approach:

Problem #1: We want to intervene in the world

Prediction is rarely the end goal. We often want to intervene in the world to achieve a specific outcome. Anytime we ask a question of the form “How much can we change Y by doing X?”, we are asking a causal question about a potential intervention. An example would be: “What would happen to customer churn if we increased a loyalty incentive?”

And the problem with correlation-based predictive models, like Deep Learning, is that our actions are likely to change the data-generation process and therefore the statistical correlations we see in the data, rendering correlation-based predictions useless to estimate the effect of interventions. 

For instance, when we use a churn model (prediction) to decide whether or not to give a customer a loyalty incentive (intervention), the incentive affects the data that generated the prediction (we hope the incentive makes the customer stay). In this case, causality really matters, and we can’t simply use correlations to answer questions on what would happen if we took an action (we need to run controlled experiments or use causal techniques to estimate the effects)  .... ' 
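A hedged sketch of why this matters, with invented numbers: suppose incentives were historically targeted at at-risk customers. A correlation-based reading of the logs then gets the sign of the effect wrong, while a randomized experiment recovers it.

```python
import random

random.seed(2)

# Invented model: higher-risk customers churn more, and the
# incentive truly reduces churn probability by 0.2.
def customer(randomized):
    risk = random.random()
    if randomized:
        incentive = random.random() < 0.5     # randomized trial
    else:
        incentive = random.random() < risk    # targeted at high risk
    churn_prob = risk - (0.2 if incentive else 0.0)
    churned = random.random() < churn_prob
    return incentive, churned

def naive_effect(rows):
    """Churn rate with incentive minus churn rate without it."""
    with_i = [c for (i, c) in rows if i]
    without = [c for (i, c) in rows if not i]
    return sum(with_i) / len(with_i) - sum(without) / len(without)

logs = [customer(randomized=False) for _ in range(20000)]
trial = [customer(randomized=True) for _ in range(20000)]

# In the historical logs the incentive looks harmful (incentivized
# customers churn more); the randomized trial shows it helps.
print(round(naive_effect(logs), 2), round(naive_effect(trial), 2))
```

The same difference-in-means computation gives a positive number on the confounded logs and a negative one on the randomized data: the correlation in the logs reflects who was targeted, not what the incentive does.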

Friday, June 25, 2021

Causal Machine Learning

Another area we experimented with: causal elements in the AI knowledge being used. Would have liked to experiment with ALICE. 

Microsoft Research Podcast

Econ2: Causal machine learning, data interpretability, and online platform markets featuring Hunt Allcott and Greg Lewis

Published June 2, 2021

Episode 122 | June 2, 2021 

In the world of economics, researchers at Microsoft are examining a range of complex systems—from those that impact the technologies we use to those that inform the laws and policies we create—through the lens of a social science that goes beyond the numbers to better understand people and society. 

In this episode, Senior Principal Researcher Dr. Hunt Allcott speaks with Microsoft Research New England office mate and Senior Principal Researcher Dr. Greg Lewis. Together, they cover the connection between causal machine learning and economics research, the motivations of buyers and sellers on e-commerce platforms, and how ad targeting and data practices could evolve to foster a more symbiotic relationship between customers and businesses. They also discuss EconML, a Python package for estimating heterogeneous treatment effects that Lewis has worked on as part of the ALICE (Automated Learning and Intelligence for Causation and Economics) project at Microsoft Research. ... "

Monday, December 14, 2020

Quest for More Common Sense: Less Thinking?

Nicely put piece that connects with the current state of the technology, and shows some of the challenges. Have been involved in a number of attempts at including common sense in reasoning, without general success. Back to our need for strong context-based reasoning made in the last post. Most succinctly, it's knowledge and context with a causal engine.

The quest for artificial common sense   By Samuel Flender in TowardsDataScience

On July 19th, a blog post titled ‘Feeling unproductive? Maybe you should stop overthinking.’ appeared online. The 1000-word self-help article explains that overthinking is the enemy of our creativity, and advises us to be more in the moment:

“In order to get something done, maybe we need to think less. Seems counter-intuitive, but I believe sometimes our thoughts can get in the way of the creative process. We can work better at times when we ‘tune out’ the external world and focus on what’s in front of us.”

The post was written by GPT-3, OpenAI's massive 175-billion-parameter neural network trained on nearly half a trillion words. UC Berkeley student Liam Porr merely wrote the title and let the algorithm fill in the text. A 'fun experiment', to see whether the AI could fool people. Indeed, GPT-3 hit a nerve: the post was up-voted to the top of Hacker News.

There’s a paradox, then, with today’s AI. While some of GPT-3’s writings arguably meet the Turing test criterion — convincing people that it is human — it fails spectacularly at the simplest tasks. AI researcher Gary Marcus asked GPT-2, the precursor to GPT-3, to complete the following sentence: ... ' 

Wednesday, November 11, 2020

Causality is Important for Machine Learning

Have always thought causal thinking and learning was a major consideration for the future of AI in general. Here, a step in the right direction. But overall it's still a hard question.

Understanding Causality Is the Next Challenge for Machine Learning

Teaching robots to understand "why" could help them transfer their knowledge to other environments.  By Payal Dhar

“Causality is very important for the next steps of progress of machine learning,” said Yoshua Bengio, a Turing Award-winning scientist known for his work in deep learning, in an interview with IEEE Spectrum in 2019. So far, deep learning has comprised learning from static datasets, which makes AI really good at tasks related to correlations and associations. However, neural nets do not interpret cause and effect, or why these associations and correlations exist. Nor are they particularly good at tasks that involve imagination, reasoning, and planning. This, in turn, limits AI systems from generalizing their learning and transferring their skills to other related environments.

The lack of generalization is a big problem, says Ossama Ahmed, a master’s student at ETH Zurich who has worked with Bengio’s team to develop a robotic benchmarking tool for causality and transfer learning. “Robots are [often] trained in simulation, and then when you try to deploy [them] in the real world…they usually fail to transfer their learned skills. One of the reasons is that the physical properties of the simulation are quite different from the real world,” says Ahmed. The group’s tool, called CausalWorld, demonstrates that with some of the methods currently available, the generalization capabilities of robots aren’t good enough—at least not to the extent that “we can deploy [them] safely in any arbitrary situation in the real world,” says Ahmed.

The paper on CausalWorld, available as a preprint, describes benchmarks in a simulated robotics manipulation environment using the open-source TriFinger robotics platform. The main purpose of CausalWorld is to accelerate research in causal structure and transfer learning using this simulated environment, where learned skills could potentially be transferred to the real world. Robotic agents can be given tasks that comprise pushing, stacking, placing, and so on, informed by how children have been observed to play with blocks and learn to build complex structures. There is a large set of parameters, such as weight, shape, and appearance of the blocks and the robot itself, on which the user can intervene at any point to evaluate the robot’s generalization capabilities. ...  '

Thursday, December 19, 2019

Causal Theory Of Views

Good piece. It is all ultimately about context, in the sense of how location, events, agents, results ... connect to the rest of the world. Context drives things like goals, value, transparency and risk. We diagrammed these in views that helped us understand their role in decisions.

The Causal Theory of Views
A Conversation with Lee Smolin  in The Edge

An event has a view of the world. First, let me tell you what I mean by a view. A view is the information that that event has about how it fits into the rest of the world. That includes who its parents were (by which I mean the events in its past that gave rise to it) and how much energy and momentum was propagated to it from them. If I am an event, my view of the world is what I see when I look around. I see light coming to me from the past, which I perceive as a pattern of colors, which come from photons of different energies striking my eye. That's my view; it's a property of a moment. That contains all that I, as an event, know about how I fit into the rest of the world.

Now, if you know the things that I just said were real—the events, the causal relations, the distribution of energy and momentum flowing—I can tell you what the view of each event is, but I can also flip it around. There's a dual description in which I just say what the views are and that's the whole description. So, I just say there's a view, and that view makes a kind of picture. You see the sky, a two-dimensional sphere around you, and there are some colors, which are photons coming in of different energies—that's the view. I can hypothesize that all that exists in the world is views and a process that continually makes new views out of old views. That's what I call the causal theory of views.  .... '

LEE SMOLIN, a theoretical physicist, is a founding and senior faculty member at the Perimeter Institute for Theoretical Physics in Canada. His main contributions have so far been to the quantum theory of gravity, where he has been a co-inventor of and major contributor to two major directions, loop quantum gravity and deformed special relativity. He is the author, most recently, of Einstein's Unfinished Revolution. ... Lee Smolin's Edge Bio Page  .... "

Monday, October 21, 2019

We Can't Trust Deep Learning Alone

It's roughly the 65th anniversary of the proposal of AI. Time to rethink the broad idea. More comments on a book I have been reading: Rebooting AI: Building Artificial Intelligence We Can Trust by Gary Marcus. I am a practitioner in the space who has built many systems of this type, but remain convinced that we must combine Deep Learning with logic processing (or classical) AI. 

We used learning in such systems; it was not deep, but it did contain and update the knowledge needed to make decisions. How can we make AI both broad and robust? Today we have other ideas that can help us build logical models of things, like Business Process Models and RPA. Minsky's Society of Mind is mentioned as a broad template.

Here, an interview in MIT Technology Review on the idea:

Gary Marcus, a leader in the field, discusses how we could achieve general intelligence—and why that might make machines safer.   by Karen Hao  in MIT Technology Review

Gary Marcus is not impressed by the hype around deep learning. While the NYU professor believes that the technique has played an important role in advancing AI, he also thinks the field’s current overemphasis on it may well lead to its demise. ..."

Finished; I like the thoughts provided. The book sets the stage. Read it. My only disappointment: though the book provides an excellent argument for why, it does not provide a good recommendation for how we should proceed. Always thought there were hints in elements of the context of 'causality' that might help. Now reading Judea Pearl's "The Book of Why: The New Science of Cause and Effect" on that topic.

Thursday, February 28, 2019

Data, Data Science, Causal Thinking

"The Seven Tools of Causal Inference, with Reflections on Machine Learning," by ACM  A.M. Turing Award recipient Judea Pearl, describes tools that overcome obstacles to human-level machine intelligence. Pearl delivers a message to machine-learning and AI experts in an original video at bit.ly/2GUEyJW.

Excerpt from the long paper, ultimately positioning our challenge:

Key insights:

- Data Science is a two-body problem, connecting data and reality, including the forces behind the data.

- Data Science is the art of interpreting reality in the light of data, not a mirror through which data sees itself from different angles.

- The ladder of causation is the double helix of causal thinking, defining what can and cannot be learned about actions and about worlds that could have been. ...   "
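As a small worked illustration of the gap between the first two rungs of that ladder (association versus intervention), here is a sketch with invented probabilities, using the backdoor adjustment over a single confounder Z:

```python
# Tiny discrete model with a confounder: Z -> X, Z -> Y, X -> Y.
# All probabilities are made up for illustration.
p_z = {0: 0.5, 1: 0.5}                      # P(Z=z)
p_x_given_z = {0: 0.2, 1: 0.8}              # P(X=1 | Z=z)
p_y_given_xz = {(1, 0): 0.3, (1, 1): 0.7}   # P(Y=1 | X=1, Z=z)

# Rung 1 (association): P(Y=1 | X=1). Conditioning on X=1 skews
# the distribution of Z, mixing the confounder into the answer.
num = sum(p_y_given_xz[(1, z)] * p_x_given_z[z] * p_z[z] for z in (0, 1))
den = sum(p_x_given_z[z] * p_z[z] for z in (0, 1))
p_assoc = num / den            # 0.62

# Rung 2 (intervention): P(Y=1 | do(X=1)) by backdoor adjustment,
# averaging over Z's *marginal* distribution:
#   sum_z P(Y=1 | X=1, Z=z) * P(Z=z)
p_do = sum(p_y_given_xz[(1, z)] * p_z[z] for z in (0, 1))   # 0.5

print(p_assoc, p_do)
```

The two rungs give genuinely different numbers (0.62 versus 0.5) from the same model: seeing X=1 is informative about Z, while setting X=1 is not. The third rung, counterfactuals, requires still more structure than this adjustment formula uses.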