Did the Covid Crisis Lead to Prosperity? Causal Inference from a Counterfactual World Using Facebook Prophet

Posted June 12, 2021 by Gowri Shankar  ‐  12 min read

Identifying one causal reason is more powerful than identifying dozens of correlational patterns in the data. Causal inferencing is the branch of statistics concerned with effects that are the consequences of actions. In traditional machine learning, we infer from past observations, asking how something happened by characterizing the association between variables. In contrast, causal inferencing addresses why an event happened, typically through randomized experiments.

Why is causality critical? It is widely believed that causal reasoning is a promising path towards generalization. This is the second post on causality; in the first, we attempted to bridge causal reasoning with model explainability under the theme that correlation does not imply causation. Please refer to that earlier post.

In this post, we introduce the mathematical intuition behind causality, fairness and explanation, and then create a counterfactual world in retrospect using the Facebook Prophet forecasting tool. I hope this post builds the intuition needed to dive deeper, in forthcoming posts, into more advanced causal modeling tools such as DoWhy, the Causal Discovery Toolbox, causal DNNs, etc.

Bull or Bear


It’s a big thing to integrate [causality] into AI. Current approaches to machine learning assume that the 
trained AI system will be applied on the same kind of data as the training data. In real life it is often 
not the case.
- Yoshua Bengio


Lots of people in ML/DL [deep learning] know that causal inference is an important way to
improve generalization.
- Yann LeCun

Objectives

The objective of this post is to formalize the intuition behind causal inferencing and counterfactual worlds. We also learn about

  • Causal Graphs
  • Causal Explanation and Fairness
  • Facebook Prophet Forecasting Library

Introduction

What makes us human is our ability to rationalize events in terms of cause and effect, i.e. the inherent capability to assess why something happened, or has to happen, so that we can take corrective actions in the hope of improving future outcomes. In traditional ML/deep learning, however, learning happens from correlated features, and the available features do not completely represent the world. Bridging the gap between the known world and the unknown world through causal reasoning will realize generalization, a stepping stone towards artificial general intelligence.

The performance of traditional machine learning models degrades over time due to concept and data drift; a well-tested model deployed in the field tends to become obsolete. Continuous development and integration schemes address the data drift problem, but the robustness of the model still degrades eventually. In contrast, causal inferencing focuses on what might have happened where information is lacking, through randomized controlled trials (RCTs) or A/B tests. It also addresses the classic out-of-distribution problem to a certain extent.

Counterfactual World Conundrum

Purely observational data does not account for the attributes of the counterfactual world: certain actions were taken, and the effects of those actions are recorded in the dataset. Had those actions not been taken, in an alternate universe, what would the effect have been? Let us illustrate this.

$X$ causes $Y$ iff a change in $X$ leads to a change in $Y$, i.e. $(X \rightarrow Y)$, keeping everything else constant. Then $$\Large \text{Causal Effect} = \overbrace{ E[ Y|do(X=1)]}^{\text{ Real World }} - \underbrace{ E[Y|do(X=0)] }_{\text{Counterfactual World}} \tag{1. Causal Effect}$$ The causal effect is the magnitude by which $Y$ changes for a unit change in $X$.
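
To make Eq. 1 concrete, here is a minimal sketch on a toy structural model. The mechanism `Y = 3X + noise`, the function names and the sample size are assumptions made purely for illustration; nothing here comes from the NIFTY data. Because we simulate the world ourselves, we can force $X$ to each value, which is what $do(\cdot)$ stands for.

```python
import random
from statistics import mean


def simulate_outcome(x, rng):
    """Toy structural equation (assumed): Y = 3.0 * X + Gaussian noise."""
    return 3.0 * x + rng.gauss(0.0, 1.0)


def average_causal_effect(n_samples=20_000, seed=0):
    """Estimate E[Y | do(X=1)] - E[Y | do(X=0)] by forcing X in both worlds."""
    rng = random.Random(seed)
    y_treated = [simulate_outcome(1, rng) for _ in range(n_samples)]  # do(X=1)
    y_control = [simulate_outcome(0, rng) for _ in range(n_samples)]  # do(X=0)
    return mean(y_treated) - mean(y_control)


effect = average_causal_effect()
print(f"Estimated causal effect: {effect:.2f}")  # close to the assumed 3.0
```

In real data only one of the two worlds is ever observed for a given unit, which is why the rest of this post has to estimate the other one.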

Causal Effect
This illustration shows the change in the NIFTY 50 index when the Covid crisis started in February 2020 and the lockdown took effect in March/April 2020. We study this dataset in depth in a later section.

For example, consider a farm field where everything is automated by an expert system powered by sensors and actuators.

  • The moisture level of the soil is fed to the expert system through sensors
  • If the soil moisture level goes below a certain threshold, the actuators trigger the water supply
  • The atmospheric temperature is monitored and recorded
  • Subsequently, the moisture level of the soil is also recorded

The sensors ensured that the field was watered whenever the temperature went up, which in turn kept the soil moisture at an acceptable level. Temperature and soil moisture data were collected, and the two are strongly correlated.

This observational data is fed into an AI system meant to augment the expert system based on historical observations. However, the AI system ended up recommending No Watering when there is High Temperature: because the expert system ensured watering whenever the temperature peaked, the data suggests that high temperature leads to high moisture.

In causal inferencing, emphasis is given to the counterfactual world where the intervention did not happen, i.e. $E[Y|do(X=0)]$, and its effects are measured.
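
The farm story can be simulated in a few lines. This is only a sketch under assumed numbers: the moisture formula, the 30 degree threshold and the rule "water whenever it is hot" (a simplification of the moisture-triggered expert system) are all hypothetical. It compares the observed world with the counterfactual world $do(\text{watering} = \text{off})$.

```python
import random
from statistics import mean


def moisture(temp, watered, rng):
    """Assumed mechanism: heat dries the soil, watering adds moisture."""
    return 80.0 - temp + (25.0 if watered else 0.0) + rng.gauss(0.0, 2.0)


def hot_minus_cool(policy, n=20_000, seed=1):
    """Mean soil moisture on hot days minus cool days under a watering policy."""
    rng = random.Random(seed)
    hot, cool = [], []
    for _ in range(n):
        temp = rng.uniform(20.0, 40.0)
        reading = moisture(temp, policy(temp), rng)
        (hot if temp > 30.0 else cool).append(reading)
    return mean(hot) - mean(cool)


# Observed world: the expert system waters whenever it is hot.
observed_gap = hot_minus_cool(lambda temp: temp > 30.0)
# Counterfactual world: do(watering = off), regardless of temperature.
counterfactual_gap = hot_minus_cool(lambda temp: False)

print(f"observed gap (hot - cool):       {observed_gap:+.1f}")
print(f"counterfactual gap (hot - cool): {counterfactual_gap:+.1f}")
```

In the observed data hot days look wetter, while with the intervention switched off they are drier. A model trained only on the first world has no way to know that.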

Prediction Accuracy and Modeling through Causal Graphs

From data, we can build models and train them for better prediction metrics (e.g. loss, accuracy), but data alone is not sufficient for causal inferencing. It has to be augmented with domain knowledge and assumptions. Further, we cannot calculate the causal effect exactly because we cannot observe the counterfactual world; we can only estimate it.

Meanwhile, we cannot build a predictive model without a loss function. For a traditional machine learning model the objective is

$$\min_h \sum_{(x, y)} \text{loss}(h(x), y) \tag{2. Correlational ML}$$ Causal learning, in contrast, proceeds in two steps:

  • Identify which features directly cause the outcome
  • Build a predictive model using only those features

$$\min_h \sum_{(x_c, y)} \text{loss}(h(x_c), y) \tag{3. Causal ML}$$

Where,

  • $x_c$ is the subset of the features $x$ that directly cause the outcome
  • $h(\cdot)$ is the predictive model, so $h(x)$ is the predicted value

There are fundamental challenges to address when inferring a causal effect. Let us start with the following 4 steps suggested by Emre Kiciman and Amit Sharma of Microsoft Research.

  • Modeling: Create a causal graph to encode assumptions
  • Identification: Formulate what to estimate
  • Estimation: Compute the estimation
  • Refutation: Validate the assumptions
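
The four steps can be walked through on simulated data in plain Python. This is a minimal sketch, not a substitute for a causal library: the graph ($Z \rightarrow X$, $Z \rightarrow Y$, $X \rightarrow Y$), the coefficients and all function names are assumptions chosen for illustration.

```python
import random
from statistics import mean


def sample(n, rng):
    """Step 1, modeling: assumed graph Z -> X, Z -> Y, X -> Y (Z is a binary confounder)."""
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5
        x = rng.random() < (0.8 if z else 0.2)       # Z makes treatment more likely
        y = 1.0 * x + 2.0 * z + rng.gauss(0.0, 1.0)  # assumed true effect of X on Y: 1.0
        rows.append((z, x, y))
    return rows


def naive_difference(rows):
    """Correlational estimate E[Y | X=1] - E[Y | X=0], ignoring Z."""
    return mean(y for _, x, y in rows if x) - mean(y for _, x, y in rows if not x)


def backdoor_estimate(rows):
    """Steps 2 and 3: adjust for Z (blocks X <- Z -> Y), average stratum effects over P(Z)."""
    effect = 0.0
    for z_value in (False, True):
        stratum = [(x, y) for z, x, y in rows if z == z_value]
        treated = [y for x, y in stratum if x]
        control = [y for x, y in stratum if not x]
        effect += (mean(treated) - mean(control)) * len(stratum) / len(rows)
    return effect


rng = random.Random(0)
rows = sample(20_000, rng)
naive = naive_difference(rows)
adjusted = backdoor_estimate(rows)

# Step 4, refutation: a random placebo treatment should show an effect near zero.
placebo_rows = [(z, rng.random() < 0.5, y) for z, _, y in rows]
placebo = backdoor_estimate(placebo_rows)

print(f"naive: {naive:.2f}  adjusted: {adjusted:.2f}  placebo: {placebo:.2f}")
```

The naive difference is inflated by the confounder, the adjusted estimate lands near the assumed effect of 1.0, and the placebo check comes out near zero, which is what we would hope for if the assumptions hold.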

Modeling through Causal Graphs
Converting domain knowledge into formal causal assumptions and relationships between the outcome and the variables is the key aspect of causal modeling.

For example, the causal graph below encodes the following causal relationships (arrows) and independence assumptions ($\nleftrightarrow$, with $|$ denoting conditioning):

  • $A \rightarrow C$, $B \rightarrow C$, $C \rightarrow D$, $D \rightarrow E$, $D \rightarrow F$
  • $A \nleftrightarrow B$, $D \nleftrightarrow A | C$, $E \nleftrightarrow C | D$, $\cdots$

Causal Graph

It is practically impossible to infer this graph from a dataset alone. The assumptions are encoded by the missing edges and by the direction of the edges.
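
Going the other way is easy: given a graph, we can simulate data from it and check that the encoded independencies show up. The sketch below assumes a linear Gaussian version of the graph (unit coefficients and unit noise are made up), and uses partial correlation as a stand-in for conditional independence, which is only valid under those assumptions. Node $F$ is omitted for brevity.

```python
import random
from statistics import mean


def corr(u, v):
    """Pearson correlation of two equally long lists."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def residual(u, given):
    """Part of u that a simple linear regression on `given` does not explain."""
    mu, mg = mean(u), mean(given)
    slope = (sum((a - mu) * (g - mg) for a, g in zip(u, given))
             / sum((g - mg) ** 2 for g in given))
    return [(a - mu) - slope * (g - mg) for a, g in zip(u, given)]


rng = random.Random(42)
n = 20_000
A = [rng.gauss(0, 1) for _ in range(n)]
B = [rng.gauss(0, 1) for _ in range(n)]
C = [a + b + rng.gauss(0, 1) for a, b in zip(A, B)]  # A -> C, B -> C
D = [c + rng.gauss(0, 1) for c in C]                  # C -> D
E = [d + rng.gauss(0, 1) for d in D]                  # D -> E

corr_ab = corr(A, B)                                       # no edge between A and B
corr_ec = corr(E, C)                                       # dependent through D
partial_ec_given_d = corr(residual(E, D), residual(C, D))  # E vs C once D is known

print(f"corr(A, B)     = {corr_ab:+.3f}")
print(f"corr(E, C)     = {corr_ec:+.3f}")
print(f"corr(E, C | D) = {partial_ec_given_d:+.3f}")
```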

We have discussed out-of-distribution (OOD) generalization through inductive biases several times in the quest for artificial general intelligence. For causal models, OOD errors have been observed to be lower than for correlational models. This is because $P(Y|X_c)$ is invariant across different distributions unless the true data-generating process for $Y$ changes.
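
A small simulation illustrates the invariance argument. Everything here is an assumed toy setup: $y$ depends only on the causal feature `x_c`, while a second feature `x_s` is a noisy effect of $y$ whose sign flips in the shifted distribution. We fit least squares on both features (in the spirit of Eq. 2) and on `x_c` alone (Eq. 3), then compare errors.

```python
import random


def make_data(n, spurious_sign, seed):
    """y depends only on x_c; x_s is a noisy effect of y whose sign can shift."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x_c = rng.gauss(0.0, 1.0)
        y = 2.0 * x_c + rng.gauss(0.0, 0.5)
        x_s = spurious_sign * y + rng.gauss(0.0, 0.5)
        rows.append((x_c, x_s, y))
    return rows


def fit(rows, use_spurious):
    """Least squares without intercept on (x_c, x_s), or on x_c alone."""
    sxx = sum(x_c * x_c for x_c, _, _ in rows)
    sxy = sum(x_c * y for x_c, _, y in rows)
    if not use_spurious:
        return (sxy / sxx, 0.0)
    szz = sum(x_s * x_s for _, x_s, _ in rows)
    sxz = sum(x_c * x_s for x_c, x_s, _ in rows)
    szy = sum(x_s * y for _, x_s, y in rows)
    det = sxx * szz - sxz * sxz  # solve the 2x2 normal equations by Cramer's rule
    return ((sxy * szz - sxz * szy) / det, (sxx * szy - sxz * sxy) / det)


def mse(weights, rows):
    a, b = weights
    return sum((a * x_c + b * x_s - y) ** 2 for x_c, x_s, y in rows) / len(rows)


train = make_data(5_000, spurious_sign=+1.0, seed=0)
shifted = make_data(5_000, spurious_sign=-1.0, seed=1)  # out-of-distribution data

correlational = fit(train, use_spurious=True)  # uses every feature
causal = fit(train, use_spurious=False)        # uses x_c only

for name, w in (("correlational", correlational), ("causal", causal)):
    print(f"{name:>13}: train MSE {mse(w, train):.3f}, shifted MSE {mse(w, shifted):.3f}")
```

The correlational model wins on the training distribution and loses badly once the spurious relationship changes, while the causal model's error stays about the same.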

Explanation and Fairness

Causal reasoning provides meaningful definitions of machine learning explanation and fairness, which help achieve responsible and trustworthy models.

Counterfactual explanation: Given the current prediction of an ML model, which features should be changed to flip the model’s outcome?
For example, a model predicts admission: REJECT for a student who applied to Ivy League schools, based on SAT scores. A counterfactual explanation would state what SAT score is needed to change the outcome to admission: ACCEPT.
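
As a sketch of the idea, take a hypothetical one-feature admission model and search for the smallest SAT score that flips its decision. The logistic weights, the step size and the function names are made up for illustration.

```python
import math

# Hypothetical admission model: logistic score on SAT only (weights are made up).
WEIGHT, BIAS = 0.02, -28.9  # decision boundary near SAT = 1445


def predict(sat):
    probability = 1.0 / (1.0 + math.exp(-(WEIGHT * sat + BIAS)))
    return "ACCEPT" if probability >= 0.5 else "REJECT"


def counterfactual_sat(sat, step=10, max_sat=1600):
    """Smallest SAT score, searched in `step` increments, that yields ACCEPT."""
    candidate = sat
    while candidate <= max_sat:
        if predict(candidate) == "ACCEPT":
            return candidate
        candidate += step
    return None  # no score within range flips the outcome


current_sat = 1300
needed_sat = counterfactual_sat(current_sat)
print(f"admission: {predict(current_sat)} at SAT {current_sat}; "
      f"ACCEPT would need SAT {needed_sat}")
```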

Counterfactual fairness: If the counterfactual explanations the model provides involve only sensitive attributes of a person, such as ethnicity, gender or color, we can infer that the model is biased.


An ML model is fair if the probability of its output remains invariant to any changes in the sensitive
attribute, keeping all non-descendants of the sensitive attribute fixed.
- Kusner et al. 2017

For example, if a model predicts admission: REJECT for a student and admission: ACCEPT for the same student when only the color attribute is changed, then the model is biased.
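
A simplified version of this check can be sketched as follows. Both models, the group labels and the threshold are hypothetical. Note the simplifying assumption: every other feature is held fixed, i.e. treated as a non-descendant of the sensitive attribute, whereas the full definition needs a causal model to propagate the change to descendants.

```python
def biased_model(applicant):
    """Hypothetical model that (wrongly) uses the sensitive attribute."""
    bonus = 50 if applicant["color"] == "group_a" else 0
    return "ACCEPT" if applicant["sat"] + bonus >= 1450 else "REJECT"


def fair_model(applicant):
    """Hypothetical model that looks at the SAT score only."""
    return "ACCEPT" if applicant["sat"] >= 1450 else "REJECT"


def is_invariant(model, applicant, attribute, values):
    """True if the prediction is the same for every value of `attribute`,
    with all other features held fixed."""
    outcomes = {model({**applicant, attribute: value}) for value in values}
    return len(outcomes) == 1


student = {"sat": 1420, "color": "group_b"}
groups = ["group_a", "group_b"]

print("biased model invariant:", is_invariant(biased_model, student, "color", groups))
print("fair model invariant:  ", is_invariant(fair_model, student, "color", groups))
```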

Forecasting the Counterfactual World

Disclaimer: This experiment is not possible in the real world. Our intention is to understand Eq. 1 by creating a counterfactual world where there is no Covid, using the Facebook Prophet forecaster.

In this section, we take the NIFTY 50 financial index from Yahoo Finance and study causality in the context of the Covid lockdowns. Through a simple exploratory analysis, we identify the sudden change in the index and assume it is due to Covid: around mid-February 2020, pandemic awareness was spreading among people and markets across the world started reacting. This reaction led to a downward trend in the NIFTY 50 index.

Real World: We have the real world data
Counterfactual World: We shall forecast the counterfactual world NIFTY Index using Facebook Prophet.
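
Before turning to the real data, the fit-then-extrapolate idea can be sketched with a plain least-squares trend line standing in for Prophet. To be clear, this is a swapped-in, much cruder forecaster, and the index values are synthetic: steady growth followed by an assumed shock of 30 points.

```python
import random


def fit_trend(ys):
    """Ordinary least-squares line y = intercept + slope * t for t = 0..len(ys)-1."""
    n = len(ys)
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return y_mean - slope * t_mean, slope


rng = random.Random(7)
# Synthetic index (made-up numbers): 200 calm days, then 50 days after a shock.
pre_crisis = [100.0 + 0.5 * t + rng.gauss(0.0, 1.0) for t in range(200)]
crisis = [100.0 + 0.5 * t - 30.0 + rng.gauss(0.0, 1.0) for t in range(200, 250)]

# Fit on the pre-crisis window only, then extrapolate into the crisis window.
intercept, slope = fit_trend(pre_crisis)
counterfactual = [intercept + slope * t for t in range(200, 250)]

# Eq. 1 on the original scale: mean of the real world minus mean of the forecast.
effect = sum(crisis) / len(crisis) - sum(counterfactual) / len(counterfactual)
print(f"Estimated effect of the shock: {effect:.1f} index points")
```

The estimate is only as good as the forecaster's claim to represent the world without the shock, which is the same caveat that applies to the Prophet forecast.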

NIFTY 50

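import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime

try:
    from prophet import Prophet    # package name since Prophet 1.0
except ImportError:
    from fbprophet import Prophet  # older releases, as used when this post was written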

pd.options.mode.chained_assignment = None  # default='warn'
nfty_50_df = pd.read_csv("./nse_50_yf.csv", parse_dates=['Date'])
nfty_50_df.head()

         Date         Open         High          Low        Close    Adj Close    Volume
0  2016-03-04  7505.399902  7505.899902  7444.100098  7485.350098  7485.350098  281700.0
1  2016-03-08  7486.399902  7527.149902  7442.149902  7485.299805  7485.299805  257000.0
2  2016-03-09  7436.100098  7539.000000  7424.299805  7531.799805  7531.799805  245100.0
3  2016-03-10  7545.350098  7547.100098  7447.399902  7486.149902  7486.149902  224700.0
4  2016-03-11  7484.850098  7543.950195  7460.600098  7510.200195  7510.200195  198700.0

nfty_50_df = nfty_50_df.sort_values("Date")
plt.figure(figsize=(15, 5))
sns.set_theme()  # seaborn default look; the 'seaborn' Matplotlib style name is deprecated since Matplotlib 3.6
plt.plot(nfty_50_df.Date, nfty_50_df.Close)
plt.xlabel("Date")
plt.ylabel("NFTY Index")
plt.title("NFTY 50, 2016 - 2021")
Text(0.5, 1.0, 'NFTY 50, 2016 - 2021')

png

Data Preprocessing and Exploration

From the data, we identified that the market started reacting sharply to the Covid crisis from 16th Feb 2020 onwards. Covid news is a confounding variable that the market was not aware of before that time. What we do here is travel backward in time by splitting the data into no-crisis and crisis days and labeling them.

CRISIS_CUTOFF = datetime.strptime('2020-02-16', '%Y-%m-%d')
DAY_OF_DELUGE = datetime.strptime('2020-03-25', '%Y-%m-%d')
# Label every row once, then split into no-crisis and crisis periods.
# Taking copies avoids writing into slices of the original frame.
nfty_50_df["CovidCrisis"] = nfty_50_df.Date >= CRISIS_CUTOFF
crisis_df = nfty_50_df[nfty_50_df.CovidCrisis].copy()
no_crisis_df = nfty_50_df[~nfty_50_df.CovidCrisis].copy()
no_crisis_df.head()

         Date         Open         High          Low        Close    Adj Close    Volume  CovidCrisis
0  2016-03-04  7505.399902  7505.899902  7444.100098  7485.350098  7485.350098  281700.0        False
1  2016-03-08  7486.399902  7527.149902  7442.149902  7485.299805  7485.299805  257000.0        False
2  2016-03-09  7436.100098  7539.000000  7424.299805  7531.799805  7531.799805  245100.0        False
3  2016-03-10  7545.350098  7547.100098  7447.399902  7486.149902  7486.149902  224700.0        False
4  2016-03-11  7484.850098  7543.950195  7460.600098  7510.200195  7510.200195  198700.0        False

Facebook Prophet Forecasting

Let us use Prophet, the sophisticated forecasting tool from Facebook, to predict the market dynamics of the counterfactual world using data from the no-Covid-crisis period. We feed 4 years of historical data (i.e. from 2016 to 2020) to predict the Covid crisis period (i.e. March 2020 - June 2020).

forecast_df = no_crisis_df[["Date", "Close"]]
forecast_df = forecast_df.rename(columns = {
    'Date': 'ds', "Close": 'y'
})

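m = Prophet()
m.fit(forecast_df)
# Predict on the actual trading dates (history plus crisis period) so that every
# forecast row lines up with a real observation. make_future_dataframe adds
# consecutive calendar days by default, which drift away from the trading calendar.
future = nfty_50_df[["Date"]].rename(columns={"Date": "ds"})
forecast = m.predict(future)
INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
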
figure = m.plot(forecast, xlabel='Date', ylabel='Close', figsize=(15, 5))

png

Compare the forecast with the actuals. The counterfactual world is quite stable and predictable, with low or insignificant volatility.

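# Align forecasts to observations by date rather than by row position.
nfty_50_df["Forecast"] = nfty_50_df["Date"].map(forecast.set_index("ds")["yhat"])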
plt.figure(figsize=(15, 5))
sns.lineplot(data=nfty_50_df, x="Date", y="Close", label="Real World")
sns.lineplot(data=nfty_50_df, x="Date", y="Forecast", label="Counterfactual World")
plt.title("Real World vs Counterfactual World - NFTY 50")
Text(0.5, 1.0, 'Real World vs Counterfactual World - NFTY 50')

png

Did the Covid Crisis Lead to Wealth Creation?

To answer our question, we turn to our causal effect equation $\overbrace{ E[ Y|do(X=1)]}^{\text{ Covid Crisis }} - \underbrace{ E[Y|do(X=0)] }_{\text{No Covid}}$ and carry out the following steps

  • Remove the null values caused by holidays or days with no recorded data
  • Normalize the forecasted and the actual NIFTY 50 index
  • Calculate the causal effect using Eq. 1

crisis_df = nfty_50_df[nfty_50_df["Date"] >= CRISIS_CUTOFF]
plt.figure(figsize=(20, 5))
sns.lineplot(data=crisis_df, x="Date", y="Close", label="Real World")
sns.lineplot(data=crisis_df, x="Date", y="Forecast", label="Counterfactual World")

plt.title("Real World vs Counterfactual World - NFTY 50")
Text(0.5, 1.0, 'Real World vs Counterfactual World')

png

crisis_df.isnull().sum()
crisis_df = crisis_df.dropna()  # reassign instead of modifying a slice in place
crisis_df.isnull().sum()
Date           0
Open           0
High           0
Low            0
Close          0
Adj Close      0
Volume         0
CovidCrisis    0
Forecast       0
dtype: int64
y_hat = crisis_df["Forecast"].values
normalized_y_hat = (y_hat - np.mean(y_hat)) / np.std(y_hat)

y = crisis_df["Close"].values
normalized_y = (y - np.mean(y)) / np.std(y)
causal_effect = np.sum(normalized_y - normalized_y_hat)
causal_effect, np.allclose(causal_effect, 0)
(-1.0516032489249483e-12, True)

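Our calculated causal effect is approximately $-1.05 \times 10^{-12}$, i.e. zero to floating-point precision, which we read as no evidence of wealth creation during the crisis period and no significant sign of losses either. One caveat: both series are standardized to zero mean before the subtraction, so the summed difference is zero by construction; sizing an actual gain or loss would need the comparison on the original index scale. This process can be reused as a retrospective scheme to find short- and mid-term gains or losses.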
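
One thing worth checking about the calculation above: a z-scored series has zero mean, so the sum of the differences between two z-scored series is zero for any pair of inputs. The sketch below shows this on made-up numbers (not the NIFTY data) and contrasts it with Eq. 1 applied on the original scale. The helper `zscore` and both series are illustrative assumptions.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Made-up series for illustration only.
actual = [100.0, 90.0, 80.0, 95.0, 110.0, 120.0]       # "real world"
forecast = [101.0, 102.0, 103.0, 104.0, 105.0, 106.0]  # "counterfactual world"

# Sum of standardized differences: zero up to rounding, whatever the inputs.
standardized_sum = sum(a - f for a, f in zip(zscore(actual), zscore(forecast)))

# Eq. 1 on the original scale: difference of the two means.
raw_effect = mean(actual) - mean(forecast)

print(f"standardized sum: {standardized_sum:.2e}")
print(f"raw-scale effect: {raw_effect:.2f}")
```

On the original scale the difference of means carries the sign and size of the gap; expressing it relative to the forecast mean would give a percentage gain or loss.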

Annexe: Exploratory Data Analysis

The Great Fall

The fall followed the spread of pandemic awareness and the lockdowns from mid-February 2020.

deluge_df = nfty_50_df[(nfty_50_df["Date"] >= datetime.strptime('2020-01-01', '%Y-%m-%d')) & (nfty_50_df["Date"] < datetime.strptime('2020-07-01', '%Y-%m-%d'))]
plt.figure(figsize=(15, 10))
sns.lineplot(data=deluge_df, x="Date", y="Close", label="Real World")
sns.lineplot(data=deluge_df, x="Date", y="Forecast", label="Counterfactual World")

plt.title("Real World vs Counterfactual World")
Text(0.5, 1.0, 'Real World vs Counterfactual World')

png

Right Tail in Distribution

The density plot gives us an idea of the market dynamics. The observations are:

  • Until the crisis began, the index seldom crossed the 13000 mark
  • There are really bad days, like 24th/25th March 2020, when it dropped to the ~8000 mark
  • However, as awareness of the pandemic spread, a market surge is observed
  • A long right tail of notably dense trading appeared, concentrated around the 15000 mark
  • During the Covid crisis period, a bimodal density distribution is observed, with two peaks of almost similar height.

# Plot the distribution of the closing price
# (histplot with kde=True replaces the deprecated distplot)
plt.figure(figsize=(15, 5))
sns.histplot(nfty_50_df.Close, kde=True, stat="density", color='b')

png

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 5))
# Left: no-crisis period, right: crisis period (histplot replaces the deprecated distplot)
sns.histplot(no_crisis_df.Close, kde=True, stat="density", color='b', ax=ax1)
sns.histplot(crisis_df.Close, kde=True, stat="density", color='b', ax=ax2)

png

High Volatility

The violin plot shows the fluctuation in the index, where the crisis period is quite volatile but the area remains almost the same. Again, correlation does not imply causation, i.e. the Covid crisis may not have had an impact on overall wealth creation. Is it clever to say that anything that goes up has to come down and vice versa… that Covid has no impact and life moves on? However, historical studies suggest that every crisis is followed by a period of prosperity.
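
Volatility can also be put into a number rather than read off a plot. A common choice, assumed here rather than taken from the analysis above, is the standard deviation of daily log returns. The sketch uses simulated prices with made-up daily volatilities for a calm and a crisis regime.

```python
import math
import random
from statistics import pstdev


def daily_log_returns(prices):
    """log(P_t / P_{t-1}) for consecutive prices."""
    return [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]


def simulate_prices(start, daily_sigma, n_days, rng):
    """Random-walk prices with an assumed daily log-return volatility."""
    prices = [start]
    for _ in range(n_days):
        prices.append(prices[-1] * math.exp(rng.gauss(0.0, daily_sigma)))
    return prices


rng = random.Random(3)
calm = simulate_prices(10_000.0, daily_sigma=0.008, n_days=500, rng=rng)
crisis = simulate_prices(calm[-1], daily_sigma=0.025, n_days=250, rng=rng)

calm_vol = pstdev(daily_log_returns(calm))
crisis_vol = pstdev(daily_log_returns(crisis))
print(f"calm volatility:   {calm_vol:.4f}")
print(f"crisis volatility: {crisis_vol:.4f}")
```

Applied to the real data, the same two numbers could be computed from the `Close` column of the no-crisis and crisis frames.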

plt.figure(figsize=(15, 5))
sns.violinplot(y='Close', x='CovidCrisis', data=nfty_50_df)
<matplotlib.axes._subplots.AxesSubplot at 0x7fbc1b33a9d0>

png

Inference

Causal inferencing is a challenging process. It is one of the key research areas in machine learning, with schemes suggesting causality-based deep neural network architectures. In this post, we explored causality at a very high level and simulated a counterfactual world using Facebook Prophet. We also found no evidence of prosperity during the crisis period. In future posts, we shall explore sophisticated causal modelling frameworks like DiCE, CDT, etc.

Reference