Machine learning course for VIVE

Exercise set 8: Heterogeneous treatment effects

In this exercise set we will be working with estimation of conditional average treatment effects (CATEs), assuming selection on observables.
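Formally, the conditional average treatment effect is defined in the potential outcomes framework as

τ(x) = E[Y(1) − Y(0) | X = x],

where Y(1) and Y(0) are the potential outcomes with and without treatment. Selection on observables means that (Y(1), Y(0)) ⊥ T | X, so that conditioning on the covariates X suffices to identify τ(x).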

First we will use the econml package in Python to estimate a double machine learning causal forest, and in the second part we will use the grf package in R to estimate a causal forest.

In this exercise we will be using data from LaLonde, R. J. (1986), "Evaluating the econometric evaluations of training programs with experimental data", The American Economic Review, 604-620, a job training field experiment in which we will examine possible treatment effect heterogeneity. The data were downloaded from NYU but are supplied to you in a slightly cleaned csv format. The outcome of interest is real earnings in 1978, and we assume selection on observables and overlap.

Python

In this first part of the exercise, we will be utilizing Python and econml.

First we load some packages and the data.

import matplotlib.pyplot as plt
import numpy as np 
import pandas as pd 
import seaborn as sns

np.random.seed(73)

plt.style.use('seaborn-whitegrid')

%matplotlib inline

df = pd.read_csv('nsw.csv', index_col=0)

df.describe()
            treat         age   education       black    hispanic     married    nodegree          re75          re78
count  705.000000  705.000000  705.000000  705.000000  705.000000  705.000000  705.000000    705.000000    705.000000
mean     0.415603   24.607092   10.272340    0.795745    0.107801    0.164539    0.774468   3116.271386   5586.166074
std      0.493176    6.666376    1.720412    0.403443    0.310350    0.371027    0.418229   5104.566478   6269.582709
min      0.000000   17.000000    3.000000    0.000000    0.000000    0.000000    0.000000      0.000000      0.000000
25%      0.000000   19.000000    9.000000    1.000000    0.000000    0.000000    1.000000      0.000000      0.000000
50%      0.000000   23.000000   10.000000    1.000000    0.000000    0.000000    1.000000   1122.621000   4159.919000
75%      1.000000   27.000000   11.000000    1.000000    0.000000    0.000000    1.000000   4118.681000   8881.665000
max      1.000000   55.000000   16.000000    1.000000    1.000000    1.000000    1.000000  37431.660000  60307.930000
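Since we will be assuming selection on observables and overlap, it can be worth a quick look at how the covariates differ across treatment arms before modelling. A minimal sketch (not part of the exercises) using the df loaded above:

# Rough balance check: compare column means by treatment status
df.groupby('treat').mean()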

Exercise 1.1

Subset the treatment, outcome and covariates from the DataFrame as T, y and X, respectively.

Hints:

A DataFrame supports the method .drop(), which can drop multiple columns at once.

# Your answer here
### BEGIN SOLUTION

T = df.treat
y = df.re78
X = df.drop(['treat', 're78'], axis='columns')

### END SOLUTION

Exercise 1.2

Estimate a CausalForest using the econml package. Make sure that you use the splitting criterion as in the generalized random forest, but otherwise use default parameters.

Hints:

The documentation for the CausalForest can be found here

The splitting criterion is handled by the criterion parameter

# Your answer here
### BEGIN SOLUTION

from econml.grf import CausalForest
cf = CausalForest(criterion='het')
cf.fit(X, T, y)

### END SOLUTION
CausalForest(criterion='het')

Exercise 1.3

Predict out-of-bag estimates of the CATE, and create a histogram of these. Do you observe heterogeneity?

Hints:

The documentation for the CausalForest can be found here

Out of bag is sometimes shortened to oob

You can create a histogram using sns.histplot

# Your answer here
### BEGIN SOLUTION

tau_cf = cf.oob_predict(X)

sns.histplot(tau_cf, legend=False)
plt.show()

### END SOLUTION
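If the histogram is hard to judge by eye, a quick numerical summary of the spread of the OOB CATE estimates can supplement it. A minimal sketch using the tau_cf array from above:

# Summarize the spread of the out-of-bag CATE estimates
print(f"mean: {tau_cf.mean():.1f}, std: {tau_cf.std():.1f}")
print("5th/95th percentiles:", np.percentile(tau_cf, [5, 95]))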

Exercise 1.4

Estimate a CausalForestDML using the econml package. Make sure that you 1) use the splitting criterion as in the generalized random forest, 2) interpret the treatment as a discrete treatment and 3) estimate a thousand trees, but otherwise use default parameters.

Hints:

The documentation for the CausalForestDML can be found here

# Your code here
### BEGIN SOLUTION

from econml.dml import CausalForestDML

cf_dml = CausalForestDML(
    discrete_treatment=True,
    n_estimators=1000,
    criterion='het',
)

cf_dml.fit(Y=y, T=T, X=X)

### END SOLUTION
<econml.dml.causal_forest.CausalForestDML at 0x1ff165a1c30>
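As an aside, CausalForestDML residualizes the outcome and the treatment with cross-fitted nuisance models before fitting the forest; by default these first-stage models are chosen automatically. If one wants explicit control, they can be passed via the model_y and model_t parameters. A minimal sketch (the random forests here are an illustrative assumption, not a recommendation):

from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Explicit first-stage models: outcome regression and propensity model
cf_dml_explicit = CausalForestDML(
    model_y=RandomForestRegressor(),   # models E[Y | X, W]
    model_t=RandomForestClassifier(),  # models P(T = 1 | X, W)
    discrete_treatment=True,
    n_estimators=1000,
    criterion='het',
)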

Exercise 1.5

Report the doubly robust average treatment effect.

Hints:

The documentation for the CausalForestDML can be found here

A function which summarizes the model findings is available

# Your code here
### BEGIN SOLUTION

cf_dml.summary()

### END SOLUTION
Population summary results are available only if `cache_values=True` at fit time!

Doubly Robust ATE on Training Data Results
     point_estimate   stderr   zstat   pvalue   ci_lower   ci_upper
ATE         805.679  487.542   1.653    0.098   -149.886   1761.244

Doubly Robust ATT(T=0) on Training Data Results
     point_estimate   stderr   zstat   pvalue   ci_lower   ci_upper
ATT         808.062  473.896   1.705    0.088   -120.757   1736.881

Doubly Robust ATT(T=1) on Training Data Results
     point_estimate   stderr   zstat   pvalue   ci_lower   ci_upper
ATT         802.328   965.46   0.831    0.406  -1089.938   2694.594
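Note that the doubly robust ATE reported above is not the same as simply averaging the forest's own CATE predictions, though the two should be close. A minimal sketch of the latter for comparison:

# Average of the forest's CATE predictions; compare with the DR ATE above
print(cf_dml.effect(X).mean())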

Exercise 1.6

Examine what variables drive the heterogeneity using the split-based feature importance method.

Hints:

The documentation for the CausalForestDML can be found here

The feature importances and feature names are available through a method

# Your code here
### BEGIN SOLUTION

# Variable name and importance, printed and unsorted
for name, importance in zip(cf_dml.cate_feature_names(), cf_dml.feature_importances()):
    print(f"{name: <10} {importance:.2}")

# Sorted and plotted
importances = pd.Series(cf_dml.feature_importances(), index=X.columns).sort_values(ascending=False)
importances.plot.bar()
plt.show()

### END SOLUTION
age        0.34
education  0.18
black      0.027
hispanic   0.013
married    0.046
nodegree   0.018
re75       0.37

Exercise 1.7

Calculate the SHAP values for the model.

Hints:

There’s an example on the following page

# Your code here
### BEGIN SOLUTION

shap_values = cf_dml.shap_values(X)

### END SOLUTION

Exercise 1.8

Create a bar plot of the feature importance using SHAP values. Is the top variable the same as in the split-based one?

Hints:

There’s an example on the following page

The summary plot has a plot_type parameter

# Your code here
### BEGIN SOLUTION

import shap

# The returned SHAP values are nested by outcome ('re78') and treatment level ('treat_1')
unnested_shap_values = shap_values['re78']['treat_1']

shap.summary_plot(unnested_shap_values, plot_type="bar")

### END SOLUTION

Exercise 1.9

Create a summary plot of the feature importance using SHAP values. In what way does the top variable moderate the CATE?

Hints:

There’s an example on the following page

The feature value colourbar might disappear, in which case the following two lines of code might help:

plt.gcf().axes[-1].set_aspect(100)

plt.gcf().axes[-1].set_box_aspect(100)

# Your code here
### BEGIN SOLUTION

shap.summary_plot(unnested_shap_values, show=False)
plt.gcf().axes[-1].set_aspect(100)
plt.gcf().axes[-1].set_box_aspect(100)

### END SOLUTION

Exercise 1.10

Create a shallow decision tree of depth 2 to explain the CATE as a function of the inputs using SingleTreeCateInterpreter. Does the tree split on the variables you expected it to?

Hints:

There’s an example on the following page

# Your code here
### BEGIN SOLUTION

from econml.cate_interpreter import SingleTreeCateInterpreter
%matplotlib inline
treeCATE = SingleTreeCateInterpreter(max_depth=2)
treeCATE.interpret(cf_dml, X)

fig, ax = plt.subplots(1, figsize=(20, 10))

treeCATE.plot(ax=ax, fontsize=16, feature_names=X.columns)

### END SOLUTION
Criterion 'mse' was deprecated in v1.0 and will be removed in version 1.2. Use `criterion='squared_error'` which is equivalent.

Exercise 1.11

Create a shallow policy tree of depth 2 to explain the CATE as a function of the inputs using SingleTreePolicyInterpreter. Who does the model target?

Hints:

There’s an example on the following page

# Your code here
### BEGIN SOLUTION

from econml.cate_interpreter import SingleTreePolicyInterpreter
treePolicy = SingleTreePolicyInterpreter(max_depth=2)
treePolicy.interpret(cf_dml, X)

fig, ax = plt.subplots(1, figsize=(20, 10))

treePolicy.plot(ax=ax, fontsize=16, feature_names=X.columns)

### END SOLUTION

R

In the second part of the exercise, we will be utilizing R and grf. I supply the R code as commented-out lines (# for code, ## for comments); it will not run in Python if you uncomment it. One can run R code from Python using rpy2, but I generally recommend using R directly rather than rpy2, due to the complexity of troubleshooting if rpy2 fails.
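For completeness, a minimal sketch of what calling grf through rpy2 might look like, assuming rpy2 is installed and the grf package is available in the local R installation (untested here, in line with the recommendation above):

from rpy2.robjects import numpy2ri
from rpy2.robjects.packages import importr

numpy2ri.activate()   # auto-convert numpy arrays to R objects
grf = importr('grf')  # requires grf to be installed in R

# X, T, y are the pandas objects from the Python part above
cf_r = grf.causal_forest(X.values, y.values, T.values)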

The many functions available in grf are listed in its reference, and the package has a lot of great tutorials, accessible at the top of its page.

First we load the data.

#library(ggplot2)
#library(grf)

## Load data
# df = read.csv("nsw.csv")

## Subset target, treatment and covariates
#Y = df$re78
#W = df$treat
#X = subset(df, select= c(-treat, -re78, -X))

Exercise 2.1

Estimate a causal_forest using grf.

Hints:

The documentation for causal_forest can be found here

### BEGIN SOLUTION

#cf = causal_forest(X, Y, W)

### END SOLUTION

Exercise 2.2

Assess the overlap assumption by creating a histogram of the estimated treatment propensities.

Hints:

An example can be seen here

### BEGIN SOLUTION

## Treatment propensities
#e.hat = cf$W.hat

## Plot
#hist(e.hat)

## Bounded away from 1 and 0, looks good

### END SOLUTION

Exercise 2.3

Estimate the doubly robust average treatment effect. Do we find the same result as previously?

Hints:

Can you find a function for average treatment effect estimation in the reference?

### BEGIN SOLUTION

#average_treatment_effect(cf)

## Output
## estimate  std.err 
## 782.0347 508.0461 

## Approximately the same as before -- good!

### END SOLUTION

Exercise 2.4

Estimate out-of-bag CATEs.

Hints:

How to predict with a forest can be seen here

### BEGIN SOLUTION

## Calculate out-of-bag CATE
#tau.hat = predict(cf)$predictions

### END SOLUTION

Exercise 2.5

Test whether heterogeneity exists in the CATEs using the median split-based test.

Hints:

An example can be seen here

### BEGIN SOLUTION

## Split on median
#high.effect = tau.hat > median(tau.hat)

## Calculate average treatment effects
#ate.high = average_treatment_effect(cf, subset = high.effect)
#ate.low = average_treatment_effect(cf, subset = !high.effect)

## Calculate difference and CI
#CATE_difference = ate.high[["estimate"]] - ate.low[["estimate"]] 
#CI = ate.high[["estimate"]] - ate.low[["estimate"]] +
#  c(-1, 1) * qnorm(0.975) * sqrt(ate.high[["std.err"]]^2 + ate.low[["std.err"]]^2)

## Report
#print(CATE_difference)
#print(CI)

## [1] 669.8228
## [1] -1340.735  2680.381

### END SOLUTION

Exercise 2.6

Test whether heterogeneity exists in the CATEs using the RATE. NOTE: Here we should do out-of-sample predictions, which I ignore in this exercise! A sketch of the out-of-sample version follows the solution below.

Hints:

How to estimate the RATE can be seen here

### BEGIN SOLUTION

## Calculate RATE
#rate = rank_average_treatment_effect(cf, tau.hat, target = "AUTOC")

## Results
#print(rate)
## estimate  std.err             target
## 29.55353 429.4161 priorities | AUTOC

### END SOLUTION
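For reference, a sketch of the out-of-sample version in the same commented-out R style, following the pattern in the grf tutorials (illustrative, not run here):

## Split the sample and train one forest per half
#train = sample(1:nrow(X), nrow(X) / 2)
#cf.train = causal_forest(X[train, ], Y[train], W[train])
#cf.eval = causal_forest(X[-train, ], Y[-train], W[-train])

## Rank held-out units with the training forest, evaluate the RATE on the eval forest
#priorities = predict(cf.train, X[-train, ])$predictions
#rate.oos = rank_average_treatment_effect(cf.eval, priorities)
#print(rate.oos)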

Exercise 2.8

Examine what variables drive the heterogeneity using the split-based feature importance method.

Hints:

A function to calculate the variable importance can be found in the grf reference

### BEGIN SOLUTION

## Calculate variable importance
#varimp = variable_importance(cf)
## Print results
#print(varimp)
##            [,1]
## [1,] 0.29462337
## [2,] 0.15099209
## [3,] 0.03964410
## [4,] 0.01855247
## [5,] 0.07758364
## [6,] 0.02579878
## [7,] 0.39280555

## Same results as before, age and re75


### END SOLUTION

Exercise 2.9

Examine how the variables affect heterogeneity using the best linear projection.

Hints:

A function to estimate the best linear projection can be found in the grf reference

### BEGIN SOLUTION


## Rank vars and get best linear projection
#ranked.vars = order(varimp, decreasing = TRUE)
#best_lin_proj = best_linear_projection(cf, X[ranked.vars])

## Print results
#print(best_lin_proj)

## Results
## Best linear projection of the conditional average treatment effect.
## Confidence intervals are cluster- and heteroskedasticity-robust (HC3):
##
##                Estimate  Std. Error t value Pr(>|t|)
## (Intercept) -5117.83175  5808.15316 -0.8811 0.378542
## re75           -0.32801     0.10494 -3.1257 0.001847 **
## age            40.38223    80.12389  0.5040 0.614422
## education     481.73107   383.02820  1.2577 0.208925
## married      2617.70317  1420.71210  1.8425 0.065822 .
## black         491.84656  1608.76159  0.3057 0.759902
## nodegree      383.19631  1667.00113  0.2299 0.818259
## hispanic     -818.57994  2099.72022 -0.3899 0.696765
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

### END SOLUTION