fklearn.causal package
Subpackages
Submodules
fklearn.causal.debias module
fklearn.causal.debias.debias_with_double_ml
Frisch-Waugh-Lovell style debiasing with an ML model. To debias, we
- fit a regression ML model to predict the treatment from the confounders and take the out-of-fold residuals from this fit (debias step);
- fit a regression ML model to predict the outcome from the confounders and take the out-of-fold residuals from this fit (denoise step).
We then add back the average outcome and treatment so that their levels remain unchanged.
Returns a dataframe with the debiased columns, with suffix appended to their names.
Parameters: - df (Pandas DataFrame) – A Pandas DataFrame with treatment, outcome and confounder columns
- treatment_column (str) – The name of the column in df with the treatment.
- outcome_column (str) – The name of the column in df with the outcome.
- confounder_columns (list of str) – A list of the confounder columns present in df
- ml_regressor (Sklearn's RegressorMixin) – A regressor model that implements a fit and a predict method
- extra_params (dict) – The hyper-parameters for the model
- cv (int) – The number of folds used for out-of-fold (cross) prediction
- suffix (str) – A suffix to append to the names of the returned debiased columns.
- denoise (bool (Default=True)) – Whether to denoise the outcome using the confounders
- seed (int) – A seed for consistency in random computation
Returns: debiased_df – The original df dataframe with debiased columns added.
Return type: Pandas DataFrame
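The debias and denoise steps above can be sketched with plain NumPy and pandas. This is only an illustration of out-of-fold residualization, not fklearn's implementation: the helper names out_of_fold_predictions and debias_sketch are hypothetical, and an ordinary least-squares fit stands in for the ML regressor so that the example needs no extra dependencies.

```python
import numpy as np
import pandas as pd


def out_of_fold_predictions(X, y, cv, seed):
    """Predict each row with a model fit on the other folds (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), cv)
    preds = np.empty(len(y))
    X1 = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        # Least squares stands in for the ML regressor here (an assumption).
        coef, *_ = np.linalg.lstsq(X1[train_mask], y[train_mask], rcond=None)
        preds[test_idx] = X1[test_idx] @ coef
    return preds


def debias_sketch(df, treatment, outcome, confounders, cv=5, suffix="_debiased", seed=0):
    X = df[confounders].to_numpy(dtype=float)
    out = df.copy()
    for col in (treatment, outcome):
        y = df[col].to_numpy(dtype=float)
        residuals = y - out_of_fold_predictions(X, y, cv, seed)
        # Add the mean back so the level of the column is unchanged.
        out[col + suffix] = residuals + y.mean()
    return out


rng = np.random.default_rng(42)
n = 2000
x = rng.normal(size=n)                      # confounder
t = 2.0 * x + rng.normal(size=n)            # treatment depends on the confounder
y = 1.0 * t + 3.0 * x + rng.normal(size=n)  # simulated effect of t on y is 1.0
df = pd.DataFrame({"x": x, "t": t, "y": y})

debiased = debias_sketch(df, "t", "y", ["x"])
naive_slope = np.polyfit(df["t"], df["y"], 1)[0]
debiased_slope = np.polyfit(debiased["t_debiased"], debiased["y_debiased"], 1)[0]
```

On this simulated data the naive slope is inflated by the confounder, while the slope computed from the debiased columns lands close to the simulated effect of 1.0.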
fklearn.causal.debias.debias_with_fixed_effects
Returns a dataframe with the debiased columns, with suffix appended to their names.
This is equivalent to debiasing with regression where the formula is “C(x1) + C(x2) + …”. However, it is much more efficient than running such a dummy variable regression.
Parameters: - df (Pandas DataFrame) – A Pandas DataFrame with treatment, outcome and confounder columns
- treatment_column (str) – The name of the column in df with the treatment.
- outcome_column (str) – The name of the column in df with the outcome.
- confounder_columns (list of str) – Confounders are categorical groups we wish to explain away. Some examples are units (e.g. customers) and time (days, months…). We perform a group by on these columns, so they should not be continuous variables.
- suffix (str) – A suffix to append to the names of the returned debiased columns.
- denoise (bool (Default=True)) – Whether to denoise the outcome using the confounders
Returns: debiased_df – The original df dataframe with debiased columns added.
Return type: Pandas DataFrame
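The equivalence with a dummy variable regression can be checked on a small example. The sketch below assumes a single categorical confounder (a made-up customer column) and compares group-wise demeaning with an explicit regression on one indicator per group. It illustrates the idea only and does not claim to reproduce how fklearn handles several confounder columns.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "customer": rng.integers(0, 20, size=n),  # categorical confounder
    "t": rng.normal(size=n),
})
df["t"] = df["t"] + df["customer"] * 0.5      # treatment level differs by customer

# Group-by approach: remove each group's mean, then add back the overall mean.
group_mean = df.groupby("customer")["t"].transform("mean")
t_groupby = df["t"] - group_mean + df["t"].mean()

# Dummy-variable regression: regress t on one indicator per customer.
dummies = pd.get_dummies(df["customer"]).to_numpy(dtype=float)
coef, *_ = np.linalg.lstsq(dummies, df["t"].to_numpy(), rcond=None)
t_dummy = df["t"].to_numpy() - dummies @ coef + df["t"].mean()

# Largest absolute difference between the two debiased columns.
max_gap = np.abs(t_groupby.to_numpy() - t_dummy).max()
```

Both routes give the same debiased column up to floating-point error, but the group-by version never builds the wide indicator matrix, which is where the efficiency gain comes from.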
fklearn.causal.debias.debias_with_regression
Frisch-Waugh-Lovell style debiasing with linear regression. To debias, we
1) fit a linear model to predict the treatment from the confounders and take the residuals from this fit (debias step);
2) fit a linear model to predict the outcome from the confounders and take the residuals from this fit (denoise step).
We then add back the average outcome and treatment so that their levels remain unchanged.
Returns a dataframe with the debiased columns, with suffix appended to their names.
Parameters: - df (Pandas DataFrame) – A Pandas DataFrame with treatment, outcome and confounder columns
- treatment_column (str) – The name of the column in df with the treatment.
- outcome_column (str) – The name of the column in df with the outcome.
- confounder_columns (list of str) – A list of the confounder columns present in df
- suffix (str) – A suffix to append to the names of the returned debiased columns.
- denoise (bool (Default=True)) – Whether to denoise the outcome using the confounders
Returns: debiased_df – The original df dataframe with debiased columns added.
Return type: Pandas DataFrame
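The Frisch-Waugh-Lovell idea behind this function can be verified numerically. The sketch below uses NumPy least squares rather than fklearn itself; the residualize helper and the column names are hypothetical. It checks that the slope computed from the debiased columns matches the treatment coefficient of the full regression of the outcome on treatment and confounders.

```python
import numpy as np
import pandas as pd


def residualize(y, X):
    """Residuals of an OLS fit of y on X (with intercept), recentred on mean(y)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return y - X1 @ coef + y.mean()


rng = np.random.default_rng(1)
n = 1000
x1, x2 = rng.normal(size=n), rng.normal(size=n)
t = 0.8 * x1 - 0.5 * x2 + rng.normal(size=n)
y = 2.0 * t + 1.5 * x1 + x2 + rng.normal(size=n)
df = pd.DataFrame({"x1": x1, "x2": x2, "t": t, "y": y})

X = df[["x1", "x2"]].to_numpy()
df["t_debiased"] = residualize(df["t"].to_numpy(), X)  # debias step
df["y_debiased"] = residualize(df["y"].to_numpy(), X)  # denoise step

# Slope from the debiased columns.
fwl_slope = np.polyfit(df["t_debiased"], df["y_debiased"], 1)[0]

# Coefficient on t from the full regression y ~ 1 + t + x1 + x2.
full = np.column_stack([np.ones(n), t, x1, x2])
full_coef, *_ = np.linalg.lstsq(full, y, rcond=None)
```

Adding the means back shifts both columns by a constant, so it changes their levels but not the slope between them.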
fklearn.causal.debias.debias_with_regression_formula
Frisch-Waugh-Lovell style debiasing with linear regression, using an R-style formula to define the confounders. To debias, we
1) fit a linear model to predict the treatment from the confounders and take the residuals from this fit (debias step);
2) fit a linear model to predict the outcome from the confounders and take the residuals from this fit (denoise step).
We then add back the average outcome and treatment so that their levels remain unchanged.
Returns a dataframe with the debiased columns, with suffix appended to their names.
Parameters: - df (Pandas DataFrame) – A Pandas DataFrame with treatment, outcome and confounder columns
- treatment_column (str) – The name of the column in df with the treatment.
- outcome_column (str) – The name of the column in df with the outcome.
- confounder_formula (str) – An R formula modeling the confounders. Check https://www.statsmodels.org/dev/example_formulas.html for examples.
- suffix (str) – A suffix to append to the names of the returned debiased columns.
- denoise (bool (Default=True)) – Whether to denoise the outcome using the confounders
Returns: debiased_df – The original df dataframe with debiased columns added.
Return type: Pandas DataFrame
fklearn.causal.effects module
fklearn.causal.effects.exponential_coefficient_effect
Computes the exponential coefficient between the treatment and the outcome. Finds a1 in the following equation: outcome = exp(a0 + a1 treatment) + error
Parameters: - df (Pandas DataFrame) – A Pandas DataFrame with treatment and outcome columns.
- treatment_column (str) – The name of the treatment column in df.
- outcome_column (str) – The name of the outcome column in df.
Returns: effect – The exponential coefficient between the treatment and the outcome
Return type: float
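One simple way to estimate a1, assuming strictly positive outcomes, is to regress log(outcome) on the treatment. This reference does not say how fklearn fits the equation, so the function below is a hypothetical sketch of that log-linear shortcut rather than the library's method. With additive error the log-linear fit is only an approximation.

```python
import numpy as np
import pandas as pd


def exponential_coefficient_sketch(df, treatment_column, outcome_column):
    # Taking logs turns outcome = exp(a0 + a1 * treatment) into a linear model.
    log_outcome = np.log(df[outcome_column].to_numpy(dtype=float))
    treatment = df[treatment_column].to_numpy(dtype=float)
    slope, _intercept = np.polyfit(treatment, log_outcome, 1)
    return float(slope)


# Noiseless data generated with a0 = 0.3 and a1 = 1.2.
treatment = np.linspace(0.0, 2.0, 50)
df = pd.DataFrame({"t": treatment, "y": np.exp(0.3 + 1.2 * treatment)})
effect = exponential_coefficient_sketch(df, "t", "y")
```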
fklearn.causal.effects.linear_effect
Computes the linear coefficient from regressing the outcome on the treatment: cov(outcome, treatment)/var(treatment)
Parameters: - df (Pandas DataFrame) – A Pandas DataFrame with treatment and outcome columns.
- treatment_column (str) – The name of the treatment column in df.
- outcome_column (str) – The name of the outcome column in df.
Returns: effect – The linear coefficient from regressing the outcome on the treatment: cov(outcome, treatment)/var(treatment)
Return type: float
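The ratio cov(outcome, treatment)/var(treatment) is the slope of a simple least-squares line. The sketch below computes it directly with NumPy and compares it with np.polyfit; the function name linear_effect_sketch is hypothetical and only mirrors the formula given above.

```python
import numpy as np
import pandas as pd


def linear_effect_sketch(df, treatment_column, outcome_column):
    t = df[treatment_column].to_numpy(dtype=float)
    y = df[outcome_column].to_numpy(dtype=float)
    cov = np.cov(t, y)  # 2x2 sample covariance matrix of (t, y)
    return float(cov[0, 1] / cov[0, 0])


df = pd.DataFrame({"t": [1.0, 2.0, 3.0, 4.0], "y": [2.1, 3.9, 6.2, 7.8]})
effect = linear_effect_sketch(df, "t", "y")
ols_slope = np.polyfit(df["t"], df["y"], 1)[0]  # same slope via least squares
```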
fklearn.causal.effects.logistic_coefficient_effect
Computes the logistic coefficient between the treatment and the outcome. Finds a1 in the following equation: outcome = logistic(a0 + a1 treatment)
Parameters: - df (Pandas DataFrame) – A Pandas DataFrame with treatment and outcome columns.
- treatment_column (str) – The name of the treatment column in df.
- outcome_column (str) – The name of the outcome column in df.
Returns: effect – The logistic coefficient between the treatment and the outcome
Return type: float
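To make the equation concrete, a1 can be estimated for a binary outcome by maximum likelihood. The sketch below fits the two-parameter logistic model with Newton's method in NumPy. It is an illustration under that assumption; the function name logistic_coefficient_sketch and the fixed iteration count are hypothetical and not taken from fklearn.

```python
import numpy as np
import pandas as pd


def logistic_coefficient_sketch(df, treatment_column, outcome_column, n_iter=25):
    """Fit outcome = logistic(a0 + a1 * treatment) by Newton's method; return a1."""
    t = df[treatment_column].to_numpy(dtype=float)
    y = df[outcome_column].to_numpy(dtype=float)
    X = np.column_stack([np.ones(len(t)), t])
    beta = np.zeros(2)  # [a0, a1]
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return float(beta[1])


# Simulated binary outcomes with a0 = -0.5 and a1 = 1.5.
rng = np.random.default_rng(7)
n = 20000
t = rng.normal(size=n)
prob = 1.0 / (1.0 + np.exp(-(-0.5 + 1.5 * t)))
df = pd.DataFrame({"t": t, "y": rng.binomial(1, prob)})
effect = logistic_coefficient_sketch(df, "t", "y")
```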
fklearn.causal.effects.pearson_effect
Computes the Pearson correlation between the treatment and the outcome
Parameters: - df (Pandas DataFrame) – A Pandas DataFrame with treatment and outcome columns.
- treatment_column (str) – The name of the treatment column in df.
- outcome_column (str) – The name of the outcome column in df.
Returns: effect – The Pearson correlation between the treatment and the outcome
Return type: float
fklearn.causal.effects.spearman_effect
Computes the Spearman correlation between the treatment and the outcome
Parameters: - df (Pandas DataFrame) – A Pandas DataFrame with treatment and outcome columns.
- treatment_column (str) – The name of the treatment column in df.
- outcome_column (str) – The name of the outcome column in df.
Returns: effect – The Spearman correlation between the treatment and the outcome
Return type: float
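The difference between the Pearson and Spearman effects shows up on monotone but non-linear data. The sketch below computes both with NumPy and pandas, using the fact that Spearman correlation is the Pearson correlation of the ranks. The helper names are hypothetical and the code does not call fklearn.

```python
import numpy as np
import pandas as pd


def pearson_sketch(df, treatment_column, outcome_column):
    return float(np.corrcoef(df[treatment_column], df[outcome_column])[0, 1])


def spearman_sketch(df, treatment_column, outcome_column):
    # Spearman correlation is the Pearson correlation of the ranks.
    ranks = df[[treatment_column, outcome_column]].rank()
    return pearson_sketch(ranks, treatment_column, outcome_column)


treatment = np.arange(1.0, 11.0)
df = pd.DataFrame({"t": treatment, "y": np.exp(treatment)})  # monotone, non-linear
pearson = pearson_sketch(df, "t", "y")
spearman = spearman_sketch(df, "t", "y")
```

Because the outcome increases strictly with the treatment, the rank-based measure is at its maximum, while the Pearson correlation is pulled below 1 by the curvature.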