fklearn.causal.cate_learning package

Submodules

fklearn.causal.cate_learning.double_machine_learning module

fklearn.causal.cate_learning.double_machine_learning.non_parametric_double_ml_learner[source]

Fits a Non-Parametric Double/ML Meta Learner for Conditional Average Treatment Effect Estimation. It implements the following steps:
  1. fits k instances of the debias model to predict the treatment from the features and get out-of-fold residuals
    t_res=t-t_hat;
  2. fits k instances of the denoise model to predict the outcome from the features and get out-of-fold residuals
    y_res=y-y_hat;
  3. fits a final ML model to predict y_res / t_res from the features using weighted regression with weights set to
    t_res^2. Trained like this, the final model will output treatment effect predictions.
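The three steps can be sketched directly with scikit-learn. This is a minimal illustration under stated assumptions, not the fklearn implementation: the synthetic data, the variable names, and the use of cross_val_predict to obtain out-of-fold predictions are all choices made for this example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_predict

# Synthetic data (illustrative only): the true effect of t on y is 1 + x0.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
t = X[:, 1] + rng.normal(size=n)
y = (1.0 + X[:, 0]) * t + X[:, 2] + rng.normal(scale=0.5, size=n)

cv_splits = 2  # plays the role of k in the steps above

# Step 1: out-of-fold treatment residuals from the debias model.
t_hat = cross_val_predict(GradientBoostingRegressor(random_state=0), X, t, cv=cv_splits)
t_res = t - t_hat

# Step 2: out-of-fold outcome residuals from the denoise model.
y_hat = cross_val_predict(GradientBoostingRegressor(random_state=0), X, y, cv=cv_splits)
y_res = y - y_hat

# Step 3: weighted regression of y_res / t_res on the features,
# with sample weights t_res^2.
final_model = GradientBoostingRegressor(random_state=0)
final_model.fit(X, y_res / t_res, sample_weight=t_res ** 2)

# Predictions of the final model are treatment effect estimates.
cate_hat = final_model.predict(X)
```

The weights matter because the ratio y_res / t_res becomes very large when t_res is close to zero. Weighting each row by t_res^2 makes its contribution to the squared-error loss equal to (y_res - effect * t_res)^2, so rows with almost no residual treatment variation have little influence on the fit.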
Parameters:
  • df (pandas.DataFrame) – A pandas DataFrame with features, treatment and target columns. The model will be trained to predict the target column from the features.
  • feature_columns (list of str) –
    A list of column names that are used as features for the denoise, debias and final models in double-ml. All
    these names should be in df.
  • treatment_column (str) –
    The name of the column in df that should be used as treatment for the double-ml model. It will learn the
    impact of this column with respect to the outcome column.
  • outcome_column (str) – The name of the column in df that should be used as outcome for the double-ml model. It will learn the impact of the treatment column on this outcome column.
  • debias_model (RegressorMixin (default None)) – The estimator for fitting the treatment from the features. Must implement fit and predict methods. It can be a scikit-learn regressor. When None, defaults to GradientBoostingRegressor.
  • debias_feature_columns (list of str (default None)) – A list of column names to be used only for the debias model. If not None, it will replace feature_columns when fitting the debias model.
  • denoise_model (RegressorMixin (default None)) – The estimator for fitting the outcome from the features. Must implement fit and predict methods. It can be a scikit-learn regressor. When None, defaults to GradientBoostingRegressor.
  • denoise_feature_columns (list of str (default None)) – A list of column names to be used only for the denoise model. If not None, it will replace feature_columns when fitting the denoise model.
  • final_model (RegressorMixin (default None)) – The estimator for fitting the outcome residuals from the treatment residuals. Must implement fit and predict methods. It can be an arbitrary scikit-learn regressor. The fit method must accept sample_weight as a keyword argument. When None, defaults to GradientBoostingRegressor.
  • final_model_feature_columns (list of str (default None)) – A list of column names to be used only for the final model. If not None, it will replace feature_columns when fitting the final model.
  • prediction_column (str (default "prediction")) – The name of the column with the treatment effect predictions from the final model.
  • cv_splits (int (default 2)) – Number of folds to split the training data into when fitting the debias and denoise models.
  • encode_extra_cols (bool (default: True)) – If True, treats all columns in df with name pattern fklearn_feat__col==val as feature columns.
Returns:

  • p (function pandas.DataFrame -> pandas.DataFrame) – A function that when applied to a DataFrame with the same columns as df returns a new DataFrame with a new column with predictions from the model.
  • new_df (pandas.DataFrame) – A df-like DataFrame with the same columns as the input df plus a column with predictions from the model.
  • log (dict) – A log-like Dict that stores information of the Non Parametric Double/ML model.
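
The return contract above (a prediction function, the scored training DataFrame, and a log) can be illustrated with a small stand-in. The toy_learner below is hypothetical and is not part of fklearn: it fits a plain linear regression instead of a Double/ML model, and its log keys are invented for this example. It only shows how the three returned values are typically consumed.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression


def toy_learner(df, feature_columns, target_column, prediction_column="prediction"):
    """Hypothetical stand-in that mimics the (p, new_df, log) return contract."""
    model = LinearRegression().fit(df[feature_columns], df[target_column])

    def p(new_df):
        # assign returns a new DataFrame, so the caller's input is left untouched.
        preds = model.predict(new_df[feature_columns])
        return new_df.assign(**{prediction_column: preds})

    # Log keys are made up for this sketch.
    log = {"toy_learner": {"features": feature_columns, "training_samples": len(df)}}
    return p, p(df), log


train = pd.DataFrame({"x": [0.0, 1.0, 2.0, 3.0], "y": [1.0, 3.0, 5.0, 7.0]})
p, train_scored, log = toy_learner(train, ["x"], "y")

# The prediction function can be reused on any DataFrame with the feature columns.
new_data = pd.DataFrame({"x": [4.0, 5.0]})
scored = p(new_data)
```

Returning the fitted model as a function rather than as an object keeps the training DataFrame and the scoring step decoupled: p carries everything it needs in its closure and can be applied to new data without refitting.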

fklearn.causal.cate_learning.meta_learners module

Module contents