fklearn.validation package

Submodules

fklearn.validation.evaluators module

fklearn.validation.evaluators.auc_evaluator[source]

Computes the ROC AUC score, given true label and prediction scores.

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target and prediction scores.
  • prediction_column (String) – The name of the column in test_data with the prediction scores.
  • target_column (String) – The name of the column in test_data with the binary target.
  • weight_column (String (default=None)) – The name of the column in test_data with the sample weights.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the ROC AUC Score

Return type:

dict
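
A minimal usage sketch (the column names "prediction" and "target" here are hypothetical). Like other fklearn evaluators, this function is curried, so it can be partially applied and reused:

    import pandas as pd

    from fklearn.validation.evaluators import auc_evaluator

    # Hypothetical scored test set: binary target plus prediction scores.
    test_df = pd.DataFrame({
        "target": [0, 1, 0, 1],
        "prediction": [0.2, 0.9, 0.4, 0.7],
    })

    # Direct call: returns a log-like dict mapping the evaluator name to the score.
    print(auc_evaluator(test_df, prediction_column="prediction", target_column="target"))

    # Curried call: fix the column names now, evaluate different datasets later.
    my_auc_eval = auc_evaluator(prediction_column="prediction", target_column="target")
    print(my_auc_eval(test_df))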

fklearn.validation.evaluators.brier_score_evaluator[source]

Computes the Brier score, given true label and prediction scores.

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target and prediction scores.
  • prediction_column (String) – The name of the column in test_data with the prediction scores.
  • target_column (String) – The name of the column in test_data with the binary target.
  • weight_column (String (default=None)) – The name of the column in test_data with the sample weights.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the Brier score.

Return type:

dict

fklearn.validation.evaluators.combined_evaluators[source]

Combines partially applied evaluation functions.

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame to apply the evaluators on
  • evaluators (List) – List of evaluator functions
Returns:

log – A log-like dictionary with the merged results of all evaluators

Return type:

dict
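
A sketch of merging two evaluators into a single log (column names hypothetical); each evaluator in the list should already be partially applied to everything but test_data:

    import pandas as pd

    from fklearn.validation.evaluators import (auc_evaluator, combined_evaluators,
                                               logloss_evaluator)

    test_df = pd.DataFrame({
        "target": [0, 1, 0, 1],
        "prediction": [0.2, 0.9, 0.4, 0.7],
    })

    # Apply both evaluators to the same dataframe and merge their logs.
    eval_fn = combined_evaluators(evaluators=[
        auc_evaluator(prediction_column="prediction", target_column="target"),
        logloss_evaluator(prediction_column="prediction", target_column="target"),
    ])
    print(eval_fn(test_df))  # one dict holding both the AUC and the logloss entries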

fklearn.validation.evaluators.correlation_evaluator[source]

Computes the Pearson correlation between prediction and target.

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target and prediction.
  • prediction_column (String) – The name of the column in test_data with the prediction.
  • target_column (String) – The name of the column in test_data with the continuous target.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the Pearson correlation

Return type:

dict

fklearn.validation.evaluators.expected_calibration_error_evaluator[source]

Computes the expected calibration error (ECE), given true label and prediction scores. See “On Calibration of Modern Neural Networks” (https://arxiv.org/abs/1706.04599) for more information.

The ECE is the distance between the observed empirical frequency and the predicted probabilities, for a given choice of bins.

Perfect calibration results in a score of 0.

For example, if for the bin [0, 0.1] we have the three data points:
  1. prediction: 0.1, actual: 0
  2. prediction: 0.05, actual: 1
  3. prediction: 0.0, actual: 0

Then the predicted average is (0.1 + 0.05 + 0.00)/3 = 0.05, and the empirical frequency is (0 + 1 + 0)/3 = 1/3. Therefore, the distance for this bin is:

|1/3 - 0.05| ~= 0.28.

Graphical intuition:

Actuals (empirical frequency between 0 and 1)
|     *
|   *
| *
 ______ Predictions (probabilities between 0 and 1)
Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target and prediction scores.
  • prediction_column (String) – The name of the column in test_data with the prediction scores.
  • target_column (String) – The name of the column in test_data with the binary target.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
  • n_bins (Int (default=100)) – The number of bins. This is a trade-off between the number of points in each bin and the probability range they span. You want a small enough range that still contains a significant number of points for the distance to work.
  • bin_choice (String (default="count")) – Two possibilities: “count” for equally populated bins (i.e., uses pandas.qcut for the bins), or “prob” for equally spaced probabilities (i.e., uses pandas.cut for the bins), with the distance weighted by the number of samples in each bin.
Returns:

log – A log-like dictionary with the expected calibration error.

Return type:

dict
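
The per-bin distance from the worked example above can be verified directly:

    import numpy as np

    # The three data points falling in the [0, 0.1] bin of the example.
    predictions = np.array([0.1, 0.05, 0.0])
    actuals = np.array([0, 1, 0])

    # Distance between the empirical frequency and the average predicted probability.
    bin_distance = abs(actuals.mean() - predictions.mean())
    print(round(bin_distance, 2))  # 0.28, i.e. |1/3 - 0.05|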

fklearn.validation.evaluators.exponential_coefficient_evaluator[source]

Computes the exponential coefficient between prediction and target. Finds a1 in the following equation: target = exp(a0 + a1 * prediction)

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target and prediction.
  • prediction_column (String) – The name of the column in test_data with the prediction.
  • target_column (String) – The name of the column in test_data with the continuous target.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the exponential coefficient

Return type:

dict

fklearn.validation.evaluators.fbeta_score_evaluator[source]

Computes the F-beta score, given true label and prediction scores.

Parameters:
  • test_data (pandas.DataFrame) – A Pandas’ DataFrame with target and prediction scores.
  • threshold (float) – A threshold for the prediction column above which samples will be classified as 1.
  • beta (float) – The beta parameter determines the weight of precision in the combined score. beta < 1 lends more weight to precision, while beta > 1 favors recall (beta -> 0 considers only precision, beta -> inf only recall).
  • prediction_column (str) – The name of the column in test_data with the prediction scores.
  • target_column (str) – The name of the column in test_data with the binary target.
  • weight_column (String (default=None)) – The name of the column in test_data with the sample weights.
  • eval_name (str, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the F-beta score

Return type:

dict

fklearn.validation.evaluators.generic_sklearn_evaluator(name_prefix: str, sklearn_metric: Callable[[...], float]) → Callable[[...], Dict[str, Union[float, Dict]]][source]

Returns an evaluator built from a metric from sklearn.metrics

Parameters:
  • name_prefix (str) – The default name of the evaluator will be name_prefix + target_column.
  • sklearn_metric (Callable) – Metric function from sklearn.metrics. It should take y_true, y_score, and optional keyword arguments as parameters.
Returns:

eval_fn – An evaluator function that uses the provided metric

Return type:

Callable
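
A sketch of wrapping an sklearn metric into an fklearn-style evaluator, assuming the returned function follows the standard evaluator signature used above:

    import pandas as pd
    from sklearn.metrics import average_precision_score

    from fklearn.validation.evaluators import generic_sklearn_evaluator

    # The log key will default to name_prefix + target_column.
    avg_precision_evaluator = generic_sklearn_evaluator("avg_precision_",
                                                        average_precision_score)

    test_df = pd.DataFrame({
        "target": [0, 1, 0, 1],
        "prediction": [0.2, 0.9, 0.4, 0.7],
    })
    print(avg_precision_evaluator(test_df,
                                  prediction_column="prediction",
                                  target_column="target"))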

fklearn.validation.evaluators.hash_evaluator[source]

Computes the hash of a pandas dataframe, filtered by hash columns. The purpose is to uniquely identify a dataframe, to be able to check if two dataframes are equal or not.

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame to be hashed.
  • hash_columns (List[str], optional (default=None)) – A list of column names to filter the dataframe before hashing. If None, it will hash the dataframe with all the columns
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
  • consider_index (bool, optional (default=False)) – If True, the index of the dataframe will be considered when calculating the hash. The default behaviour ignores the index and hashes only the content of the features.
Returns:

log – A log-like dictionary with the hash of the dataframe

Return type:

dict
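
A sketch of using the hash logs to check that two dataframes hold the same content:

    import pandas as pd

    from fklearn.validation.evaluators import hash_evaluator

    df_a = pd.DataFrame({"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]})
    df_b = df_a.copy()

    # Hash only the "x" column, ignoring the index (the default behaviour).
    hash_fn = hash_evaluator(hash_columns=["x"])
    print(hash_fn(df_a) == hash_fn(df_b))  # True: identical content, identical hash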

fklearn.validation.evaluators.linear_coefficient_evaluator[source]

Computes the linear coefficient from regressing the outcome on the prediction.

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target and prediction.
  • prediction_column (String) – The name of the column in test_data with the prediction.
  • target_column (String) – The name of the column in test_data with the continuous target.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the linear coefficient from regressing the outcome on the prediction

Return type:

dict

fklearn.validation.evaluators.logistic_coefficient_evaluator[source]

Computes the logistic coefficient between prediction and target. Finds a1 in the following equation: target = logistic(a0 + a1 * prediction)

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target and prediction.
  • prediction_column (String) – The name of the column in test_data with the prediction.
  • target_column (String) – The name of the column in test_data with the continuous target.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the logistic coefficient

Return type:

dict

fklearn.validation.evaluators.logloss_evaluator[source]

Computes the logloss score, given true label and prediction scores.

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target and prediction scores.
  • prediction_column (String) – The name of the column in test_data with the prediction scores.
  • target_column (String) – The name of the column in test_data with the binary target.
  • weight_column (String (default=None)) – The name of the column in test_data with the sample weights.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the logloss score.

Return type:

dict

fklearn.validation.evaluators.mean_prediction_evaluator[source]

Computes the mean of the specified column.

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with a column to compute the mean
  • prediction_column (String) – The name of the column in test_data to compute the mean of.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the column mean

Return type:

dict

fklearn.validation.evaluators.mse_evaluator[source]

Computes the Mean Squared Error, given true label and predictions.

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target and predictions.
  • prediction_column (String) – The name of the column in test_data with the predictions.
  • target_column (String) – The name of the column in test_data with the continuous target.
  • weight_column (String (default=None)) – The name of the column in test_data with the sample weights.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the MSE Score

Return type:

dict

fklearn.validation.evaluators.ndcg_evaluator[source]

Computes the Normalized Discounted Cumulative Gain (NDCG) between the original and predicted rankings: https://en.wikipedia.org/wiki/Discounted_cumulative_gain

Parameters:
  • test_data (Pandas DataFrame) – A Pandas’ DataFrame with target and prediction scores.
  • prediction_column (String) – The name of the column in test_data with the prediction scores.
  • target_column (String) – The name of the column in test_data with the target.
  • k (int, optional (default=None)) – The number of highest-scored items to consider when computing the NDCG. If None, all outputs are used. Otherwise, this value must be in the range [1, len(test_data[prediction_column])].
  • exponential_gain (bool (default=True)) – If False, the linear gain is used instead. The exponential gain places a stronger emphasis on retrieving relevant items. If the item relevances are binary values in {0, 1}, the two approaches are equivalent.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the NDCG score, float in [0,1].

Return type:

dict
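
A usage sketch with hypothetical graded relevances, restricting the gain to the two highest-scored items:

    import pandas as pd

    from fklearn.validation.evaluators import ndcg_evaluator

    test_df = pd.DataFrame({
        "target": [3, 2, 0, 1],              # graded relevance of each item
        "prediction": [0.9, 0.6, 0.3, 0.5],  # predicted ranking scores
    })

    log = ndcg_evaluator(test_df,
                         prediction_column="prediction",
                         target_column="target",
                         k=2,
                         exponential_gain=True)
    print(log)  # NDCG score in [0, 1]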

fklearn.validation.evaluators.permutation_evaluator[source]

Permutation importance evaluator. It works by shuffling one or more features in the test_data dataframe, getting the predictions with predict_fn, and evaluating the results with eval_fn.

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target, predictions and features.
  • predict_fn (function DataFrame -> DataFrame) – Function that receives the input dataframe and returns a dataframe with the pipeline predictions.
  • eval_fn (function DataFrame -> Log Dict) – A partially applied evaluation function.
  • baseline (bool) – Also evaluates the predict_fn on an unshuffled baseline.
  • features (List of strings) – The features to shuffle and then evaluate eval_fn on the shuffled results. The default case shuffles all dataframe columns.
  • shuffle_all_at_once (bool) – Shuffle all features at once instead of one per turn.
  • random_state (int) – Seed to be used by the random number generator.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with evaluation results by feature shuffle. Use the permutation_extractor for better visualization of the results.

Return type:

dict
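
A wiring sketch with a stand-in prediction function (a real predict_fn would come from a trained fklearn learner); the feature names are hypothetical:

    import pandas as pd

    from fklearn.validation.evaluators import (permutation_evaluator,
                                               roc_auc_evaluator)

    test_df = pd.DataFrame({
        "feature_a": [1.0, 2.0, 3.0, 4.0],
        "feature_b": [0.5, 0.1, 0.9, 0.3],
        "target": [0, 0, 1, 1],
    })

    # Stand-in pipeline: scores each row by feature_a alone.
    def predict_fn(df: pd.DataFrame) -> pd.DataFrame:
        return df.assign(prediction=df["feature_a"] / df["feature_a"].max())

    perm_eval = permutation_evaluator(
        predict_fn=predict_fn,
        eval_fn=roc_auc_evaluator(prediction_column="prediction", target_column="target"),
        baseline=True,
        features=["feature_a", "feature_b"],
        random_state=42,
    )
    print(perm_eval(test_df))  # per-feature logs; see permutation_extractor for plotting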

fklearn.validation.evaluators.pr_auc_evaluator[source]

Computes the PR AUC score, given true label and prediction scores.

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target and prediction scores.
  • prediction_column (String) – The name of the column in test_data with the prediction scores.
  • target_column (String) – The name of the column in test_data with the binary target.
  • weight_column (String (default=None)) – The name of the column in test_data with the sample weights.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the PR AUC Score

Return type:

dict

fklearn.validation.evaluators.precision_evaluator[source]

Computes the precision score, given true label and prediction scores.

Parameters:
  • test_data (pandas.DataFrame) – A Pandas’ DataFrame with target and prediction scores.
  • threshold (float) – A threshold for the prediction column above which samples will be classified as 1.
  • prediction_column (str) – The name of the column in test_data with the prediction scores.
  • target_column (str) – The name of the column in test_data with the binary target.
  • weight_column (String (default=None)) – The name of the column in test_data with the sample weights.
  • eval_name (str, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the Precision Score

Return type:

dict
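
A usage sketch showing the thresholding step (column names hypothetical):

    import pandas as pd

    from fklearn.validation.evaluators import precision_evaluator

    test_df = pd.DataFrame({
        "target": [0, 1, 0, 1],
        "prediction": [0.2, 0.9, 0.6, 0.4],
    })

    # Scores above 0.5 are classified as 1 before computing the precision.
    log = precision_evaluator(test_df,
                              threshold=0.5,
                              prediction_column="prediction",
                              target_column="target")
    print(log)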

fklearn.validation.evaluators.r2_evaluator[source]

Computes the R2 score, given true label and predictions.

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target and prediction.
  • prediction_column (String) – The name of the column in test_data with the prediction.
  • target_column (String) – The name of the column in test_data with the continuous target.
  • weight_column (String (default=None)) – The name of the column in test_data with the sample weights.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the R2 Score

Return type:

dict

fklearn.validation.evaluators.recall_evaluator[source]

Computes the recall score, given true label and prediction scores.

Parameters:
  • test_data (pandas.DataFrame) – A Pandas’ DataFrame with target and prediction scores.
  • threshold (float) – A threshold for the prediction column above which samples will be classified as 1.
  • prediction_column (str) – The name of the column in test_data with the prediction scores.
  • target_column (str) – The name of the column in test_data with the binary target.
  • weight_column (String (default=None)) – The name of the column in test_data with the sample weights.
  • eval_name (str, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the Recall score

Return type:

dict

fklearn.validation.evaluators.roc_auc_evaluator[source]

Computes the ROC AUC score, given true label and prediction scores.

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target and prediction scores.
  • prediction_column (String) – The name of the column in test_data with the prediction scores.
  • target_column (String) – The name of the column in test_data with the binary target.
  • weight_column (String (default=None)) – The name of the column in test_data with the sample weights.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the ROC AUC Score

Return type:

dict

fklearn.validation.evaluators.spearman_evaluator[source]

Computes the Spearman correlation between prediction and target. The Spearman correlation evaluates the rank order between two variables: https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target and prediction.
  • prediction_column (String) – The name of the column in test_data with the prediction.
  • target_column (String) – The name of the column in test_data with the continuous target.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with the Spearman correlation

Return type:

dict

fklearn.validation.evaluators.split_evaluator[source]

Splits the dataset by the categories in split_col and evaluates model performance in each split. Useful when you believe the model performance differs across subpopulations defined by split_col.

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target and predictions.
  • eval_fn (function DataFrame -> Log Dict) – A partially applied evaluation function.
  • split_col (String) – The name of the column in test_data to split by.
  • split_values (Array, optional (default=None)) – An Array to split by. If not provided, test_data[split_col].unique() will be used.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with evaluation results by split.

Return type:

dict
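
A sketch of per-segment evaluation (the split column "region" is hypothetical):

    import pandas as pd

    from fklearn.validation.evaluators import roc_auc_evaluator, split_evaluator

    test_df = pd.DataFrame({
        "region": ["north", "north", "south", "south"],
        "target": [0, 1, 0, 1],
        "prediction": [0.3, 0.8, 0.4, 0.6],
    })

    # Compute the ROC AUC separately for each value of "region".
    segmented_eval = split_evaluator(
        eval_fn=roc_auc_evaluator(prediction_column="prediction", target_column="target"),
        split_col="region",
    )
    print(segmented_eval(test_df))  # one AUC entry per region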

fklearn.validation.evaluators.temporal_split_evaluator[source]

Splits the dataset into temporal categories by time_col and evaluates model performance in each split.

The splits are implicitly defined by the time_format. For example, for the default time format (“%Y-%m”), we will split by year and month.

Parameters:
  • test_data (Pandas' DataFrame) – A Pandas’ DataFrame with target and predictions.
  • eval_fn (function DataFrame -> Log Dict) – A partially applied evaluation function.
  • time_col (string) – The name of the column in test_data to split by.
  • time_format (string) – The way to format the time_col into temporal categories.
  • split_values (Array of string, optional (default=None)) – An array of date formatted strings to split the evaluation by. If not provided, all unique formatted dates will be used.
  • eval_name (String, optional (default=None)) – The name of the evaluator as it will appear in the logs.
Returns:

log – A log-like dictionary with evaluation results by split.

Return type:

dict

fklearn.validation.perturbators module

fklearn.validation.perturbators.nullify[source]

Replaces a percentage of values in the input Series with np.nan

Parameters:
  • col (pd.Series) – A Pandas’ Series
  • perc (float) – Percentage of values to be replaced by np.nan
Returns:

Return type:

A transformed pd.Series

fklearn.validation.perturbators.perturbator[source]

Transforms specific columns of a dataset according to an artificial corruption function.

Parameters:
  • data (pandas.DataFrame) – A Pandas’ DataFrame
  • cols (List[str]) – A list of columns to apply the corruption function
  • corruption_fn (function pandas.Series -> pandas.Series) – An arbitrary corruption function
Returns:

Return type:

A transformed dataset
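
A sketch combining perturbator with one of the corruption functions below; nullify is curried, so it can be partially applied to its percentage:

    import pandas as pd

    from fklearn.validation.perturbators import nullify, perturbator

    df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0],
                       "y": [10.0, 20.0, 30.0, 40.0]})

    # Replace roughly half of the values in column "x" with np.nan,
    # leaving "y" untouched.
    corrupted = perturbator(data=df, cols=["x"], corruption_fn=nullify(perc=0.5))
    print(corrupted)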

fklearn.validation.perturbators.random_noise[source]

Fits a Gaussian to the column, then samples from it and adds the noise to each entry, scaled by a magnification parameter

Parameters:
  • col (pd.Series) – A Pandas’ Series
  • mag (float) – Multiplier applied to the noise to control its scale
Returns:

Return type:

A transformed pd.Series

fklearn.validation.perturbators.sample_columns[source]

Helper function that randomly picks a percentage of the columns

Parameters:
  • data (pd.DataFrame) – A Pandas’ DataFrame
  • perc (float) – Percentage of columns to be sampled
Returns:

Return type:

A list of column names

fklearn.validation.perturbators.shift_mu[source]

Shifts the mean of the column by a given percentage

Parameters:
  • col (pd.Series) – A Pandas’ Series
  • perc (float) – The percentage by which to shift the mean (can be negative)
Returns:

Return type:

A transformed pd.Series

fklearn.validation.splitters module

fklearn.validation.splitters.forward_stability_curve_time_splitter[source]

Splits the data into temporal buckets, with both the training and testing folds moving forward. The folds move forward by a fixed timedelta step. Optionally, there can be a gap between the end of the training period and the start of the holdout period.

Similar to the stability curve time splitter, with the difference that the training period also moves forward with each fold.

The clearest use case is to evaluate a periodic re-training framework.

Parameters:
  • train_data (pandas.DataFrame) – A Pandas’ DataFrame that will be split for stability curve estimation.
  • training_time_start (datetime.datetime or str) – Date for the start of the training period. If move_training_start_with_steps is True, each step will increase this date by step.
  • training_time_end (datetime.datetime or str) – Date for the end of the training period. Each step increases this date by step.
  • time_column (str) – The name of the Date column of train_data.
  • holdout_gap (datetime.timedelta) – Timedelta of the gap between the end of the training period and the start of the validation period.
  • holdout_size (datetime.timedelta) – Timedelta of the range between the start and the end of the holdout period.
  • step (datetime.timedelta) – Timedelta that shifts both the training period and the holdout period by this value.
  • move_training_start_with_steps (bool) – If True, the training start date will increase by step for each fold. If False, the training start date remains fixed at the training_time_start value.
Returns:

  • Folds (list of tuples) – A list of folds. Each fold is a Tuple of arrays. The first array in each tuple contains training indexes while the second array contains validation indexes.
  • logs (list of dict) – A list of logs, one for each fold

fklearn.validation.splitters.k_fold_splitter[source]

Makes K random train/test split folds for cross validation. The folds are made so that every sample is used at least once for evaluating and K-1 times for training.

If stratified is set to True, the split preserves the distribution of stratify_column

Parameters:
  • train_data (pandas.DataFrame) – A Pandas’ DataFrame that will be split into K-Folds for cross validation.
  • n_splits (int) – The number of folds K for the K-Fold cross validation strategy.
  • random_state (int) – Seed to be used by the random number generator.
  • stratify_column (string) – Column name in train_data to be used for stratified split.
Returns:

  • Folds (list of tuples) – A list of folds. Each fold is a Tuple of arrays. The first array in each tuple contains training indexes while the second array contains validation indexes.
  • logs (list of dict) – A list of logs, one for each fold
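
A minimal sketch of generating the folds (the layout of each fold's index arrays is described above):

    import pandas as pd

    from fklearn.validation.splitters import k_fold_splitter

    df = pd.DataFrame({"feature": range(10), "target": [0, 1] * 5})

    folds, logs = k_fold_splitter(df, n_splits=5, random_state=42)
    print(len(folds), len(logs))  # 5 (train, validation) index pairs and 5 logs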

fklearn.validation.splitters.out_of_time_and_space_splitter[source]

Makes K grouped train/test split folds for cross validation. The folds are made so that every ID is used at least once for evaluating and K-1 times for training. Also, for each fold, evaluation will always be out-of-ID and out-of-time.

Parameters:
  • train_data (pandas.DataFrame) – A Pandas’ DataFrame that will be split into K out-of-time and ID folds for cross validation.
  • n_splits (int) – The number of folds K for the K-Fold cross validation strategy.
  • in_time_limit (str or datetime.datetime) – A String representing the end time of the training data. It should be in the same format as the Date column in train_data.
  • time_column (str) – The name of the Date column of train_data.
  • space_column (str) – The name of the ID column of train_data.
  • holdout_gap (datetime.timedelta) – Timedelta of the gap between the end of the training period and the start of the validation period.
Returns:

  • Folds (list of tuples) – A list of folds. Each fold is a Tuple of arrays. The first array in each tuple contains training indexes while the second array contains validation indexes.
  • logs (list of dict) – A list of logs, one for each fold

fklearn.validation.splitters.reverse_time_learning_curve_splitter[source]

Splits the data into temporal buckets given by the specified frequency. Uses a fixed out-of-ID and time hold out set for every fold. Training size increases per fold, with less recent data being added in each fold. Useful for inverse learning curve validation, that is, for seeing how hold out performance increases as the training size increases with less recent data.

Parameters:
  • train_data (pandas.DataFrame) – A Pandas’ DataFrame that will be split for inverse learning curve estimation.
  • time_column (str) – The name of the Date column of train_data.
  • training_time_limit (str) – The Date String for the end of the training period. Should be of the same format as time_column.
  • lower_time_limit (str) – A Date String for the beginning of the training period. This allows limiting the learning curve from below, avoiding heavy computation with very old data.
  • freq (str) – The temporal frequency. See: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
  • holdout_gap (datetime.timedelta) – Timedelta of the gap between the end of the training period and the start of the validation period.
  • min_samples (int) – The minimum number of samples required in the split to keep the split.
Returns:

  • Folds (list of tuples) – A list of folds. Each fold is a Tuple of arrays. The first array in each tuple contains training indexes while the second array contains validation indexes.
  • logs (list of dict) – A list of logs, one for each fold

fklearn.validation.splitters.spatial_learning_curve_splitter[source]

Splits the data for a spatial learning curve. Progressively adds more and more examples to the training set in order to verify the impact of having more data available on a validation set.

The validation set starts after the training set, with an optional time gap.

Similar to the temporal learning curves, but with spatial increases in the training set.

Parameters:
  • train_data (pandas.DataFrame) – A Pandas’ DataFrame that will be split for learning curve estimation.
  • space_column (str) – The name of the ID column of train_data.
  • time_column (str) – The name of the temporal column of train_data.
  • training_limit (datetime or str) – The date limiting the training (after which the holdout begins).
  • holdout_gap (timedelta) – The gap between the end of training and the start of the holdout. If you have censored data, use a gap similar to the censor time.
  • train_percentages (list or tuple of floats) – A list containing the percentages of IDs to use in the training. Defaults to (0.25, 0.5, 0.75, 1.0). For the default value, for example, there would be four model trainings, containing respectively 25%, 50%, 75%, and 100% of the IDs that are not part of the held out set.
  • random_state (int) – A seed for the random number generator that shuffles the IDs.
Returns:

  • Folds (list of tuples) – A list of folds. Each fold is a Tuple of arrays. The first array in each tuple contains training indexes while the second array contains validation indexes.
  • logs (list of dict) – A list of logs, one for each fold

fklearn.validation.splitters.stability_curve_time_in_space_splitter[source]

Splits the data into temporal buckets given by the specified frequency. Training set is fixed before hold out and uses a rolling window hold out set. Each fold moves the hold out further into the future. Useful to see how model performance degrades as the training data gets more outdated. Folds are made so that ALL IDs in the holdout also appear in the training set.

Parameters:
  • train_data (pandas.DataFrame) – A Pandas’ DataFrame that will be split for stability curve estimation.
  • training_time_limit (str) – The Date String for the end of the training period. Should be of the same format as time_column.
  • space_column (str) – The name of the ID column of train_data.
  • time_column (str) – The name of the Date column of train_data.
  • freq (str) – The temporal frequency. See: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
  • space_hold_percentage (float (default=0.5)) – The proportion of hold out IDs.
  • random_state (int) – A seed for the random number generator for ID sampling across train and hold out sets.
  • min_samples (int) – The minimum number of samples required in the split to keep the split.
Returns:

  • Folds (list of tuples) – A list of folds. Each fold is a Tuple of arrays. The first array in each tuple contains training indexes while the second array contains validation indexes.
  • logs (list of dict) – A list of logs, one for each fold

fklearn.validation.splitters.stability_curve_time_space_splitter[source]

Splits the data into temporal buckets given by the specified frequency. Training set is fixed before hold out and uses a rolling window hold out set. Each fold moves the hold out further into the future. Useful to see how model performance degrades as the training data gets more outdated. Folds are made so that NONE of the IDs in the holdout appears in the training set.

Parameters:
  • train_data (pandas.DataFrame) – A Pandas’ DataFrame that will be split for stability curve estimation.
  • training_time_limit (str) – The Date String for the end of the training period. Should be of the same format as time_column.
  • space_column (str) – The name of the ID column of train_data
  • time_column (str) – The name of the Date column of train_data
  • freq (str) – The temporal frequency. See: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
  • space_hold_percentage (float) – The proportion of hold out IDs
  • random_state (int) – A seed for the random number generator for ID sampling across train and hold out sets.
  • min_samples (int) – The minimum number of samples required in the split to keep the split.
Returns:

  • Folds (list of tuples) – A list of folds. Each fold is a Tuple of arrays. The first array in each tuple contains training indexes while the second array contains validation indexes.
  • logs (list of dict) – A list of logs, one for each fold

fklearn.validation.splitters.stability_curve_time_splitter[source]

Splits the data into temporal buckets given by the specified frequency. Training set is fixed before hold out and uses a rolling window hold out set. Each fold moves the hold out further into the future. Useful to see how model performance degrades as the training data gets more outdated. Training and holdout sets can have the same IDs.

Parameters:
  • train_data (pandas.DataFrame) – A Pandas’ DataFrame that will be split for stability curve estimation.
  • training_time_limit (str) – The Date String for the end of the training period. Should be of the same format as time_column.
  • time_column (str) – The name of the Date column of train_data.
  • freq (str) – The temporal frequency. See: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
  • min_samples (int) – The minimum number of samples required in a split to keep it.
Returns:

  • Folds (list of tuples) – A list of folds. Each fold is a Tuple of arrays. The first array in each tuple contains training indexes while the second array contains validation indexes.
  • logs (list of dict) – A list of logs, one for each fold
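
A usage sketch with hypothetical dates: everything up to 2021-06-30 trains, and each later month becomes a rolling holdout bucket (min_samples is lowered so this small example keeps its splits):

    import pandas as pd

    from fklearn.validation.splitters import stability_curve_time_splitter

    df = pd.DataFrame({
        "date": pd.date_range("2021-01-01", "2021-12-31", freq="D"),
        "target": 0,
    })

    folds, logs = stability_curve_time_splitter(
        df,
        training_time_limit="2021-06-30",
        time_column="date",
        freq="M",        # monthly holdout buckets
        min_samples=0,
    )
    print(len(folds))  # one fold per post-limit month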

fklearn.validation.splitters.time_and_space_learning_curve_splitter[source]

Splits the data into temporal buckets given by the specified frequency. Uses a fixed out-of-ID and time hold out set for every fold. Training size increases per fold, with more recent data being added in each fold. Useful for learning curve validation, that is, for seeing how hold out performance increases as the training size increases with more recent data.

Parameters:
  • train_data (pandas.DataFrame) – A Pandas’ DataFrame that will be split for learning curve estimation.
  • training_time_limit (str) – The Date String for the end of the training period. Should be of the same format as time_column.
  • space_column (str) – The name of the ID column of train_data.
  • time_column (str) – The name of the Date column of train_data.
  • freq (str) – The temporal frequency. See: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
  • space_hold_percentage (float) – The proportion of hold out IDs.
  • holdout_gap (datetime.timedelta) – Timedelta of the gap between the end of the training period and the start of the validation period.
  • random_state (int) – A seed for the random number generator for ID sampling across train and hold out sets.
  • min_samples (int) – The minimum number of samples required in the split to keep the split.
Returns:

  • Folds (list of tuples) – A list of folds. Each fold is a Tuple of arrays. The first array in each tuple contains training indexes while the second array contains validation indexes.
  • logs (list of dict) – A list of logs, one for each fold

fklearn.validation.splitters.time_learning_curve_splitter[source]

Splits the data into temporal buckets given by the specified frequency.

Uses a fixed out-of-ID and time hold out set for every fold. Training size increases per fold, with more recent data being added in each fold. Useful for learning curve validation, that is, for seeing how hold out performance increases as the training size increases with more recent data.

Parameters:
  • train_data (pandas.DataFrame) – A Pandas’ DataFrame that will be split for learning curve estimation.
  • training_time_limit (str) – The Date String for the end of the training period. Should be of the same format as time_column.
  • time_column (str) – The name of the Date column of train_data.
  • freq (str) – The temporal frequency. See: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
  • holdout_gap (datetime.timedelta) – Timedelta of the gap between the end of the training period and the start of the validation period.
  • min_samples (int) – The minimum number of samples required in the split to keep the split.
Returns:

  • Folds (list of tuples) – A list of folds. Each fold is a Tuple of arrays. The first array in each tuple contains training indexes while the second array contains validation indexes.
  • logs (list of dict) – A list of logs, one for each fold

fklearn.validation.validator module

Module contents