dask_ml.impute.SimpleImputer

class dask_ml.impute.SimpleImputer(*, missing_values=nan, strategy='mean', fill_value=None, verbose=0, copy=True, add_indicator=False)

Methods

fit(X[, y]) Fit the imputer on X.
fit_transform(X[, y]) Fit to data, then transform it.
get_params([deep]) Get parameters for this estimator.
inverse_transform(X) Convert the data back to the original representation.
set_params(**params) Set the parameters of this estimator.
transform(X) Impute all missing values in X.
__init__(*, missing_values=nan, strategy='mean', fill_value=None, verbose=0, copy=True, add_indicator=False)

Initialize self. See help(type(self)) for accurate signature.

fit(X, y=None)

Fit the imputer on X.

Parameters:
X : {array-like, sparse matrix}, shape (n_samples, n_features)

Input data, where n_samples is the number of samples and n_features is the number of features.

Returns:
self : SimpleImputer
fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
X : array-like of shape (n_samples, n_features)

Input samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_params : dict

Additional fit parameters.

Returns:
X_new : ndarray array of shape (n_samples, n_features_new)

Transformed array.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

inverse_transform(X)

Convert the data back to the original representation.

Inverts the transform operation performed on an array. This operation can only be performed after SimpleImputer is instantiated with add_indicator=True.

Note that inverse_transform can only invert the transform in features that have binary indicators for missing values. If a feature has no missing values at fit time, the feature won’t have a binary indicator, and the imputation done at transform time won’t be inverted.

New in version 0.24.

Parameters:
X : array-like of shape (n_samples, n_features + n_features_missing_indicator)

The imputed data to be reverted to original data. It has to be an augmented array of imputed data and the missing indicator mask.

Returns:
X_original : ndarray of shape (n_samples, n_features)

The original X with missing values as it was prior to imputation.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.

transform(X)

Impute all missing values in X.

Parameters:
X : {array-like, sparse matrix}, shape (n_samples, n_features)

The input data to complete.