dask_ml.preprocessing.BlockTransformer
- class dask_ml.preprocessing.BlockTransformer(func: Callable[[...], Union[dask_ml._typing.ArrayLike, pandas.core.frame.DataFrame, dask_expr._collection.DataFrame]], *, validate: bool = False, **kw_args: Any)
Construct a transformer from an arbitrary callable.
The BlockTransformer forwards the blocks of the X argument to a user-defined callable and returns the result of that operation. This is useful for stateless operations that can be performed at the cell or block level, such as taking the log of frequencies. In general, the transformer is not suitable for tasks such as standardization, which require information about a complete column.
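The distinction between stateless and column-dependent operations can be illustrated with a minimal sketch in plain Python. It does not use dask_ml: nested lists stand in for the blocks of a partitioned column, and `apply_blockwise`, `log10_block`, `standardize_block`, and `flatten` are hypothetical helpers introduced only for this illustration.

```python
import math
from statistics import mean, pstdev

# Hypothetical stand-in for a partitioned column: each inner list is one block.
blocks = [[1.0, 10.0], [100.0, 1000.0]]


def apply_blockwise(func, blocks):
    """Call func on each block independently and collect the results."""
    return [func(block) for block in blocks]


def log10_block(block):
    """Stateless: each output depends only on its own input value."""
    return [math.log10(v) for v in block]


def standardize_block(block):
    """Not stateless: uses the mean and standard deviation of the values it is given."""
    mu, sigma = mean(block), pstdev(block)
    return [(v - mu) / sigma for v in block]


def flatten(blocks):
    """Concatenate the blocks back into a single flat list."""
    return [v for block in blocks for v in block]


# The same data treated as one single block, i.e. the complete column.
whole = [flatten(blocks)]

# The log gives the same answer whether it is applied per block or to the whole column.
log_blockwise = flatten(apply_blockwise(log10_block, blocks))
log_whole = flatten(apply_blockwise(log10_block, whole))
print(log_blockwise == log_whole)

# Standardization does not: each block is centred on its own mean, not the column mean.
std_blockwise = flatten(apply_blockwise(standardize_block, blocks))
std_whole = flatten(apply_blockwise(standardize_block, whole))
print(std_blockwise == std_whole)
```

The first comparison prints `True` and the second prints `False`, because block-wise standardization uses per-block statistics rather than the statistics of the complete column.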
- Parameters
- func : callable
The callable to use for the transformation.
- validate : bool, optional, default=False
Indicate that the input X array should be checked before calling func.
- kw_args : dict, optional
Dictionary of additional keyword arguments to pass to func.
Examples
>>> import dask.datasets
>>> import pandas as pd
>>> from dask_ml.preprocessing import BlockTransformer
>>> df = dask.datasets.timeseries()
>>> df
...
Dask DataFrame Structure:
                   id    name        x        y
npartitions=30
2000-01-01      int64  object  float64  float64
2000-01-02        ...     ...      ...      ...
...               ...     ...      ...      ...
2000-01-30        ...     ...      ...      ...
2000-01-31        ...     ...      ...      ...
Dask Name: make-timeseries, 30 tasks
>>> trn = BlockTransformer(pd.util.hash_pandas_object, index=False)
>>> trn.transform(df)
...
Dask Series Structure:
npartitions=30
2000-01-01    uint64
2000-01-02       ...
               ...
2000-01-30       ...
2000-01-31       ...
dtype: uint64
Dask Name: hash_pandas_object, 60 tasks
Methods
- fit_transform(X[, y]): Fit to data, then transform it.
- get_metadata_routing(): Get metadata routing of this object.
- get_params([deep]): Get parameters for this estimator.
- set_output(*[, transform]): Set output container.
- set_params(**params): Set the parameters of this estimator.
fit
transform