dask_ml.preprocessing.OrdinalEncoder
- class dask_ml.preprocessing.OrdinalEncoder(columns=None)
Ordinal (integer) encode categorical columns.
- Parameters
- columns : sequence, optional
The columns to encode. Must be categorical dtype. Encodes all categorical dtype columns by default.
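The column selection described above can be approximated in plain pandas. This is only a sketch of the rule, not the encoder's own code; the frame `df` and the names `default_columns` and `explicit_columns` are hypothetical.

```python
import pandas as pd

# Hypothetical frame with one numeric and two categorical columns.
df = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": pd.Categorical(["a", "a", "a", "b"]),
    "C": pd.Categorical(["x", "y", "x", "y"]),
})

# columns=None: every column with categorical dtype is a candidate.
default_columns = df.select_dtypes(include=["category"]).columns

# Explicit selection: each named column must already be categorical.
explicit_columns = ["B"]
all_categorical = all(
    isinstance(df[name].dtype, pd.CategoricalDtype) for name in explicit_columns
)
```

A column that is not yet categorical can be converted beforehand with `df[name].astype("category")`.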
- Attributes
- columns_ : Index
The columns in the training data before/after encoding
- categorical_columns_ : Index
The categorical columns in the training data
- non_categorical_columns_ : Index
The rest of the columns in the training data
- dtypes_ : dict
Dictionary mapping each column name to either an instance of CategoricalDtype (pandas >= 0.21.0) or a tuple of (categories, ordered).
Notes
This transformer applies only to dask and pandas DataFrames. For dask DataFrames, every categorical column must have known categories.
The inverse transformation accepts either a DataFrame or an array.
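The round trip between categories and integer codes can be sketched in plain pandas. This assumes the encoding corresponds to pandas' categorical codes, which matches the example output on this page; it is an illustration, not the encoder's implementation, and the names `dtype_b`, `encoded` and `restored` are hypothetical.

```python
import pandas as pd

data = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": pd.Categorical(["a", "a", "a", "b"]),
})

# Remember the categorical dtype, in the spirit of the dtypes_ attribute.
dtype_b = data["B"].dtype

# Forward direction: replace each category with its integer code.
encoded = data.assign(B=data["B"].cat.codes)

# Inverse direction: rebuild the categorical column from codes plus stored dtype.
restored = encoded.assign(
    B=pd.Categorical.from_codes(encoded["B"].to_numpy(), dtype=dtype_b)
)
```

The inverse step only works because the categories were recorded up front, which is why the categories of a dask DataFrame need to be known before fitting.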
Examples
>>> import pandas as pd
>>> import dask.dataframe as dd
>>> from dask_ml.preprocessing import OrdinalEncoder
>>> data = pd.DataFrame({"A": [1, 2, 3, 4],
...                      "B": pd.Categorical(['a', 'a', 'a', 'b'])})
>>> enc = OrdinalEncoder()
>>> trn = enc.fit_transform(data)
>>> trn
   A  B
0  1  0
1  2  0
2  3  0
3  4  1

>>> enc.columns_
Index(['A', 'B'], dtype='object')

>>> enc.non_categorical_columns_
Index(['A'], dtype='object')

>>> enc.categorical_columns_
Index(['B'], dtype='object')

>>> enc.dtypes_
{'B': CategoricalDtype(categories=['a', 'b'], ordered=False)}

>>> enc.fit_transform(dd.from_pandas(data, 2))
Dask DataFrame Structure:
                   A     B
npartitions=2
0              int64  int8
2                ...   ...
3                ...   ...
Dask Name: assign, 8 tasks
Methods
fit(X[, y]): Determine the categorical columns to be encoded.
fit_transform(X[, y]): Fit to data, then transform it.
get_metadata_routing(): Get metadata routing of this object.
get_params([deep]): Get parameters for this estimator.
inverse_transform(X): Inverse ordinal-encode the columns in X.
set_output(*[, transform]): Set output container.
set_params(**params): Set the parameters of this estimator.
transform(X[, y]): Ordinal encode the categorical columns in X.
- __init__(columns=None)