Version 1.7.0

Version 1.6.0

Version 1.5.0

Version 1.4.0

Version 1.3.0

Version 1.2.0

  • Changed the name of the second positional argument in model_selection.IncrementalSearchCV from param_distribution to parameters to match the name of the base class.
  • Compatibility with scikit-learn 0.22.1.
  • Added dask_ml.preprocessing.BlockTransformer, an extension of scikit-learn’s FunctionTransformer (GH#366).
  • Added dask_ml.feature_extraction.FeatureHasher which is similar to scikit-learn’s implementation.

Version 1.1.1

  • Fixed an issue with the 1.1.0 wheel (GH#575)
  • Made svd_flip work even when arrays are read-only (GH#592)

Version 1.1.0

Version 1.0.0

Version 0.13.0


dask-ml 0.13.0 will be the last release to support Python 2.

Version 0.12.0

API Breaking Changes

Version 0.11.0

Note that this version of Dask-ML requires scikit-learn >= 0.20.0.


Bug Fixes

Version 0.10.0

Version 0.9.0

Bug Fixes

Documentation Updates

Build Changes

We’re now using Numba for performance-sensitive parts of Dask-ML. Dask-ML is now a pure-Python project, so we can provide universal wheels.

Version 0.8.0


  • Automatically replace default scikit-learn scorers with dask-aware versions in Incremental (GH#200)
  • Added the dask_ml.metrics.log_loss() loss function and neg_log_loss scorer (GH#318)
  • Fixed handling of array-like fit-parameters to GridSearchCV and BaseSearchCV (GH#320)
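For reference, the quantity behind log_loss and the neg_log_loss scorer can be sketched in plain Python. This is a minimal illustration of binary log loss under stated assumptions, not dask-ml's implementation: the helper name binary_log_loss and the clipping constant eps are made up for the example.

```python
import math


def binary_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities.

    Hypothetical helper for illustration only; dask_ml.metrics.log_loss
    works on dask arrays and is implemented differently.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip the probability so that log(0) never occurs.
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


loss = binary_log_loss([1, 0, 1], [0.9, 0.1, 0.8])

# A "neg_log_loss" style score negates the loss so that larger is better.
score = -loss
```

Negating the loss is what lets a scorer follow the usual "higher is better" convention during model selection.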

Bug Fixes

  • Fixed dtype in LabelEncoder.fit_transform() to be integer, rather than the dtype of the classes for dask arrays (GH#311)

Version 0.7.0


API Breaking Changes

  • Removed the basis_inds_ attribute from dask_ml.cluster.SpectralClustering as it’s no longer used (GH#152)

  • Changed Incremental to clone the underlying estimator before training (GH#258). This induces a few changes:

    1. The underlying estimator no longer gives access to learned attributes like coef_. We recommend using Incremental.coef_.
    2. State no longer leaks between successive fit calls. Note that Incremental.partial_fit() is still available if you want state, like learned attributes or random seeds, to be re-used. This is useful if you’re making multiple passes over the training data.
  • Changed get_params and set_params for dask_ml.wrappers.Incremental to no longer magically get / set parameters for the underlying estimator (GH#258). To specify parameters for the underlying estimator, use the double-underscore prefix convention established by scikit-learn:

    inc.set_params(estimator__alpha=10)
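The two behaviours described above, cloning before training and the double-underscore parameter convention, can be sketched with toy classes. ToyEstimator and ToyIncremental are hypothetical stand-ins written for this illustration; they are not the dask-ml or scikit-learn implementations, and copy.deepcopy stands in for scikit-learn's cloning.

```python
import copy


class ToyEstimator:
    """Stand-in for a scikit-learn style estimator (hypothetical)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, data):
        # "Learn" something that depends on alpha so the effect is visible.
        self.coef_ = self.alpha * sum(data)
        return self


class ToyIncremental:
    """Stand-in for a meta-estimator wrapping another estimator (hypothetical)."""

    def __init__(self, estimator):
        self.estimator = estimator

    def set_params(self, **params):
        for key, value in params.items():
            if key.startswith("estimator__"):
                # Route "estimator__<name>" to the wrapped estimator.
                setattr(self.estimator, key[len("estimator__"):], value)
            else:
                setattr(self, key, value)
        return self

    def fit(self, data):
        # Train a copy, so the estimator that was passed in stays unfitted.
        fitted = copy.deepcopy(self.estimator).fit(data)
        self.coef_ = fitted.coef_
        return self


inc = ToyIncremental(ToyEstimator(alpha=1.0))
inc.set_params(estimator__alpha=10)
inc.fit([1, 2, 3])
```

In this sketch the learned coef_ lives on the wrapper, while the wrapped estimator keeps its parameters but gains no fitted attributes.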


Dask-SearchCV is now being developed in the dask/dask-ml repository. Users who previously installed dask-searchcv should now just install dask-ml.

Bug Fixes

  • Fixed random seed generation on 32-bit platforms (GH#230)

Version 0.6.0

API Breaking Changes


Version 0.5.0

API Breaking Changes

Bug Fixes

  • dask_ml.preprocessing.StandardScaler now works on DataFrame inputs (GH#157).

Version 0.4.1

This release added several new estimators.


Added dask_ml.preprocessing.RobustScaler

Scale features using statistics that are robust to outliers. This mirrors sklearn.preprocessing.RobustScaler (GH#62).
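As a rough illustration of what "robust to outliers" means here, the sketch below centers values on the median and scales by the interquartile range, which matches the default behaviour of scikit-learn's RobustScaler. The helpers percentile and robust_scale are hypothetical names for this example and work on plain Python lists, not dask arrays.

```python
def percentile(sorted_values, q):
    """Percentile with linear interpolation; q is in [0, 100]."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def robust_scale(values):
    """Center on the median and divide by the interquartile range.

    Assumes the interquartile range is non-zero.
    """
    ordered = sorted(values)
    median = percentile(ordered, 50)
    iqr = percentile(ordered, 75) - percentile(ordered, 25)
    return [(v - median) / iqr for v in values]


# The outlier (100) does not affect the median or the interquartile range,
# so the remaining values keep a sensible scale.
scaled = robust_scale([1, 2, 3, 4, 100])
```

With mean and standard deviation scaling, the single outlier would dominate both statistics and compress the other values toward each other.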

Added dask_ml.preprocessing.OrdinalEncoder

Encodes categorical features as ordinal, in one ordered feature (GH#119).

Added dask_ml.wrappers.ParallelPostFit

A meta-estimator for fitting with any scikit-learn estimator, but post-processing (predict, transform, etc.) in parallel on dask arrays. See Parallel Meta-estimators for more (GH#132).
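The idea of fitting once and then post-processing in parallel can be sketched without dask. In this toy version a thread pool stands in for dask's scheduler and lists of lists stand in for the blocks of a dask array. ThresholdClassifier and predict_blockwise are hypothetical names for the example, not part of dask-ml.

```python
from concurrent.futures import ThreadPoolExecutor


class ThresholdClassifier:
    """Tiny stand-in for a scikit-learn style estimator (hypothetical)."""

    def fit(self, X, y):
        # Use the midpoint between the two class means as the threshold.
        zeros = [x for x, label in zip(X, y) if label == 0]
        ones = [x for x, label in zip(X, y) if label == 1]
        self.threshold_ = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
        return self

    def predict(self, X):
        return [int(x > self.threshold_) for x in X]


def predict_blockwise(estimator, blocks):
    """Apply estimator.predict to each block independently, in parallel."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(estimator.predict, blocks))


# Fit once on a small in-memory sample.
clf = ThresholdClassifier().fit([1.0, 2.0, 8.0, 9.0], [0, 0, 1, 1])

# Predict on a larger dataset that has been split into blocks.
blocks = [[0.5, 7.0], [3.0, 9.5], [6.0]]
predictions = predict_blockwise(clf, blocks)
```

Because predict treats every row independently, each block can be processed on its own and the results concatenated afterwards.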

Version 0.4.0

API Changes

  • Changed the arguments of the dask-glm based estimators in dask_glm.linear_model to match scikit-learn’s API (GH#94).

    • To specify lambduh, use C = 1.0 / lambduh (the default of 1.0 is unchanged)
    • The rho, over_relax, abstol and reltol arguments have been removed. Provide them in solver_kwargs instead.

    This affects the LinearRegression, LogisticRegression and PoissonRegression estimators.


  • Accept dask.dataframe for dask-glm based estimators (GH#84).
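For code that still carries a lambduh value, the C = 1.0 / lambduh relationship noted above can be wrapped in a small helper. This is a minimal sketch: lambduh_to_C is a hypothetical name, and the solver_kwargs values are placeholders chosen for illustration, not recommended settings.

```python
def lambduh_to_C(lambduh):
    """Convert a regularization strength to scikit-learn's inverse form."""
    if lambduh <= 0:
        raise ValueError("lambduh must be positive")
    return 1.0 / lambduh


C = lambduh_to_C(4.0)

# Arguments that used to be passed directly now travel in solver_kwargs.
# The numbers below are placeholders, not tuned values.
solver_kwargs = {"rho": 1.0, "over_relax": 1.0, "abstol": 1e-4, "reltol": 1e-2}

# With dask-ml installed, these would be passed along the lines of:
#     LogisticRegression(C=C, solver_kwargs=solver_kwargs)
```

Since C is the inverse of lambduh, a larger C means weaker regularization.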

Version 0.3.2


  • Added dask_ml.decomposition.TruncatedSVD and dask_ml.decomposition.PCA (GH#78)

Version 0.3.0


  • Added KMeans.predict() (GH#83)

API Changes

  • Changed the fitted attributes on MinMaxScaler and StandardScaler to be concrete NumPy or pandas objects, rather than persisted dask objects (GH#75).