Generalized Linear Models

LinearRegression([penalty, dual, tol, C, ...])

Estimator for linear regression.

LogisticRegression([penalty, dual, tol, C, ...])

Estimator for logistic regression.

PoissonRegression([penalty, dual, tol, C, ...])

Estimator for Poisson regression.

Generalized linear models are a broad class of commonly used models. These implementations scale well to large datasets, on a single machine or on a distributed cluster. They can be powered by a variety of optimization algorithms and use a variety of regularizers.

These follow the scikit-learn estimator API, so they can be dropped into existing routines like grid search and pipelines. They are implemented externally with new, scalable algorithms, so they can consume distributed Dask arrays and dataframes rather than just single-machine NumPy and pandas arrays and dataframes.

Example

In [1]: from dask_ml.linear_model import LogisticRegression

In [2]: from dask_ml.datasets import make_classification

In [3]: X, y = make_classification(chunks=50)

In [4]: lr = LogisticRegression()

In [5]: lr.fit(X, y)
Out[5]: LogisticRegression()

Algorithms

admm(X, y[, regularizer, lamduh, rho, ...])

Alternating Direction Method of Multipliers.

gradient_descent(X, y[, max_iter, tol, family])

Michael Grant's implementation of Gradient Descent.

lbfgs(X, y[, regularizer, lamduh, max_iter, ...])

L-BFGS solver using the scipy.optimize implementation.

newton(X, y[, max_iter, tol, family])

Newton's Method for Logistic Regression.

proximal_grad(X, y[, regularizer, lamduh, ...])

Proximal Gradient Method.
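A solver can be selected through the estimators' solver keyword (e.g. LogisticRegression(solver='admm')). To illustrate one of the algorithms above, here is a minimal NumPy sketch of the proximal gradient method for L1-regularized logistic regression. This is a simplified pedagogical sketch, not the dask-ml implementation, and the function names are illustrative:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the L1 norm: shrink each coordinate toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_grad(X, y, lam=0.1, step=0.1, max_iter=500):
    """L1-regularized logistic regression via proximal gradient descent."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(max_iter):
        # Gradient of the (smooth) logistic loss.
        z = X @ beta
        grad = X.T @ (1.0 / (1.0 + np.exp(-z)) - y) / n
        # Gradient step on the smooth part, then prox step on the L1 term.
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# Synthetic data: only the first two coefficients are truly nonzero.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_beta = np.array([2.0, -1.0, 0.0, 0.0, 0.0])
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

beta = proximal_grad(X, y)
```

The prox step is what distinguishes this from plain gradient descent: the non-smooth L1 penalty is handled exactly by soft thresholding rather than being differentiated.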

Regularizers

ElasticNet([weight])

Elastic net regularization.

L1()

L1 regularization.

L2()

L2 regularization.

Regularizer()

Abstract base class for regularization objects.
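Conceptually, each regularizer contributes a penalty value and a proximal operator that the solvers can call. A rough NumPy sketch of that idea (the class names mirror the list above, but the method names and signatures here are illustrative, not dask-ml's actual Regularizer API):

```python
import numpy as np

class Regularizer:
    """Illustrative base class: a penalty value and its proximal operator."""
    def f(self, beta):
        raise NotImplementedError
    def prox(self, beta, t):
        raise NotImplementedError

class L1(Regularizer):
    def f(self, beta):
        return np.abs(beta).sum()
    def prox(self, beta, t):
        # Soft thresholding: shrink each coordinate toward zero by t.
        return np.sign(beta) * np.maximum(np.abs(beta) - t, 0.0)

class L2(Regularizer):
    def f(self, beta):
        return 0.5 * (beta ** 2).sum()
    def prox(self, beta, t):
        # Uniform multiplicative shrinkage of all coordinates.
        return beta / (1.0 + t)

class ElasticNet(Regularizer):
    def __init__(self, weight=0.5):
        self.weight = weight  # mix between L1 (weight) and L2 (1 - weight)
    def f(self, beta):
        return self.weight * L1().f(beta) + (1 - self.weight) * L2().f(beta)
    def prox(self, beta, t):
        # L1 prox followed by L2 prox, each with its share of the step.
        return L2().prox(L1().prox(beta, t * self.weight),
                         t * (1 - self.weight))
```

With weight=1.0 the elastic net reduces to pure L1, and with weight=0.0 to pure L2, which is why a single weight parameter suffices.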