Generalized Linear Models

LinearRegression([penalty, dual, tol, C, ...])

Estimator for linear regression.

LogisticRegression([penalty, dual, tol, C, ...])

Estimator for logistic regression.

PoissonRegression([penalty, dual, tol, C, ...])

Estimator for Poisson regression.

Generalized linear models are a broad class of commonly used models. These implementations scale well to large datasets, on a single machine or on a distributed cluster. They can be powered by a variety of optimization algorithms and use a variety of regularizers.

These follow the scikit-learn estimator API, so they can be dropped into existing routines like grid search and pipelines. They are implemented externally with new, scalable algorithms, so they can consume distributed Dask arrays and dataframes rather than just single-machine NumPy and pandas arrays and dataframes.

Example

In [1]: from dask_ml.linear_model import LogisticRegression

In [2]: from dask_ml.datasets import make_classification

In [3]: X, y = make_classification(chunks=50)

In [4]: lr = LogisticRegression()

In [5]: lr.fit(X, y)
Out[5]: LogisticRegression()

Algorithms

admm(X, y[, regularizer, lamduh, rho, ...])

Alternating Direction Method of Multipliers.

gradient_descent(X, y[, max_iter, tol, family])

Michael Grant's implementation of Gradient Descent.

lbfgs(X, y[, regularizer, lamduh, max_iter, ...])

L-BFGS solver using the scipy.optimize implementation.

newton(X, y[, max_iter, tol, family])

Newton's Method for Logistic Regression.

proximal_grad(X, y[, regularizer, lamduh, ...])

Proximal Gradient Method.
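A solver can be selected through the estimators' solver keyword (e.g. LogisticRegression(solver='admm')). To illustrate one of the algorithms above, here is a minimal NumPy sketch of the proximal gradient method for L1-regularized logistic regression. This is a simplified pedagogical sketch, not the dask-ml implementation, and the function names are illustrative:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the L1 norm: shrink each coordinate toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_grad(X, y, lam=0.1, step=0.1, max_iter=500):
    """L1-regularized logistic regression via proximal gradient descent."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(max_iter):
        # Gradient of the (smooth) logistic loss.
        z = X @ beta
        grad = X.T @ (1.0 / (1.0 + np.exp(-z)) - y) / n
        # Gradient step on the smooth part, then prox step on the L1 term.
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# Synthetic data: only the first two coefficients are truly nonzero.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_beta = np.array([2.0, -1.0, 0.0, 0.0, 0.0])
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

beta = proximal_grad(X, y)
```

The prox step is what distinguishes this from plain gradient descent: the non-smooth L1 penalty is handled exactly by soft thresholding rather than being differentiated.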

Regularizers

ElasticNet([weight])

Elastic net regularization.

L1()

L1 regularization.

L2()

L2 regularization.

Regularizer()

Abstract base class for regularization objects.
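Conceptually, each regularizer contributes a penalty value and a proximal operator that the solvers can call. A rough NumPy sketch of that idea (the class names mirror the list above, but the method names and signatures here are illustrative, not dask-ml's actual Regularizer API):

```python
import numpy as np

class Regularizer:
    """Illustrative base class: a penalty value and its proximal operator."""
    def f(self, beta):
        raise NotImplementedError
    def prox(self, beta, t):
        raise NotImplementedError

class L1(Regularizer):
    def f(self, beta):
        return np.abs(beta).sum()
    def prox(self, beta, t):
        # Soft thresholding: shrink each coordinate toward zero by t.
        return np.sign(beta) * np.maximum(np.abs(beta) - t, 0.0)

class L2(Regularizer):
    def f(self, beta):
        return 0.5 * (beta ** 2).sum()
    def prox(self, beta, t):
        # Uniform multiplicative shrinkage of all coordinates.
        return beta / (1.0 + t)

class ElasticNet(Regularizer):
    def __init__(self, weight=0.5):
        self.weight = weight  # mix between L1 (weight) and L2 (1 - weight)
    def f(self, beta):
        return self.weight * L1().f(beta) + (1 - self.weight) * L2().f(beta)
    def prox(self, beta, t):
        # L1 prox followed by L2 prox, each with its share of the step.
        return L2().prox(L1().prox(beta, t * self.weight),
                         t * (1 - self.weight))
```

With weight=1.0 the elastic net reduces to pure L1, and with weight=0.0 to pure L2, which is why a single weight parameter suffices.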