dask_ml.linear_model.LinearRegression

class dask_ml.linear_model.LinearRegression(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1.0, class_weight=None, random_state=None, solver='admm', max_iter=100, multi_class='ovr', verbose=0, warm_start=False, n_jobs=1, solver_kwargs=None)

Estimator for linear regression.

Parameters
penaltystr or Regularizer, default ‘l2’

Regularizer to use. Only relevant for the ‘admm’, ‘lbfgs’ and ‘proximal_grad’ solvers.

For string values, only ‘l1’ or ‘l2’ are valid.

dualbool

Ignored

tolfloat, default 1e-4

The tolerance for convergence.

Cfloat

Regularization strength. Note that dask-glm solvers use the parameterization \(\lambda = 1 / C\), so a smaller C means stronger regularization.

fit_interceptbool, default True

Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.

intercept_scalingbool

Ignored

class_weightdict or ‘balanced’

Ignored

random_stateint, RandomState, or None

The seed of the pseudo-random number generator to use when shuffling the data. If int, random_state is the seed used by the random number generator. If a RandomState instance, random_state is the random number generator. If None, the random number generator is the RandomState instance used by np.random. Used when solver == ‘sag’ or ‘liblinear’.

solver{‘admm’, ‘gradient_descent’, ‘newton’, ‘lbfgs’, ‘proximal_grad’}

Solver to use. See Algorithms for details.

max_iterint, default 100

Maximum number of iterations taken for the solvers to converge.

multi_classstr, default ‘ovr’

Ignored. Multiclass solvers not currently supported.

verboseint, default 0

Ignored

warm_startbool, default False

Ignored

n_jobsint, default 1

Ignored

solver_kwargsdict, optional, default None

Extra keyword arguments to pass through to the solver.
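The relation \(\lambda = 1 / C\) noted under the C parameter can be sketched in a few lines. This is only an illustration of the arithmetic: `c_to_lambda` is a hypothetical helper, not a function provided by dask-ml or dask-glm.

```python
def c_to_lambda(C):
    """Convert a scikit-learn style C into lambda = 1 / C.

    Hypothetical helper for illustration only; it is not part of
    dask-ml or dask-glm.
    """
    if C <= 0:
        raise ValueError("C must be positive")
    return 1.0 / C


# Smaller C corresponds to a larger lambda, i.e. stronger regularization.
for C in (0.1, 1.0, 10.0):
    print(f"C={C} -> lambda={c_to_lambda(C)}")
```

Under this parameterization, halving C doubles the penalty weight.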

Attributes
coef_array, shape (n_classes, n_features)

The learned value for the model’s coefficients

intercept_float or None

The learned value for the intercept, if one was added to the model

Examples

>>> from dask_glm.datasets import make_regression
>>> from dask_ml.linear_model import LinearRegression
>>> X, y = make_regression()
>>> lr = LinearRegression()
>>> lr.fit(X, y)
>>> lr.predict(X)
>>> lr.score(X, y)

Methods

fit(self, X[, y])

Fit the model on the training data

get_params(self[, deep])

Get parameters for this estimator.

predict(self, X)

Predict values for samples in X.

score(self, X, y)

Returns the coefficient of determination R^2 of the prediction.

set_params(self, \*\*params)

Set the parameters of this estimator.

__init__(self, penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1.0, class_weight=None, random_state=None, solver='admm', max_iter=100, multi_class='ovr', verbose=0, warm_start=False, n_jobs=1, solver_kwargs=None)

Initialize self. See help(type(self)) for accurate signature.

family

The family this estimator is for.

fit(self, X, y=None)

Fit the model on the training data

Parameters
X: array-like, shape (n_samples, n_features)
yarray-like, shape (n_samples,)
Returns
selfobject
get_params(self, deep=True)

Get parameters for this estimator.

Parameters
deepboolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
paramsmapping of string to any

Parameter names mapped to their values.

predict(self, X)

Predict values for samples in X.

Parameters
Xarray-like, shape = [n_samples, n_features]
Returns
Carray, shape = [n_samples,]

Predicted value for each sample

score(self, X, y)

Returns the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.

Parameters
Xarray-like, shape = (n_samples, n_features)

Test samples.

yarray-like, shape = (n_samples) or (n_samples, n_outputs)

True values for X.

Returns
scorefloat

R^2 of self.predict(X) with respect to y.
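The definition of R^2 given for score can be checked by hand. The following is a minimal pure-Python sketch of the formula (1 - u/v), not the code that score runs internally; `r_squared` is a hypothetical name, and the sketch assumes y_true is not constant (so v is nonzero).

```python
def r_squared(y_true, y_pred):
    """Compute 1 - u/v as described for score().

    Hypothetical helper for illustration; assumes y_true is not
    constant, so the total sum of squares v is nonzero.
    """
    mean = sum(y_true) / len(y_true)
    # u: residual sum of squares
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # v: total sum of squares
    v = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - u / v


y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r_squared(y_true, y_true)                  # exact predictions
constant = r_squared(y_true, [2.5, 2.5, 2.5, 2.5])   # always predicts the mean
reversed_fit = r_squared(y_true, [4.0, 3.0, 2.0, 1.0])  # worse than the mean

print(perfect, constant, reversed_fit)  # 1.0 0.0 -3.0
```

The three cases correspond to the best possible score, the constant-model baseline, and a model that does worse than the baseline.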

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns
self
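The `<component>__<parameter>` naming used by set_params for nested objects can be illustrated with a short sketch. It assumes the key is split at the first double underscore, as in the scikit-learn convention; `split_nested_key` and the component name `lr` are hypothetical and are not part of dask-ml.

```python
def split_nested_key(key):
    """Split 'component__parameter' at the first double underscore.

    Hypothetical helper for illustration only. Returns
    (component, parameter); component is None for a plain key.
    """
    component, delim, parameter = key.partition("__")
    if not delim:
        # No double underscore: the key names a parameter of the
        # estimator itself.
        return None, key
    return component, parameter


print(split_nested_key("C"))           # (None, 'C')
print(split_nested_key("lr__C"))       # ('lr', 'C')
print(split_nested_key("pipe__lr__C")) # ('pipe', 'lr__C')
```

In the last case the remainder `lr__C` would be passed on to the component named `pipe`, which resolves it in the same way.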