dask_ml.linear_model.LinearRegression¶
-
class dask_ml.linear_model.LinearRegression(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1.0, class_weight=None, random_state=None, solver='admm', max_iter=100, multi_class='ovr', verbose=0, warm_start=False, n_jobs=1, solver_kwargs=None)¶ Estimator for linear regression.
Parameters: - penalty : str or Regularizer, default ‘l2’
Regularizer to use. Only relevant for the ‘admm’, ‘lbfgs’ and ‘proximal_grad’ solvers.
For string values, only ‘l1’ or ‘l2’ are valid.
- dual : bool
Ignored
- tol : float, default 1e-4
The tolerance for convergence.
- C : float
Regularization strength. Note that dask-glm solvers use the parameterization \(\lambda = 1 / C\).
- fit_intercept : bool, default True
Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.
- intercept_scaling : float
Ignored
- class_weight : dict or ‘balanced’
Ignored
- random_state : int, RandomState, or None
The seed of the pseudo random number generator to use when shuffling the data. If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random. Used when solver == ‘sag’ or ‘liblinear’.
- solver : {‘admm’, ‘gradient_descent’, ‘newton’, ‘lbfgs’, ‘proximal_grad’}
Solver to use. See Algorithms for details.
- max_iter : int, default 100
Maximum number of iterations taken for the solvers to converge.
- multi_class : str, default ‘ovr’
Ignored. Multiclass solvers not currently supported.
- verbose : int, default 0
Ignored
- warm_start : bool, default False
Ignored
- n_jobs : int, default 1
Ignored
- solver_kwargs : dict, optional, default None
Extra keyword arguments to pass through to the solver.
Attributes: - coef_ : array, shape (n_features,)
The learned values for the model’s coefficients
- intercept_ : float or None
The learned value for the intercept, if one was added to the model
Examples
>>> from dask_glm.datasets import make_regression
>>> X, y = make_regression()
>>> lr = LinearRegression()
>>> lr.fit(X, y)
>>> lr.predict(X)
>>> lr.score(X, y)
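A slightly fuller sketch showing the constructor parameters in action. The keyword arguments to make_regression and the choice of penalty, C, and solver below are illustrative assumptions rather than part of the documented example; recall that C maps to the regularization strength via \(\lambda = 1 / C\).
>>> from dask_ml.linear_model import LinearRegression
>>> from dask_glm.datasets import make_regression
>>> X, y = make_regression(n_samples=1000, chunksize=100)  # dask arrays (assumed keywords)
>>> lr = LinearRegression(penalty='l1', C=10.0, solver='admm', max_iter=200)
>>> lr.fit(X, y)
>>> lr.coef_       # fitted coefficients
>>> lr.intercept_  # fitted intercept (fit_intercept=True by default)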
Methods
fit(X[, y])	Fit the model on the training data.
get_params([deep])	Get parameters for this estimator.
predict(X)	Predict values for samples in X.
score(X, y)	Returns the coefficient of determination R^2 of the prediction.
set_params(**params)	Set the parameters of this estimator.
-
__init__(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1.0, class_weight=None, random_state=None, solver='admm', max_iter=100, multi_class='ovr', verbose=0, warm_start=False, n_jobs=1, solver_kwargs=None)¶ Initialize self. See help(type(self)) for accurate signature.
-
family¶ The family this estimator is for.
-
fit(X, y=None)¶ Fit the model on the training data.
Parameters: - X : array-like, shape (n_samples, n_features)
Training data.
- y : array-like, shape (n_samples,)
Target values.
Returns: - self : object
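A minimal fitting sketch, assuming X and y are dask arrays; the array shapes, chunk sizes, and the 'lbfgs' solver choice are illustrative:
>>> import numpy as np
>>> import dask.array as da
>>> from dask_ml.linear_model import LinearRegression
>>> X = da.random.random((1000, 5), chunks=(100, 5))   # chunked training data
>>> true_coef = np.array([1.0, 2.0, 0.0, -1.0, 0.5])
>>> y = X.dot(true_coef) + 3.0                         # linear target with intercept 3
>>> lr = LinearRegression(solver='lbfgs')
>>> lr.fit(X, y)
>>> lr.coef_, lr.intercept_  # close to true_coef and 3, up to the default l2 regularization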
-
get_params(deep=True)¶ Get parameters for this estimator.
Parameters: - deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: - params : mapping of string to any
Parameter names mapped to their values.
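For instance (a brief sketch; the returned mapping mirrors the constructor arguments):
>>> lr = LinearRegression(C=5.0)
>>> lr.get_params()['C']
5.0
>>> lr.get_params(deep=True)  # also descends into nested estimators, if any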
-
predict(X)¶ Predict values for samples in X.
Parameters: - X : array-like, shape = [n_samples, n_features]
Returns: - C : array, shape = [n_samples,]
Predicted value for each sample
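Continuing the fit sketch above: when X is a dask array the predictions are typically themselves a lazy dask array (an assumption about the input type, not a guarantee of this method), so a final compute step materializes them:
>>> y_hat = lr.predict(X)  # lazy dask array when X is a dask array
>>> y_hat.compute()        # materialize the predictions as a NumPy array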
-
score(X, y)¶ Returns the coefficient of determination R^2 of the prediction.
The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.
Parameters: - X : array-like, shape = (n_samples, n_features)
Test samples.
- y : array-like, shape = (n_samples) or (n_samples, n_outputs)
True values for X.
Returns: - score : float
R^2 of self.predict(X) wrt. y.
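A small worked illustration of the definition above, computed directly with NumPy rather than through the estimator; the numbers are arbitrary:
>>> import numpy as np
>>> y_true = np.array([3.0, -0.5, 2.0, 7.0])
>>> y_pred = np.array([2.5, 0.0, 2.0, 8.0])
>>> u = ((y_true - y_pred) ** 2).sum()         # residual sum of squares = 1.5
>>> v = ((y_true - y_true.mean()) ** 2).sum()  # total sum of squares = 29.1875
>>> 1 - u / v                                  # R^2, approximately 0.9486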
-
set_params(**params)¶ Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Parameters: - **params : dict
Estimator parameters.
Returns: - self : object
Estimator instance.
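A short sketch of the nested form, assuming a scikit-learn Pipeline with illustrative step names ('scale', 'lr'); only the parameter-setting calls are shown:
>>> from sklearn.pipeline import Pipeline
>>> from sklearn.preprocessing import StandardScaler
>>> from dask_ml.linear_model import LinearRegression
>>> pipe = Pipeline([('scale', StandardScaler()), ('lr', LinearRegression())])
>>> pipe.set_params(lr__C=5.0, lr__solver='lbfgs')  # <component>__<parameter> form
>>> lr = LinearRegression()
>>> lr.set_params(C=0.5)                            # simple, non-nested case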