dask_ml.xgboost.train
dask_ml.xgboost.train(client, params, data, labels, dmatrix_kwargs={}, evals_result=None, sample_weight=None, **kwargs)
Train an XGBoost model on a Dask cluster.
This starts XGBoost on all Dask workers, moves input data to those workers, and then calls xgboost.train on the inputs.
- Parameters
- client: dask.distributed.Client
- params: dict
Parameters to give to XGBoost (see xgboost.train)
- data: dask.array or dask.dataframe
- labels: dask.array or dask.dataframe
- dmatrix_kwargs: dict, optional
Keywords to give to XGBoost DMatrix
- evals_result: dict, optional
Stores the evaluation result history of all the items in the eval_set by mutating evals_result in place.
- sample_weight: array_like, optional
Instance weights.
- **kwargs: Keywords to give to xgboost.train
See also
Examples
>>> client = Client('scheduler-address:8786')
>>> data = dd.read_csv('s3://...')
>>> labels = data['outcome']
>>> del data['outcome']
>>> train(client, params, data, labels, **normal_kwargs)
<xgboost.core.Booster object at ...>
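
The evals_result parameter is filled by mutation, not returned, so the caller must create the dict and keep a reference to it. The following is a minimal sketch of that calling pattern. It runs without a cluster because it uses fake_train, a hypothetical stand-in that is not part of dask_ml. The helper train_with_history is also hypothetical, and the history layout shown ({eval_name: {metric: [values]}}) is assumed to mirror what xgboost.train records.

```python
def fake_train(client, params, data, labels, evals_result=None, **kwargs):
    """Hypothetical stand-in with the same calling convention as train."""
    # Assumed history layout, mirroring xgboost.train:
    # {eval_name: {metric_name: [value per boosting round]}}
    history = {"train": {"rmse": [0.9, 0.7, 0.6]}}
    if evals_result is not None:
        # Mutate the caller's dict in place; nothing extra is returned.
        evals_result.update(history)
    return "booster-placeholder"


def train_with_history(train_fn, client, params, data, labels, **kwargs):
    """Call a train-like function and return (model, history)."""
    history = {}  # created by the caller, filled in place by train_fn
    model = train_fn(client, params, data, labels,
                     evals_result=history, **kwargs)
    return model, history


# Placeholder arguments only; a real call needs a Client and dask collections.
model, history = train_with_history(fake_train, None, {"max_depth": 3}, [], [])
print(history["train"]["rmse"])
```

With a running cluster, the same helper could presumably be called with dask_ml.xgboost.train in place of fake_train, plus whatever evaluation keywords xgboost.train expects in **kwargs. Passing evals_result=None (the default) skips history collection.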