dask_ml.datasets.make_counts

dask_ml.datasets.make_counts(n_samples=1000, n_features=100, n_informative=2, scale=1.0, chunks=100, random_state=None)

Generate a dummy dataset for modeling count data.

Parameters
n_samples : int, default=1000

Number of rows in the output array.

n_features : int, default=100

Number of columns (features) in the output array.

n_informative : int, default=2

Number of features that are correlated with the outcome.

scale : float, default=1.0

Factor by which the true coefficient array is scaled.

chunks : int, default=100

Number of rows per dask array block.

random_state : int, RandomState instance or None, default=None

Determines random number generation for dataset creation. Pass an int for reproducible output across multiple function calls. See Glossary.

Returns
X : dask.array, size (n_samples, n_features)
y : dask.array, size (n_samples,)

Array of non-negative integer-valued data.

Examples

>>> from dask_ml.datasets import make_counts
>>> X, y = make_counts()
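
To make the roles of n_informative and scale more concrete, here is a minimal NumPy-only sketch of one way such count data could be generated. It is not the dask_ml implementation: the helper name make_counts_sketch, the seed argument, the uniform coefficient distribution, and the Poisson model with a log link are all assumptions made for illustration, and the result is a plain NumPy array rather than a chunked dask array.

```python
import numpy as np


def make_counts_sketch(n_samples=1000, n_features=100, n_informative=2,
                       scale=1.0, seed=None):
    """Hypothetical NumPy illustration; not part of dask_ml."""
    rng = np.random.default_rng(seed)
    # Feature matrix: independent standard-normal draws.
    X = rng.normal(size=(n_samples, n_features))
    # Choose which columns actually influence the outcome.
    informative = rng.choice(n_features, size=n_informative, replace=False)
    # Assumed coefficient distribution: uniform on [-1, 0), then scaled.
    beta = (rng.random(n_informative) - 1.0) * scale
    # Assumed log link: the Poisson rate is exp of the linear predictor.
    rate = np.exp(X[:, informative] @ beta)
    # Poisson draws are non-negative integers, one per row of X.
    y = rng.poisson(rate)
    return X, y, informative


X, y, informative = make_counts_sketch(seed=0)
print(X.shape, y.shape)  # (1000, 100) (1000,)
```

In this sketch only the columns listed in informative affect y; the remaining features are pure noise, which is what makes such a dataset useful for checking whether a count model recovers the relevant features.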