by Radosław Śmigielski
When you start working through machine learning examples, you need some example data sets. Scikit-learn provides a few good ones; here is what it offers and how to use it.
Scikit-learn comes with a number of example data sets you can use when starting with machine learning. This is how you can find them:
>>> from sklearn import datasets
>>> dir(datasets)
You can see three types of functions:
load_<SOME_DATA_SET> - Load local data sets that ship with Scikit-learn. These are part of the Python sklearn package.
All available:
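fetch_<SOME_DATA_SET> - Load external data sets, downloading them from the Internet. These are too big to include in the Python package, and some of them consist of multiple binary files.
All available:
make_<SOME_DATA_SET> - Generate synthetic data sets on the fly, with nothing to load or download.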
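The exact set of functions depends on the installed Scikit-learn version, so it is usually easier to list them than to memorize them. The following is a minimal sketch that groups the names returned by dir(datasets) by prefix. The helper name functions_with_prefix is made up for this example, and the make_ prefix refers to the synthetic data generators in sklearn.datasets.

```python
from sklearn import datasets


def functions_with_prefix(prefix):
    """Return the names in sklearn.datasets that start with the given prefix."""
    return sorted(name for name in dir(datasets) if name.startswith(prefix))


loaders = functions_with_prefix("load_")     # data bundled with the package
fetchers = functions_with_prefix("fetch_")   # data downloaded on first use
generators = functions_with_prefix("make_")  # synthetic data generators

for prefix, names in [("load_", loaders), ("fetch_", fetchers), ("make_", generators)]:
    print(f"{prefix}* ({len(names)} functions)")
    for name in names:
        print("   ", name)
```

The counts and names printed will differ between releases, which is why no output is shown here.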
And this is how you can fetch some data:
>>> california_housing = datasets.fetch_california_housing()
Downloading Cal. housing from https://ndownloader.figshare.com/files/5976036 to /home/radek/scikit_learn_data
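The download message above shows the data landing in a scikit_learn_data directory under the home directory. As a rough sketch of how that location can be inspected or changed: datasets.get_data_home() returns the cache directory, and the fetch_ functions accept a data_home argument. The wrapper fetch_housing is a hypothetical helper for this example, and it is only defined here, not called, so nothing is downloaded.

```python
from pathlib import Path

from sklearn import datasets

# Directory where fetch_* functions cache downloaded files.
# By default this is scikit_learn_data in the user's home directory.
cache_dir = Path(datasets.get_data_home())
print(cache_dir)


def fetch_housing(data_home=None):
    """Download the California housing data, or reuse the cached copy.

    Passing data_home stores the files in that directory instead of
    the default cache location.
    """
    return datasets.fetch_california_housing(data_home=data_home)
```

Calling fetch_housing() a second time should reuse the cached files instead of downloading them again.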
This is how you can load the data (note that load_boston was removed in scikit-learn 1.2, so this example needs an older release):
>>> bp = datasets.load_boston()
>>> dir(bp)
['DESCR', 'data', 'feature_names', 'filename', 'target']
>>> type(bp)
<class 'sklearn.utils.Bunch'>
Here sklearn.utils.Bunch is a dictionary-like object that exposes its keys as attributes.
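To make the "keys as attributes" behaviour concrete, here is a small sketch. It uses load_iris rather than the Boston data, only because the iris data set is bundled with the package and needs no download; the custom Bunch at the end and its field names are invented for illustration.

```python
from sklearn import datasets
from sklearn.utils import Bunch

iris = datasets.load_iris()  # bundled data set, no download needed

# Dictionary-style and attribute-style access return the same object.
same_object = iris["data"] is iris.data
print(same_object)  # True

# The usual dict methods work as well.
print(sorted(iris.keys()))

# data holds the feature matrix, target holds the labels.
print(iris.data.shape, iris.target.shape)  # (150, 4) (150,)

# A Bunch can also be built directly, like a dict with keyword arguments.
custom = Bunch(name="demo", numbers=[1, 2, 3])
print(custom.name, custom["numbers"])  # demo [1, 2, 3]
```

Avoid keys that collide with dict method names such as values or items, since attribute access would then return the method instead of the stored value.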
tags: sklearn - machine learning - Python