teacher.datasets
The teacher.datasets module provides the datasets
used to run experiments with the teacher package.
Available datasets
This module includes load functions for the following datasets:
Adult: Ron Kohavi, “Scaling Up the Accuracy of Naive-Bayes Classifiers: a Decision-Tree Hybrid”, Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, 1996 [dataset]
Breast: O. L. Mangasarian and W. H. Wolberg, “Cancer diagnosis via linear programming,” Dept. Comput. Sci., Univ. Wisconsin-Madison, Madison, WI, USA, Tech. Rep., 1990. [dataset]
Compas: J. Skeem and J. Eno Louden, “Assessment of evidence on the quality of the correctional offender management profiling for alternative sanctions (COMPAS),” California Dept. Corrections Rehabilitation, 2007. [dataset]
German: [dataset]
Heloc: [dataset]
Pima: [dataset]
Dataset format
Each load function returns a dict with the following keys:
name : str, name of the dataset
df : pandas.DataFrame, Pandas DataFrame with the original data
columns : list, columns of the DataFrame
class_name : str, name of the class variable
possible_outcomes : list, the values of the class column
type_features : dict, the variables grouped by type
features_type : dict, the type of each feature
discrete : list, columns to be considered to have discrete values
continuous : list, columns to be considered to have continuous values
idx_features : dict, column name of each column once arranged in a NumPy array
label_encoder : sklearn.preprocessing.LabelEncoder, label encoder for the discrete values
X : numpy.ndarray, all columns except for the class
y : numpy.ndarray, class column
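As a quick illustration of the format above, the following minimal sketch checks whether a dict carries all of the documented keys. The names EXPECTED_KEYS and missing_keys are hypothetical helpers, not part of the teacher package, and the toy dict stands in for the result of a real load function so that the example runs without the package installed.

```python
# Keys documented for the dict returned by the load functions.
EXPECTED_KEYS = {
    "name", "df", "columns", "class_name", "possible_outcomes",
    "type_features", "features_type", "discrete", "continuous",
    "idx_features", "label_encoder", "X", "y",
}


def missing_keys(dataset):
    """Return the documented keys that are absent from `dataset`, sorted."""
    return sorted(EXPECTED_KEYS - set(dataset))


# Toy stand-in (an assumption for illustration); a real dataset would
# come from one of the load functions, e.g. load_adult().
toy = {"name": "toy", "columns": ["a", "b"], "class_name": "b"}
print(missing_keys(toy))
```

With the package installed, the same check could be applied to the return value of any load function.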
Functions
load_adult(): Loads the adult dataset
load_beer(): Loads the beer dataset
load_breast(): Loads the breast dataset
load_compas(): Loads the compas dataset
load_heloc(): Loads the heloc dataset
load_pima(): Loads the pima dataset
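Because the loaders share the load_<name> naming pattern, one possible way to pick a dataset by name is sketched below. The helper load_by_name is hypothetical and not part of the teacher package. It takes the module as an argument, so a small stub stands in for teacher.datasets here and the example runs without the package installed.

```python
from types import SimpleNamespace


def load_by_name(module, name, **kwargs):
    """Call module.load_<name>(**kwargs) and return its result."""
    loader = getattr(module, f"load_{name}", None)
    if loader is None:
        raise ValueError(f"no loader named load_{name}")
    return loader(**kwargs)


# Stub standing in for teacher.datasets (an assumption for illustration);
# the real loader returns the full dict described under "Dataset format".
stub = SimpleNamespace(load_adult=lambda normalize=False: {"name": "adult"})

example = load_by_name(stub, "adult", normalize=True)
print(example["name"])
```

With the package installed, passing the imported teacher.datasets module in place of the stub should dispatch to the real loaders.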
- teacher.datasets.load_adult(normalize=False)
Load and return the adult dataset.
- Returns:
dataset
- Return type:
dict
- teacher.datasets.load_basket(normalize=False, reduced=False)
Load and return the basket dataset.
- Returns:
dataset
- Return type:
dict
- teacher.datasets.load_beer(normalize=False)
Load and return the beer dataset.
- Returns:
dataset
- Return type:
dict
- teacher.datasets.load_breast(normalize=False)
Load and return the breast cancer dataset.
- Returns:
dataset
- Return type:
dict
- teacher.datasets.load_compas(normalize=False)
Load and return the COMPAS scores dataset.
- Returns:
dataset
- Return type:
dict
- teacher.datasets.load_flavia(normalize=False)
Load and return the FLAVIA dataset.
- Returns:
dataset
- Return type:
dict
- teacher.datasets.load_german(normalize=False)
Load and return the German credit dataset.
- Returns:
dataset
- Return type:
dict
- teacher.datasets.load_heloc(normalize=False)
Load and return the HELOC dataset.
- Returns:
dataset
- Return type:
dict
- teacher.datasets.load_iris(normalize=False)
Load and return the iris dataset.
- Returns:
dataset
- Return type:
dict
- teacher.datasets.load_phishing(normalize=False)
Load and return the phishing dataset.
- Returns:
dataset
- Return type:
dict