teacher.datasets

The teacher.datasets module provides the datasets used to run experiments with the teacher package.

Available datasets

This module provides a load function for each of the datasets documented below.

Dataset format

Each load function returns a dict with the following keys:

  • name : str, Name of the dataset

  • df : pandas.DataFrame, Pandas DataFrame with the original data

  • columns : list, columns of the DataFrame

  • class_name : str, name of the class variable

  • possible_outcomes : list, the values of the class column

  • type_features : dict, the variables grouped by type

  • features_type : dict, the type of each feature

  • discrete : list, columns to be considered to have discrete values

  • continuous : list, columns to be considered to have continuous values

  • idx_features : dict, column name of each column once arranged in a NumPy array

  • label_encoder : sklearn.preprocessing.LabelEncoder, label encoder for the discrete values

  • X : numpy.ndarray, all columns except for the class

  • y : numpy.ndarray, class column
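
As a minimal sketch of working with this format, the snippet below checks a dict against the keys listed above. `EXPECTED_KEYS` and `missing_keys` are hypothetical helpers written for illustration and are not part of the teacher package; the `partial` dict is a hand-made stand-in, not real loader output.

```python
# Keys documented above for the dict returned by each load function.
EXPECTED_KEYS = (
    "name", "df", "columns", "class_name", "possible_outcomes",
    "type_features", "features_type", "discrete", "continuous",
    "idx_features", "label_encoder", "X", "y",
)


def missing_keys(dataset):
    """Return the documented keys absent from `dataset`, in documented order.

    Hypothetical helper for illustration; not provided by teacher.
    """
    return [key for key in EXPECTED_KEYS if key not in dataset]


# Stand-in dict for illustration only. With teacher installed, a real dict
# would presumably come from a loader such as teacher.datasets.load_iris().
partial = {"name": "iris", "class_name": "class", "columns": ["a", "class"]}
print(missing_keys(partial))
```

A dict that follows the documented format should yield an empty list, so a check like this can serve as a quick sanity test before passing a dataset to downstream code.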

Functions

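load_adult()

Loads the adult dataset

load_basket()

Loads the basket dataset

load_beer()

Loads the beer dataset

load_breast()

Loads the breast cancer dataset

load_compas()

Loads the COMPAS scores dataset

load_flavia()

Loads the FLAVIA dataset

load_german()

Loads the German credit dataset

load_heloc()

Loads the HELOC dataset

load_iris()

Loads the iris dataset

load_phishing()

Loads the phishing dataset

load_pima()

Loads the Pima Indians dataset

load_wine()

Loads the wine dataset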


teacher.datasets.load_adult(normalize=False)

Load and return the adult dataset.

Returns:

dataset

Return type:

dict

teacher.datasets.load_basket(normalize=False, reduced=False)

Load and return the basket dataset.

Returns:

dataset

Return type:

dict

teacher.datasets.load_beer(normalize=False)

Load and return the beer dataset.

Returns:

dataset

Return type:

dict

teacher.datasets.load_breast(normalize=False)

Load and return the breast cancer dataset.

Returns:

dataset

Return type:

dict

teacher.datasets.load_compas(normalize=False)

Load and return the COMPAS scores dataset.

Returns:

dataset

Return type:

dict

teacher.datasets.load_flavia(normalize=False)

Load and return the FLAVIA dataset.

Returns:

dataset

Return type:

dict

teacher.datasets.load_german(normalize=False)

Load and return the German credit dataset.

Returns:

dataset

Return type:

dict

teacher.datasets.load_heloc(normalize=False)

Load and return the HELOC dataset.

Returns:

dataset

Return type:

dict

teacher.datasets.load_iris(normalize=False)

Load and return the iris dataset.

Returns:

dataset

Return type:

dict

teacher.datasets.load_phishing(normalize=False)

Load and return the phishing dataset.

Returns:

dataset

Return type:

dict

teacher.datasets.load_pima(normalize=False)

Load and return the Pima Indians dataset.

Returns:

dataset

Return type:

dict

teacher.datasets.load_wine(normalize=False)

Load and return the wine dataset.

Returns:

dataset

Return type:

dict