Preset configurations
- class utils.presets.AbaloneDataset
Bases: LoaderBase
- __init__()
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) -> tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A list of the selected features, loss of the selected feature set, accuracy of the selected feature set, features selected per iteration.
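The way run(**kwargs) overrides the documented defaults can be pictured with a small sketch. The helper name resolve_run_params and the DOCUMENTED_DEFAULTS dictionary are hypothetical and not part of utils.presets; only the parameter names and default values are taken from the list above, and rejecting unknown keys is an assumption about sensible behavior rather than something the library is known to do.

```python
# Hypothetical helper, not part of utils.presets: it only mirrors the
# defaults documented for run(**kwargs).
DOCUMENTED_DEFAULTS = {
    "data_bin": 5,
    "target_bin": 2,
    "save_precomp": True,
    "save_precomp_path": None,
    "load_precomp_path": None,
    "is_in_pipeline": False,
    "verbose": 1,
}


def resolve_run_params(k, **kwargs):
    """Merge caller overrides over the documented defaults."""
    # Assumption: unknown keys are treated as an error.
    unknown = set(kwargs) - set(DOCUMENTED_DEFAULTS)
    if unknown:
        raise TypeError(f"unknown run parameter(s): {sorted(unknown)}")
    # k has no documented default, so it is required here.
    params = {"k": k, **DOCUMENTED_DEFAULTS}
    params.update(kwargs)
    return params


params = resolve_run_params(k=3, verbose=0, data_bin=10)
```

Any parameter left out of the call keeps its documented default, so target_bin stays at 2 in this example.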
- class utils.presets.BankDataset
Bases: LoaderBase
- __init__()
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) -> tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A list of the selected features, loss of the selected feature set, accuracy of the selected feature set, features selected per iteration.
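The interaction between save_precomp and save_precomp_path described above can be sketched as follows. The function resolve_precomp_path is a hypothetical illustration, not the library's implementation; only the fallback filename format "measures_{time.time_ns()}.npy" comes from the parameter description.

```python
import time


def resolve_precomp_path(save_precomp=True, save_precomp_path=None):
    """Return the file the precomputed measures would be saved to, or None."""
    if not save_precomp:
        # Saving is disabled, so no path is needed.
        return None
    if save_precomp_path is None:
        # Documented fallback: a timestamped file in the current directory.
        return f"measures_{time.time_ns()}.npy"
    return save_precomp_path


default_path = resolve_precomp_path()
explicit_path = resolve_precomp_path(save_precomp_path="bank_measures.npy")
```

Passing an explicit save_precomp_path makes it easier to reuse the file later through load_precomp_path, since the timestamped default differs on every run.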
- class utils.presets.CalculatedCuttingFcDataset
Bases: LoaderBase
- __init__()
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) -> tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A list of the selected features, loss of the selected feature set, accuracy of the selected feature set, features selected per iteration.
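To give a feel for what data_bin and target_bin control, here is a minimal discretization sketch. It assumes equal-width binning, which this page does not state; AHFS may use a different scheme. The function name discretize is hypothetical.

```python
def discretize(values, bins):
    """Map numeric values to bin indexes 0..bins-1 (equal-width, assumed)."""
    if bins == 0:
        # Documented behavior: 0 means no discretization.
        return list(values)
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls into the first bin.
        return [0 for _ in values]
    width = (hi - lo) / bins
    # Clamp so that the maximum value lands in the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


target = [0.0, 1.0, 2.0, 3.0, 4.0]
binary_target = discretize(target, 2)
```

With two bins, a continuous target is reduced to two classes, which matches the note that target_bin effectively sets the number of classes.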
- class utils.presets.CalculatedCuttingPDataset
Bases: LoaderBase
- __init__()
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) -> tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A list of the selected features, loss of the selected feature set, accuracy of the selected feature set, features selected per iteration.
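Because run() is annotated with two possible return shapes, calling code may want to tell them apart. The sketch below assumes the two-element form is the (X, y) pair produced when is_in_pipeline is True and the four-element form is the one described under Returns. The helper describe_result and the placeholder values are made up purely for illustration and are not real AHFS output.

```python
def describe_result(result):
    """Label the parts of a run() result based on its length (assumed)."""
    if len(result) == 2:
        # Assumed pipeline mode: transformed data and target.
        X, y = result
        return {"mode": "pipeline", "X": X, "y": y}
    selected, loss, accuracy, per_iteration = result
    return {
        "mode": "selection",
        "selected": selected,
        "loss": loss,
        "accuracy": accuracy,
        "per_iteration": per_iteration,
    }


# Placeholder values only; the dictionary key format is an assumption.
placeholder = ([0, 2], [0.5, 0.4], [0.7, 0.8], {"iter_0": 0, "iter_1": 2})
summary = describe_result(placeholder)
```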
- class utils.presets.CalculatedCuttingRaDataset
Bases: LoaderBase
- __init__()
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) -> tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A list of the selected features, loss of the selected feature set, accuracy of the selected feature set, features selected per iteration.
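The constructor parameters (target, header, sep, drop_columns, drop_rows) can be illustrated with a standard-library sketch. This is not how LoaderBase is implemented; load_csv is a hypothetical function, it reads from a string instead of a path to stay self-contained, and it assumes that all indexes refer to positions in the original file.

```python
import csv
import io


def load_csv(text, target, header=None, sep=",", drop_columns=None, drop_rows=None):
    """Split CSV text into header rows, feature rows and target rows."""
    rows = list(csv.reader(io.StringIO(text), delimiter=sep))
    # header may be None, a single index, or a list of indexes.
    if header is None:
        header_idx = []
    elif isinstance(header, int):
        header_idx = [header]
    else:
        header_idx = list(header)
    header_rows = [rows[i] for i in header_idx]
    skip_rows = set(header_idx) | set(drop_rows or [])
    skip_cols = set(drop_columns or [])
    X, y = [], []
    for i, row in enumerate(rows):
        if i in skip_rows:
            continue
        # Features: every column that is neither dropped nor a target.
        X.append([v for j, v in enumerate(row) if j not in skip_cols and j not in target])
        y.append([row[j] for j in target])
    return header_rows, X, y


sample = "a;b;c\n1;2;3\n4;5;6\n"
header_rows, X, y = load_csv(sample, target=[2], header=0, sep=";", drop_columns=[0])
```

Here column 2 is the target, row 0 is treated as the header, and column 0 is removed, leaving a single feature column.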
- class utils.presets.CalculatedCuttingTDataset
Bases: LoaderBase
- __init__()
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) -> tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A list of the selected features, loss of the selected feature set, accuracy of the selected feature set, features selected per iteration.
- class utils.presets.CarDataset
Bases: LoaderBase
- __init__()
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) -> tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A list of the selected features, loss of the selected feature set, accuracy of the selected feature set, features selected per iteration.
- class utils.presets.CommCrimeDataset
Bases: LoaderBase
- __init__()
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) -> tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A list of the selected features, loss of the selected feature set, accuracy of the selected feature set, features selected per iteration.
- class utils.presets.ForestFiresDataset
Bases: LoaderBase
- __init__()
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) -> tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A list of the selected features, loss of the selected feature set, accuracy of the selected feature set, features selected per iteration.
- class utils.presets.HousingDataset
Bases: LoaderBase
- __init__()
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) -> tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A list of the selected features, loss of the selected feature set, accuracy of the selected feature set, features selected per iteration.
- class utils.presets.IrisDataset
Bases: LoaderBase
- __init__()
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) -> tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A list of the selected features, loss of the selected feature set, accuracy of the selected feature set, features selected per iteration.
- class utils.presets.MNISTTrain
Bases: LoaderBase
- __init__()
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) -> tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.
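As a rough illustration of how keyword overrides can be layered on top of the documented defaults, consider the sketch below. `RUN_DEFAULTS` and `resolve_run_params` are hypothetical names, not part of the library; the default values are copied from the parameter list above, and `k` is treated as required because no default is documented for it.

```python
# Defaults copied from the documented parameter list; k has no documented default.
RUN_DEFAULTS = {
    "data_bin": 5,
    "target_bin": 2,
    "save_precomp": True,
    "save_precomp_path": None,
    "load_precomp_path": None,
    "is_in_pipeline": False,
    "verbose": 1,
}


def resolve_run_params(k, **kwargs):
    """Hypothetical helper: merge caller overrides onto the documented defaults."""
    unknown = set(kwargs) - set(RUN_DEFAULTS)
    if unknown:
        raise TypeError(f"unknown run parameter(s): {sorted(unknown)}")
    # Later entries win, so explicit overrides replace the defaults.
    return {"k": k, **RUN_DEFAULTS, **kwargs}


params = resolve_run_params(k=3, data_bin=10, verbose=0)
print(params)
```

Only `data_bin` and `verbose` change in this call; every other parameter keeps its documented default.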
- class utils.presets.MeasuredCuttingFcDataset¶
Bases: LoaderBase
- __init__()¶
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
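When `target` lists more than one column, the target is treated as one-hot encoded. How the library consumes those columns is not documented here; the sketch below only shows one plausible interpretation, collapsing the one-hot columns into a single class index per row. The function name `one_hot_to_labels` is an assumption made for illustration.

```python
def one_hot_to_labels(rows, target):
    """Collapse one-hot target columns into a single class index per row."""
    labels = []
    for row in rows:
        values = [row[c] for c in target]
        # The position of the largest entry is the class index.
        labels.append(values.index(max(values)))
    return labels


rows = [
    [0.2, 1.5, 1, 0, 0],
    [0.9, 0.3, 0, 0, 1],
    [0.4, 2.2, 0, 1, 0],
]
labels = one_hot_to_labels(rows, target=[2, 3, 4])
print(labels)  # [0, 2, 1]
```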
- run(**kwargs) → tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]¶
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.
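The effect of `data_bin` and `target_bin` can be pictured with a simple equal-width binning sketch. The binning strategy actually used by AHFS is not stated in this documentation, so equal-width bins are an assumption, and `equal_width_bins` is a hypothetical helper.

```python
def equal_width_bins(values, n_bins):
    """Map each value to a bin index in [0, n_bins); n_bins == 0 disables binning."""
    if n_bins == 0:
        return list(values)
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


feature = [0.0, 1.0, 2.5, 5.0, 7.5, 10.0]
target = [3.2, 8.1, 4.4, 9.9, 1.0, 6.5]

print(equal_width_bins(feature, 5))  # [0, 0, 1, 2, 3, 4]
print(equal_width_bins(target, 2))   # [0, 1, 0, 1, 0, 1]
print(equal_width_bins(target, 0))   # unchanged values, no discretization
```

With `target_bin=2` a continuous target becomes a two-class problem, which matches the note that this parameter effectively sets the number of classes.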
- class utils.presets.MeasuredCuttingPDataset¶
Bases: LoaderBase
- __init__()¶
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) → tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]¶
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.
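The interaction between `save_precomp`, `save_precomp_path` and `load_precomp_path` might look roughly like the following. `resolve_save_path` is a hypothetical helper, the `directory` argument exists only so the example can write into a temporary folder, and the array is a placeholder rather than real measures.

```python
import os
import tempfile
import time

import numpy as np


def resolve_save_path(save_precomp=True, save_precomp_path=None, directory="."):
    """Hypothetical helper: pick the .npy path following the documented rules."""
    if not save_precomp:
        return None
    if save_precomp_path is not None:
        return save_precomp_path
    # Documented default filename format: measures_{time.time_ns()}.npy
    return os.path.join(directory, f"measures_{time.time_ns()}.npy")


measures = np.arange(6, dtype=float).reshape(2, 3)  # placeholder data

with tempfile.TemporaryDirectory() as tmp:
    path = resolve_save_path(directory=tmp)
    np.save(path, measures)   # first run: precompute, then save
    reloaded = np.load(path)  # later run: load instead of precomputing
    default_name = os.path.basename(path)

print(default_name)
print(np.array_equal(measures, reloaded))
```

Passing the saved file as `load_precomp_path` on a later run is what allows the precomputing phase to be skipped.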
- class utils.presets.MeasuredCuttingRaDataset¶
Bases: LoaderBase
- __init__()¶
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) → tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]¶
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.
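The three `verbose` levels behave like a threshold on message importance. The sketch below shows one way such gating could work; `make_logger` and the sample messages are illustrative assumptions and do not reproduce the library's actual output.

```python
def make_logger(verbose=1):
    """Return a log(level, message) function that prints only when level <= verbose."""
    printed = []

    def log(level, message):
        if level <= verbose:
            printed.append(message)
            print(message)

    return log, printed


log, printed = make_logger(verbose=1)
log(1, "selected feature and metric")  # level 1: shown at verbose >= 1
log(2, "intermediate step detail")     # level 2: shown only at verbose == 2

quiet_log, quiet_printed = make_logger(verbose=0)
quiet_log(1, "selected feature and metric")  # suppressed at verbose == 0
```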
- class utils.presets.MeasuredCuttingTDataset¶
Bases: LoaderBase
- __init__()¶
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) → tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]¶
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.
- class utils.presets.MitbihTest¶
Bases: LoaderBase
- __init__()¶
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) → tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]¶
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.
- class utils.presets.ParkinsonsTelemonitoringMotorDataset¶
Bases: LoaderBase
- __init__()¶
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) → tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]¶
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.
- class utils.presets.ParkinsonsTelemonitoringTotalDataset¶
Bases: LoaderBase
- __init__()¶
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) → tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]¶
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.
- class utils.presets.ReplicatedParkinsonDataset¶
Bases: LoaderBase
- __init__()¶
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) → tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]¶
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.
- class utils.presets.SonarDataset¶
Bases: LoaderBase
- __init__()¶
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) → tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]¶
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.
- class utils.presets.SuperConductDataset¶
Bases: LoaderBase
- __init__()¶
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) → tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]¶
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.
- class utils.presets.WineDataset¶
Bases: LoaderBase
- __init__()¶
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) → tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]¶
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides AHFS run parameters. See below for detailed documentation.
k – Number of features to select. int
data_bin – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5. int
target_bin – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2. int
save_precomp – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True. bool
save_precomp_path – If save_precomp is True, sets the path for saving the precomputed basic measures. If this variable’s value is None, saves into the current directory with filename format “measures_{time.time_ns()}.npy”. Default value is None. str | None
load_precomp_path – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None. str | None
is_in_pipeline – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False. bool
verbose – Controls global verbosity, including feature selection measures. 0: no output, 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed, 2: all steps are printed. Default value is 1. int
- Returns:
A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.
- class utils.presets.WineQualityRedDataset¶
Bases: LoaderBase
- __init__()¶
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define target column(s). A list size greater than 1 implies one-hot encoding.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) → tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]¶
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides for the AHFS run parameters documented below.
k (int) – Number of features to select.
data_bin (int) – Number of bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5.
target_bin (int) – Number of bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2.
save_precomp (bool) – Whether to save all basic measures computed during the precomputing phase to a binary .npy file. The path is set by save_precomp_path. Default value is True.
save_precomp_path (str | None) – Path for saving the precomputed basic measures when save_precomp is True. If None, the file is saved in the current directory with the filename format “measures_{time.time_ns()}.npy”. Default value is None.
load_precomp_path (str | None) – Path to load the precomputed basic measures from, skipping the precomputing phase. The file must be a binary NumPy file. If None, precomputing is not skipped. Default value is None.
is_in_pipeline (bool) – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False.
verbose (int) – Controls global verbosity, including feature selection measures. 0: no output; 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed; 2: all steps are printed. Default value is 1.
- Returns:
A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.
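To make the __init__ parameters concrete, the sketch below mimics how sep, header, target, drop_columns and drop_rows could carve CSV text into a header, a feature matrix and a target. It uses only the standard library. load_csv is a hypothetical helper, the sample values are made up, and the sketch assumes that every index refers to a position in the raw file; LoaderBase itself may resolve indexes differently.

```python
import csv
import io


def load_csv(text, target, header=None, sep=",", drop_columns=None, drop_rows=None):
    """Hypothetical stand-in for the loader: split CSV text into header, X, y."""
    rows = list(csv.reader(io.StringIO(text), delimiter=sep))

    # Normalise header to a list of row indexes (None -> no header rows).
    if header is None:
        header_idx = []
    elif isinstance(header, int):
        header_idx = [header]
    else:
        header_idx = list(header)
    header_rows = [rows[i] for i in header_idx]

    # Assumption: all indexes refer to positions in the raw file.
    skip_rows = set(header_idx) | set(drop_rows or [])
    skip_cols = set(drop_columns or []) | set(target)

    X, y = [], []
    for i, row in enumerate(rows):
        if i in skip_rows:
            continue
        X.append([float(v) for j, v in enumerate(row) if j not in skip_cols])
        y.append([float(row[j]) for j in target])
    return header_rows, X, y


# Made-up, semicolon-separated sample with one header row.
sample = '"a";"b";"c";"quality"\n1.0;2.0;3.0;5\n4.0;5.0;6.0;6\n7.0;8.0;9.0;7\n'
header_rows, X, y = load_csv(
    sample, target=[3], header=0, sep=";", drop_columns=[1], drop_rows=[2]
)
```

With these arguments, column 1 and raw row 2 are removed, column 3 becomes the target, and row 0 is returned separately as the header.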
- class utils.presets.WisconsinBreastCancerDataset
Bases: LoaderBase
- __init__()
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define the target column(s). A list with more than one index implies that the target is one-hot encoded.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) → tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides for the AHFS run parameters documented below.
k (int) – Number of features to select.
data_bin (int) – Number of bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5.
target_bin (int) – Number of bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2.
save_precomp (bool) – Whether to save all basic measures computed during the precomputing phase to a binary .npy file. The path is set by save_precomp_path. Default value is True.
save_precomp_path (str | None) – Path for saving the precomputed basic measures when save_precomp is True. If None, the file is saved in the current directory with the filename format “measures_{time.time_ns()}.npy”. Default value is None.
load_precomp_path (str | None) – Path to load the precomputed basic measures from, skipping the precomputing phase. The file must be a binary NumPy file. If None, precomputing is not skipped. Default value is None.
is_in_pipeline (bool) – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False.
verbose (int) – Controls global verbosity, including feature selection measures. 0: no output; 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed; 2: all steps are printed. Default value is 1.
- Returns:
A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.
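data_bin and target_bin set how many bins the features and the target are discretized into. The documentation does not say which binning scheme AHFS uses, so the sketch below assumes equal-width bins purely for illustration. equal_width_bins is a hypothetical helper and the values are made up.

```python
def equal_width_bins(values, n_bins):
    """Map each value to a bin index in [0, n_bins - 1].

    Assumption: equal-width binning; the scheme used by AHFS is not documented here.
    n_bins == 0 mirrors the documented "no discretization" case.
    """
    if n_bins == 0:
        return list(values)
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column collapses into a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


feature = [0.0, 2.5, 5.0, 7.5, 10.0]
target = [3, 5, 6, 8]

feature_bins = equal_width_bins(feature, 5)  # data_bin default
target_bins = equal_width_bins(target, 2)    # target_bin default, i.e. two classes
```

With target_bin left at 2, a numeric target is reduced to two class labels, which is why the parameter effectively sets the number of classes.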
- class utils.presets.YearPredDataset
Bases: LoaderBase
- __init__()
Base class for dataset processing and AHFS running. Only CSV files are supported.
- Parameters:
name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define the target column(s). A list with more than one index implies that the target is one-hot encoded.
header (None | int | list[int]) – List of indexes that define header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
- run(**kwargs) → tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]
Function for running an AHFS instance on the loaded dataset.
- Parameters:
kwargs – Overrides for the AHFS run parameters documented below.
k (int) – Number of features to select.
data_bin (int) – Number of bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5.
target_bin (int) – Number of bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2.
save_precomp (bool) – Whether to save all basic measures computed during the precomputing phase to a binary .npy file. The path is set by save_precomp_path. Default value is True.
save_precomp_path (str | None) – Path for saving the precomputed basic measures when save_precomp is True. If None, the file is saved in the current directory with the filename format “measures_{time.time_ns()}.npy”. Default value is None.
load_precomp_path (str | None) – Path to load the precomputed basic measures from, skipping the precomputing phase. The file must be a binary NumPy file. If None, precomputing is not skipped. Default value is None.
is_in_pipeline (bool) – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False.
verbose (int) – Controls global verbosity, including feature selection measures. 0: no output; 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed; 2: all steps are printed. Default value is 1.
- Returns:
A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.
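The interplay of save_precomp, save_precomp_path and load_precomp_path can be summarised as a small decision: load existing measures if a load path is given, otherwise compute them and optionally save the result. The sketch below encodes that reading with a hypothetical helper, plan_precompute. It assumes that nothing is saved when measures are loaded, which the documentation does not state, and it only builds the plan without touching any files.

```python
import time


def plan_precompute(save_precomp=True, save_precomp_path=None, load_precomp_path=None):
    """Hypothetical helper: decide whether to load or compute the basic measures."""
    if load_precomp_path is not None:
        # Precomputing is skipped. Assumption: nothing is saved in this case.
        return {"action": "load", "path": load_precomp_path}
    if not save_precomp:
        return {"action": "compute", "path": None}
    if save_precomp_path is None:
        # Documented fallback: current directory, nanosecond timestamp in the name.
        save_precomp_path = f"measures_{time.time_ns()}.npy"
    return {"action": "compute", "path": save_precomp_path}


first_run = plan_precompute()
reuse = plan_precompute(load_precomp_path=first_run["path"])
```

Because the default filename contains a nanosecond timestamp, passing an explicit save_precomp_path is the simpler way to get a file that a later run can load by a known name.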