Preset loader

class utils.helpers.loader.LoaderBase(name: str, path: str, target: list[int], header: None | int | list[int] = None, sep: str = ',', drop_columns: None | list[int] = None, drop_rows: None | list[int] = None)

Bases: object

__init__(name: str, path: str, target: list[int], header: None | int | list[int] = None, sep: str = ',', drop_columns: None | list[int] = None, drop_rows: None | list[int] = None)

Base class for loading and preprocessing a dataset and running AHFS (Adaptive, Hybrid Feature Selection) on it. Only CSV files are supported.

Parameters:

name (str) – Name of the instance.
path (str) – Path to the CSV dataset.
target (list[int]) – List of indexes that define the target column(s). A list with more than one index implies that the target is one-hot encoded.
header (None | int | list[int]) – Index or list of indexes that define the header row(s). If None, no header is extracted from the data. Default value is None.
sep (str) – Separator character. Default value is “,”.
drop_columns (None | list[int]) – List of indexes that define which column(s) to remove. If None, no removal is done. Default value is None.
drop_rows (None | list[int]) – List of indexes that define which row(s) to remove. If None, no removal is done. Default value is None.
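The constructor parameters above can be sketched as follows. This is a minimal illustration, not a verified recipe: the import path is taken from the class name shown above, the CSV contents and the helper names `write_demo_csv` and `loader_kwargs` are hypothetical, and it assumes that `LoaderBase` can be instantiated directly and that `target` and `drop_columns` refer to column positions in the original file (the parameter list does not state the order in which they are applied).

```python
import csv
import tempfile
from pathlib import Path


def write_demo_csv(directory):
    """Write a tiny semicolon-separated CSV with one header row; return its path."""
    path = Path(directory) / "demo.csv"
    rows = [
        ["id", "f1", "f2", "label"],
        [1, 0.5, 1.2, 0],
        [2, 0.1, 3.4, 1],
        [3, 0.9, 2.2, 0],
    ]
    with open(path, "w", newline="") as fh:
        csv.writer(fh, delimiter=";").writerows(rows)
    return str(path)


def loader_kwargs(csv_path):
    """Keyword arguments matching the documented LoaderBase signature."""
    return {
        "name": "demo",
        "path": csv_path,
        "target": [3],        # assumption: index of "label" in the original file
        "header": 0,          # the first row holds the column names
        "sep": ";",           # the demo file is semicolon-separated
        "drop_columns": [0],  # remove the "id" column
        "drop_rows": None,    # keep every row
    }


try:
    # Import path taken from the documented class name; may differ per install.
    from utils.helpers.loader import LoaderBase
except ImportError:
    LoaderBase = None

with tempfile.TemporaryDirectory() as tmp:
    kwargs = loader_kwargs(write_demo_csv(tmp))
    if LoaderBase is not None:
        loader = LoaderBase(**kwargs)
```

Collecting the arguments in a dictionary keeps the call site short and makes it easy to check them against the signature before constructing the loader.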

run(**kwargs) → tuple[list[int], list[float], list[float], dict[str, int]] | tuple[numpy.ndarray, numpy.ndarray]

Runs an AHFS instance on the loaded dataset.

Parameters:

kwargs – Overrides for the AHFS run parameters listed below.
k (int) – Number of features to select.
data_bin (int) – Number of bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5.
target_bin (int) – Number of bins to discretize the target into; this effectively sets the number of classes. If 0, no discretization is performed. Default value is 2.
save_precomp (bool) – Whether to save all basic measures computed during the precomputing phase to a binary .npy file. The path is set by save_precomp_path. Default value is True.
save_precomp_path (str | None) – Path for saving the precomputed basic measures when save_precomp is True. If None, the file is saved in the current directory with the filename format “measures_{time.time_ns()}.npy”. Default value is None.
load_precomp_path (str | None) – Path to load precomputed basic measures from, which skips the precomputing phase. The file must be a binary NumPy file. If None, precomputing is not skipped. Default value is None.
is_in_pipeline (bool) – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False.
verbose (int) – Controls global verbosity, including feature selection measures. 0: no output; 1: feature selection measure execution time, selected feature and metric, and basic algorithm steps are printed; 2: all steps are printed. Default value is 1.
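One way to combine these overrides is to save the precomputed measures on a first run and load them on later runs. The sketch below only builds the keyword dictionaries. The helper names are hypothetical, and it assumes that save_precomp_path accepts a full file path, which the parameter description does not confirm.

```python
import time


def default_precomp_filename():
    """Build a name in the documented default format measures_{time.time_ns()}.npy."""
    return f"measures_{time.time_ns()}.npy"


def build_run_overrides(k, precomp_path, reuse=False):
    """Return keyword overrides for run() (hypothetical helper).

    reuse=False: compute the basic measures and save them to precomp_path.
    reuse=True:  load the measures from precomp_path and skip precomputing.
    """
    overrides = {
        "k": k,            # number of features to select
        "data_bin": 5,     # documented default
        "target_bin": 2,   # documented default
        "verbose": 0,      # no output
    }
    if reuse:
        overrides["load_precomp_path"] = precomp_path
        overrides["save_precomp"] = False  # nothing new to save
    else:
        overrides["save_precomp"] = True
        # Assumption: a file path (not a directory) is accepted here.
        overrides["save_precomp_path"] = precomp_path
    return overrides


measures_file = default_precomp_filename()
first_run = build_run_overrides(k=10, precomp_path=measures_file)
later_run = build_run_overrides(k=10, precomp_path=measures_file, reuse=True)
# With a loader instance these would be passed as loader.run(**first_run).
```

Fixing the filename up front avoids having to look up the timestamped name that is generated when save_precomp_path is left as None.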

Returns:

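A tuple containing the list of selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration. The second form in the signature, a pair of NumPy arrays, presumably applies when is_in_pipeline is True, in which case X and y are returned.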
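Because the signature lists two return shapes, calling code may want to tell them apart before unpacking. The following is a small sketch with a hypothetical helper, `summarize_run`; the tuple passed to it contains made-up placeholder values, not real AHFS output, and the meaning of the dictionary keys is not specified here.

```python
def summarize_run(result):
    """Label the two return shapes listed in the run() signature."""
    if len(result) == 4:
        selected, loss, accuracy, per_iteration = result
        return {
            "mode": "selection",
            "selected": list(selected),
            "loss": list(loss),
            "accuracy": list(accuracy),
            "per_iteration": dict(per_iteration),
        }
    if len(result) == 2:
        # Assumption: this is the (X, y) pair used inside a pipeline.
        X, y = result
        return {"mode": "pipeline", "X": X, "y": y}
    raise ValueError(f"unexpected run() result of length {len(result)}")


# Placeholder values only, not real AHFS output.
fake_result = ([4, 1], [0.42, 0.35], [0.81, 0.86], {"0": 4, "1": 1})
summary = summarize_run(fake_result)
print(summary["mode"], summary["selected"])
```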

save(path: str | None = None) → None

Saves the transformed preset dataset into a .csv file.

Parameters:

path (str | None) – Path to save the dataset to. If None, the dataset is saved in the working directory. Default value is None.
