AHFS class

class ahfs_class.ahfs.AHFS(k: int, data_bin: int = 5, target_bin: int = 2, save_precomp: bool = True, save_precomp_path: str | None = None, load_precomp_path: str | None = None, is_in_pipeline: bool = False, verbose: int = 1)

__init__(k: int, data_bin: int = 5, target_bin: int = 2, save_precomp: bool = True, save_precomp_path: str | None = None, load_precomp_path: str | None = None, is_in_pipeline: bool = False, verbose: int = 1)

Implements the Adaptive Hybrid Feature Selection (AHFS) algorithm by Viharos et al. (https://doi.org/10.1016/j.patcog.2021.107932).

Parameters:

k (int) – Number of features to select.
data_bin (int) – How many bins to discretize the dataset into, excluding the target variable. If 0, no discretization is performed. Default value is 5.
target_bin (int) – How many bins to discretize the target into. Effectively sets the number of classes. If 0, no discretization is performed. Default value is 2.
save_precomp (bool) – Whether to save all basic measures computed during the precomputing phase in a binary .npy file. Path is set by save_precomp_path. Default value is True.
save_precomp_path (str | None) – If save_precomp is True, sets the path for saving the precomputed basic measures. If None, the file is saved in the current directory with the filename format "measures_{time.time_ns()}.npy". Default value is None.
load_precomp_path (str | None) – Path to load the precomputed basic measures from, skipping the precomputing phase. File must be a binary numpy file. If None, precomputing is not skipped. Default value is None.
is_in_pipeline (bool) – Set to True if the algorithm is intended to be part of a scikit-learn pipeline. Changes the return value of transform() to X and y. Default value is False.
verbose (int) – Controls global verbosity, including that of the feature selection measures. 0: no output; 1: prints the execution time of each feature selection measure, the selected feature and metric, and the basic algorithm steps; 2: prints all steps. Default value is 1.
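The following sketch illustrates two of the parameters above using NumPy only. It does not call the AHFS class. The binning strategy is an assumption: the documentation only gives the number of bins, so equal-width binning is used here for illustration, and the library may bin differently. The helper names `discretize` and `default_precomp_path` are hypothetical.

```python
import time

import numpy as np


def discretize(values: np.ndarray, n_bins: int) -> np.ndarray:
    """Map values to integer bin labels 0..n_bins-1.

    Equal-width binning is an ASSUMPTION made for this sketch;
    the AHFS implementation may use a different strategy.
    """
    if n_bins == 0:
        # Mirrors the documented rule: 0 means no discretization.
        return values
    lo, hi = values.min(), values.max()
    if lo == hi:
        # A constant column falls into a single bin.
        return np.zeros(values.shape, dtype=int)
    # n_bins - 1 interior edges split [lo, hi] into n_bins intervals.
    edges = np.linspace(lo, hi, n_bins + 1)[1:-1]
    return np.digitize(values, edges)


def default_precomp_path() -> str:
    """Build a file name in the documented default format."""
    return f"measures_{time.time_ns()}.npy"
```

With `data_bin=5`, each feature column would take at most five distinct labels under this scheme, and with `target_bin=2` a continuous target would become a two-class label.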

entropies_wrapper(index: tuple[int, int] | tuple[int, int, int]) → list[float, tuple]

evaluate() → tuple[int, float | floating, float | floating]

Evaluates the selected feature set using a specific evaluator.

Returns: The newly selected feature, loss, accuracy.
Return type: tuple[int, float, float]

fit(X: ndarray, y: ndarray) → None

Placeholder function to ensure compatibility with the scikit-learn interface.

nn_one_fold(candidate: int, train_index: ndarray, test_index: ndarray, nn_layers: list[[int, Callable[[float], float] | None]]) → tuple[int, float, float]

Runs one fold of an evaluation. Used to execute the evaluation phase in parallel when the CPU is used.

Parameters:

candidate (int) – Candidate feature index.
train_index (np.ndarray) – Row indices of the training samples.
test_index (np.ndarray) – Row indices of the test samples.
nn_layers (list[[int, Callable[[float], float] | None]]) – Layers of the neural network.

Returns:

Candidate index, loss and accuracy associated with the candidate.

Return type:

tuple[int, float, float]
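As a rough illustration of the per-fold pattern described above, the sketch below scores one candidate feature on several folds in parallel and returns a (candidate, loss, accuracy) tuple. It is not the library's code. Several things are assumptions: the function names and signatures are hypothetical (the real method reads its data from the instance and takes nn_layers), a nearest-class-mean classifier stands in for the neural network, 0-1 loss stands in for the network loss, and a thread pool is used for simplicity.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def one_fold(candidate, train_index, test_index, X, y):
    """Score one candidate feature on one fold (hypothetical helper)."""
    x_train, x_test = X[train_index, candidate], X[test_index, candidate]
    y_train, y_test = y[train_index], y[test_index]
    # Stand-in model, NOT a neural network: predict the class whose
    # training mean on this single feature is closest to the sample.
    classes = np.unique(y_train)
    means = np.array([x_train[y_train == c].mean() for c in classes])
    nearest = np.abs(x_test[:, None] - means[None, :]).argmin(axis=1)
    accuracy = float((classes[nearest] == y_test).mean())
    loss = 1.0 - accuracy  # 0-1 loss, used in place of a network loss
    return candidate, loss, accuracy


def k_fold_indices(n_samples, n_folds, seed=0):
    """Yield (train_index, test_index) pairs for a shuffled k-fold split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for i in range(n_folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, folds[i]


def evaluate_candidate(candidate, X, y, n_folds=5):
    """Run all folds concurrently and average loss and accuracy."""
    splits = list(k_fold_indices(len(y), n_folds))
    with ThreadPoolExecutor() as pool:
        results = list(
            pool.map(lambda s: one_fold(candidate, s[0], s[1], X, y), splits)
        )
    mean_loss = float(np.mean([r[1] for r in results]))
    mean_acc = float(np.mean([r[2] for r in results]))
    return candidate, mean_loss, mean_acc
```

Each fold is independent of the others, which is what makes the per-fold function a natural unit of parallel work.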

transform(X: ndarray, y: ndarray) → tuple[list[int], list[float], list[float], dict[str, int], list[float], float] | tuple[ndarray, ndarray]

Applies the Adaptive Hybrid Feature Selection algorithm on the dataset.

Parameters:

X (np.ndarray) – Numpy array holding the data.
y (np.ndarray) – Target vector.

Returns:

If is_in_pipeline was set to False: a list of the selected features, the loss of the selected feature set, the accuracy of the selected feature set, and the features selected per iteration.

If is_in_pipeline was set to True: the original dataset containing only the selected features, and the target variable.
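For the pipeline case, "the original dataset containing only the selected features" amounts to column selection on the input array. A minimal sketch with NumPy, assuming the selected features are given as column indices. The helper name `keep_selected` and the example indices are hypothetical and are not part of the AHFS API.

```python
import numpy as np


def keep_selected(X: np.ndarray, y: np.ndarray, selected: list[int]):
    """Return X restricted to the selected columns, together with y."""
    # Integer-array indexing keeps the columns in the order given.
    return X[:, selected], y


X = np.arange(12).reshape(3, 4)   # 3 samples, 4 features
y = np.array([0, 1, 0])
X_sel, y_out = keep_selected(X, y, [2, 0])  # hypothetical selection
```

The result has one column per selected feature, and the target vector is passed through unchanged.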
