Fast Correlation-Based Filter

class feasel.f_FCBF.FastCorrFS(dataset: ndarray, target: ndarray, max_features: int | None = None, threshold: float = 0.01, verbose: int = 0)

Bases: FeaselBase

__init__(dataset: ndarray, target: ndarray, max_features: int | None = None, threshold: float = 0.01, verbose: int = 0)

Implements the Fast Correlation-Based Filter (FCBF) feature selection algorithm described in “Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution” by Yu and Liu. https://www.researchgate.net/publication/221345776_Feature_Selection_for_High-Dimensional_Data_A_Fast_Correlation-Based_Filter_Solution

Parameters:
  • dataset (np.ndarray) – Dataset of size (n_samples, n_features).

  • target (np.ndarray) – Target vector of size (n_samples,).

  • max_features (int | None) – Maximum number of features to select. If None, all suitable candidates are returned. Default value is None.

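  • threshold (float) – Threshold for the initial filtering step, which in the cited paper discards features whose correlation with the target is below the threshold. Default value is 0.01.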

  • verbose (int) – Verbosity level. 0: no output; 1: prints execution time, selected feature and metric; 2: prints every step. Turning off parallel execution is recommended when verbose is 2. Note that higher verbosity increases execution time. Default value is 0.

Variables:
  • _flags – Dictionary of flags describing the capabilities of the algorithm.

  • _measures_used – Set of measure names used in the algorithm.

Returns:

None

Return type:

None
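
The procedure in the cited paper can be sketched as follows. This is a minimal illustration, not the feasel implementation: it assumes discrete feature values, uses symmetrical uncertainty (SU) as the correlation measure as in the paper, and takes the features as a plain list of columns. The names entropy, symmetrical_uncertainty and fcbf_sketch are hypothetical, and the way max_features is applied here is an assumption.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(x, y) = 2 * I(x; y) / (H(x) + H(y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0.0:
        return 0.0  # both variables are constant
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def fcbf_sketch(columns, target, threshold=0.01, max_features=None):
    """Return the indices of the selected columns, most relevant first."""
    # Step 1: keep features whose SU with the target reaches the threshold,
    # sorted by decreasing relevance.
    relevance = {i: symmetrical_uncertainty(col, target)
                 for i, col in enumerate(columns)}
    candidates = sorted((i for i, su in relevance.items() if su >= threshold),
                        key=lambda i: relevance[i], reverse=True)

    # Step 2: repeatedly accept the most relevant remaining feature and drop
    # every later feature that correlates with it at least as strongly as
    # that feature correlates with the target.
    selected = []
    while candidates:
        best = candidates.pop(0)
        selected.append(best)
        if max_features is not None and len(selected) >= max_features:
            break
        candidates = [i for i in candidates
                      if symmetrical_uncertainty(columns[best], columns[i])
                      < relevance[i]]
    return selected


# Toy data: the target has four classes encoded by two binary features.
target = [0, 0, 1, 1, 2, 2, 3, 3]
columns = [
    [0, 0, 0, 0, 1, 1, 1, 1],  # feature 0: high bit of the target
    [0, 0, 1, 1, 0, 0, 1, 1],  # feature 1: low bit of the target
    [0, 0, 0, 0, 1, 1, 1, 1],  # feature 2: exact copy of feature 0
    [0, 1, 0, 1, 0, 1, 0, 1],  # feature 3: independent of the target
]
print(fcbf_sketch(columns, target))  # [0, 1]
```

In this toy run, feature 3 is removed by the threshold because its SU with the target is 0, and feature 2 is removed as redundant with feature 0.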

transform() → tuple[set[int], ndarray[int], float]

Applies the algorithm to the dataset and target passed to the constructor.

Returns:

A tuple of three elements: the set of selected feature indices, the same indices as an array in selection order, and the execution time.

Return type:

tuple[set[int], np.ndarray[int], float]
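
A short sketch of how the returned tuple might be consumed. The tuple below is a hand-made stand-in with invented values, not output produced by the library, and it assumes that the second element lists the selected indices in the order they were chosen.

```python
import numpy as np

# Stand-in for the value returned by transform(); the numbers are invented
# for illustration only.
selected, ordered, elapsed = {4, 1}, np.array([4, 1]), 0.02

# Toy dataset of shape (n_samples, n_features) = (3, 5).
dataset = np.arange(15).reshape(3, 5)

# Keep only the selected columns, preserving the selection order.
reduced = dataset[:, ordered]

# Membership tests are cheap on the set form of the result.
is_kept = [i in selected for i in range(dataset.shape[1])]
```

Indexing with the ordered array keeps the columns in selection order, while the set is convenient for checking whether a given feature was kept.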