Postprocessing, multiple comparisons

class utils.postprocessing.aggregated_metrics(feature_dominance: bool = True, acc_err: bool = True, time: bool = True)

Bases: object

__init__(feature_dominance: bool = True, acc_err: bool = True, time: bool = True)

Aggregates feature order and priority, accuracy, and error across multiple AHFS (Adaptive, Hybrid Feature Selection) runs read from logs, and saves the results as figures.

Parameters:

feature_dominance (bool) – If True, calculates feature dominance and the related score. Default value is True.
acc_err (bool) – If True, calculates the mean accuracy and mean error. Default value is True.
time (bool) – If True, aggregates evaluation time and total runtime. Default value is True.

compare_two(set_a: dict[str, dict | str], set_b: dict[str, dict | str], feat_count: dict[str, int], save_path: str) → None

Compares two dictionaries of AHFS evaluations, plots the differences and saves them as figures. Only feature scoring and loss values are compared.

Parameters:

set_a (dict[str, dict | str]) – Dictionary with the keys all_selected, all_err, and label. The label value is a string; the all_selected and all_err values should follow the format of the corresponding variables returned by read_mat_logs or read_txt_logs.
set_b (dict[str, dict | str]) – See set_a. The dataset names used as keys inside set_a and set_b must be identical.
feat_count (dict[str, int]) – Number of selected features for each dataset.
save_path (str) – Directory where all figures should be saved.

Returns:

None
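
Example (illustrative sketch): the layout described above can be assembled and sanity-checked before calling compare_two. The dataset name "iris", the nested-list values, and the helper check_matching_datasets are assumptions made for illustration and are not part of the package. In practice the all_selected and all_err values would come from read_mat_logs or read_txt_logs.

```python
def check_matching_datasets(set_a: dict, set_b: dict) -> list:
    """Hypothetical helper: confirm both sets use the same dataset names."""
    for key in ("all_selected", "all_err", "label"):
        if key not in set_a or key not in set_b:
            raise KeyError(f"both sets need the key {key!r}")
    for key in ("all_selected", "all_err"):
        if set(set_a[key]) != set(set_b[key]):
            raise ValueError(f"dataset names differ under {key!r}")
    return sorted(set_a["all_selected"])


def make_set(label: str, selected: list, err: list) -> dict:
    """Build one input dictionary for a single made-up dataset."""
    return {
        "all_selected": {"iris": selected},  # real values: arrays from the readers
        "all_err": {"iris": err},
        "label": label,
    }


# Made-up values: 2 runs x 3 iterations for one dataset.
set_a = make_set("baseline", [[0, 2, 1], [0, 1, 2]], [[0.9, 0.5, 0.3], [0.8, 0.6, 0.4]])
set_b = make_set("tuned", [[2, 0, 1], [2, 1, 0]], [[0.7, 0.4, 0.2], [0.6, 0.3, 0.2]])
feat_count = {"iris": 3}

datasets = check_matching_datasets(set_a, set_b)

# With the package importable, the call would then look like:
# aggregated_metrics().compare_two(set_a, set_b, feat_count, save_path="figures/")
```

Checking the dataset names up front gives a clearer error than a failed lookup during plotting.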

read_mat_logs(superdir: str)

Reads the .mat log files found in a directory. The file naming format is {dataset_name}-{log_type}_{arbitrary_text}.mat. Recognized log types are E, fo, and RR. For RR logs, no separator or arbitrary text should appear between RR and the file extension.

Parameters:

superdir (str) – A directory containing all .mat log files.

Returns:

A tuple of dictionaries holding, for each dataset, all per-iteration selected features, per-iteration accuracy, per-iteration error, and the feature count.

Return type:

tuple[dict[str, np.ndarray], dict[str, np.ndarray], dict[str, np.ndarray], dict[str, int]]
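
Example (illustrative sketch): the naming format above can be checked with a regular expression. The pattern and the function parse_mat_log_name are assumptions written for illustration, not the package's own parser. The sketch also assumes that an RR file is named {dataset_name}-RR.mat, which is one reading of the rule for RR logs.

```python
import re
from typing import Optional, Tuple

# {dataset_name}-{log_type}_{arbitrary_text}.mat for E and fo,
# and (assumed) {dataset_name}-RR.mat for RR.
LOG_NAME = re.compile(
    r"^(?P<dataset>.+)-(?:(?P<kind>E|fo)_(?P<extra>.*)|(?P<rr>RR))\.mat$"
)


def parse_mat_log_name(filename: str) -> Optional[Tuple[str, str]]:
    """Return (dataset_name, log_type), or None if the name does not fit."""
    match = LOG_NAME.match(filename)
    if match is None:
        return None
    return match.group("dataset"), match.group("kind") or match.group("rr")


# Hypothetical file names, used only to exercise the pattern.
examples = ["wine-E_run1.mat", "wine-fo_2024.mat", "wine-RR.mat", "wine-XX_a.mat"]
parsed = {name: parse_mat_log_name(name) for name in examples}
```

Names that do not fit the format map to None, so unrelated files in the directory can be skipped rather than misread.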

read_txt_logs(superdir: str)

Reads text logs from a directory of AHFS runs. Directory names should follow the convention set in logs.py.

Parameters:

superdir (str) – Path of a directory that contains subdirectories, each holding a different AHFS run.

Returns:

A tuple whose contents depend on the constructor flags:

If feature_dominance and acc_err are both True: dictionaries of all per-iteration selected features, per-iteration accuracy, per-iteration error, and feature count for each dataset.
If only acc_err is True: dictionaries of per-iteration accuracy, per-iteration error, and feature count for each dataset.
If only feature_dominance is True: dictionaries of all per-iteration selected features and feature count for each dataset.
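
Example (illustrative sketch): because the length of the returned tuple depends on the constructor flags, it can help to attach names to its elements before use. The helper label_txt_log_result is hypothetical and not part of the package. It only encodes the three cases listed above, and the placeholder dictionaries stand in for real log data.

```python
def label_txt_log_result(result: tuple,
                         feature_dominance: bool = True,
                         acc_err: bool = True) -> dict:
    """Hypothetical helper: name the tuple elements for the given flags."""
    if feature_dominance and acc_err:
        names = ("all_selected", "all_acc", "all_err", "feat_count")
    elif acc_err:
        names = ("all_acc", "all_err", "feat_count")
    elif feature_dominance:
        names = ("all_selected", "feat_count")
    else:
        # The documentation does not describe this combination.
        raise ValueError("no return layout is documented when both flags are False")
    if len(result) != len(names):
        raise ValueError(f"expected {len(names)} elements, got {len(result)}")
    return dict(zip(names, result))


# Placeholder values standing in for what read_txt_logs would return.
full = label_txt_log_result(({"iris": []}, {"iris": []}, {"iris": []}, {"iris": 4}))
acc_only = label_txt_log_result(({"iris": []}, {"iris": []}, {"iris": 4}),
                                feature_dominance=False)
```

The labelled result can then be passed to run by keyword, which avoids mixing up positions when a flag is switched off.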

run(feat_count: dict[str, int], save_path: str, all_selected: ndarray | None = None, all_acc: ndarray | None = None, all_err: ndarray | None = None) → None

Plots all results obtained and saves the figures to the specified path.

Parameters:

feat_count (dict[str, int]) – Number of selected features for each dataset.
save_path (str) – Directory where all figures should be saved.
all_selected (np.ndarray | None) – Dictionary of per-iteration selected features for each dataset. Required if feature_dominance is True; an error is raised if it is None. Default value is None.
all_acc (np.ndarray | None) – Dictionary of per-iteration accuracy for each dataset. Required if acc_err is True; an error is raised if it is None. Default value is None.
all_err (np.ndarray | None) – Dictionary of per-iteration loss for each dataset. Required if acc_err is True; an error is raised if it is None. Default value is None.

Returns:

None
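
Example (illustrative sketch): the rules above on which inputs are required can be checked before calling run. The function missing_run_inputs is a hypothetical pre-check written for illustration. It does not reproduce the package's own validation, and the type of error that run raises is not stated here.

```python
def missing_run_inputs(feature_dominance: bool,
                       acc_err: bool,
                       all_selected=None,
                       all_acc=None,
                       all_err=None) -> list:
    """Hypothetical pre-check: list the inputs that run would need but lack."""
    missing = []
    if feature_dominance and all_selected is None:
        missing.append("all_selected")
    if acc_err:
        if all_acc is None:
            missing.append("all_acc")
        if all_err is None:
            missing.append("all_err")
    return missing


# Placeholder input standing in for data returned by a log reader.
selected = {"iris": [[0, 2, 1]]}

gaps = missing_run_inputs(feature_dominance=True, acc_err=True,
                          all_selected=selected)

# With nothing missing, the call would then look like:
# metrics.run(feat_count, save_path, all_selected=..., all_acc=..., all_err=...)
```

In this call gaps names all_acc and all_err, because acc_err is True but neither was supplied.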
