Postprocessing, multiple comparisons
- class utils.postprocessing.aggregated_metrics(feature_dominance: bool = True, acc_err: bool = True, time: bool = True)
Bases: object
- __init__(feature_dominance: bool = True, acc_err: bool = True, time: bool = True)
Gathers feature order and priority, accuracy, and error over multiple AHFS runs from logs, saving results in figures.
- Parameters:
feature_dominance (bool) – If True, calculates feature dominance and the related score. Default value is True.
acc_err (bool) – If True, calculates the mean accuracy and mean error. Default value is True.
time (bool) – If True, aggregates evaluation and total runtime. Default value is True.
- compare_two(set_a: dict[str, dict | str], set_b: dict[str, dict | str], feat_count: dict[str, int], save_path: str) -> None
Compares two dictionaries of AHFS evaluations, plots the differences and saves them as figures. Only feature scoring and loss values are compared.
- Parameters:
set_a (dict[str, dict | str]) – Dictionary with the keys all_selected, all_err, and label. The label value is a string; the all_selected and all_err values should follow the format of the corresponding variables returned by read_mat_logs or read_txt_logs.
set_b (dict[str, dict | str]) – Same format as set_a. The dataset names used as keys inside both set dictionaries must be identical.
feat_count (dict[str, int]) – Number of selected features for all datasets.
save_path (str) – Directory where all figures should be saved.
- Returns:
None
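Before calling compare_two, it can help to confirm that both sets have the documented shape. The following is a minimal sketch and not part of utils.postprocessing: check_comparable is a hypothetical helper, the nested lists are placeholders for the values returned by read_mat_logs or read_txt_logs, and the requirement that feat_count covers every compared dataset is an assumption.

```python
from typing import Dict, List, Union

# Keys that the documentation requires in each set dictionary.
REQUIRED_KEYS = ("all_selected", "all_err", "label")


def check_comparable(
    set_a: Dict[str, Union[dict, str]],
    set_b: Dict[str, Union[dict, str]],
    feat_count: Dict[str, int],
) -> List[str]:
    """Hypothetical helper: return the shared dataset names or raise."""
    for name, current in (("set_a", set_a), ("set_b", set_b)):
        missing = [key for key in REQUIRED_KEYS if key not in current]
        if missing:
            raise KeyError(f"{name} is missing keys: {missing}")
        if not isinstance(current["label"], str):
            raise TypeError(f"{name}['label'] must be a string")

    # Every inner dictionary must be keyed by the same dataset names.
    datasets = set(set_a["all_selected"])
    groups = {
        "set_a['all_err']": set_a["all_err"],
        "set_b['all_selected']": set_b["all_selected"],
        "set_b['all_err']": set_b["all_err"],
    }
    for name, group in groups.items():
        if set(group) != datasets:
            raise ValueError(f"{name} does not use the same dataset names")

    # Assumption: feat_count needs an entry for every compared dataset.
    uncounted = datasets - set(feat_count)
    if uncounted:
        raise ValueError(f"feat_count has no entry for: {sorted(uncounted)}")
    return sorted(datasets)


# Placeholder values; real ones would come from read_mat_logs or read_txt_logs.
set_a = {
    "all_selected": {"iris": [[0, 2], [2, 0]], "wine": [[1, 3], [1, 4]]},
    "all_err": {"iris": [0.10, 0.12], "wine": [0.20, 0.18]},
    "label": "baseline",
}
set_b = {
    "all_selected": {"iris": [[0, 1], [0, 2]], "wine": [[1, 3], [3, 1]]},
    "all_err": {"iris": [0.11, 0.09], "wine": [0.22, 0.19]},
    "label": "variant",
}
feat_count = {"iris": 2, "wine": 2}

shared = check_comparable(set_a, set_b, feat_count)
```

If the check passes, the same three arguments plus a save_path could then be handed to compare_two.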
- read_mat_logs(superdir: str)
Reads .mat log files. File names must follow the format {dataset_name}-{log_type}_{arbitrary_text}.mat. Recognized log types are E, fo, and RR. For RR logs, no separator or arbitrary text should appear between RR and the file extension.
- Parameters:
superdir (str) – A directory containing all .mat log files.
- Returns:
A tuple of dictionaries of all per-iteration selected features, per-iteration accuracy, per-iteration error, and feature count for each dataset.
- Return type:
tuple[dict[str, np.ndarray], dict[str, np.ndarray], dict[str, np.ndarray], dict[str, int]]
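The file naming convention can be illustrated with a small parser. This is a sketch under stated assumptions, not the parsing code used by read_mat_logs: parse_mat_log_name and MatLogName are hypothetical names, and the RR rule is read here as meaning that RR files are named {dataset_name}-RR.mat.

```python
import re
from typing import NamedTuple, Optional


class MatLogName(NamedTuple):
    """Hypothetical container for the parts of a .mat log file name."""
    dataset: str
    log_type: str
    suffix: str


# E and fo logs: {dataset_name}-{log_type}_{arbitrary_text}.mat
# RR logs (assumed reading of the rule): {dataset_name}-RR.mat
_PATTERN = re.compile(
    r"^(?P<dataset>.+)-(?:(?P<kind>E|fo)_(?P<suffix>.*)|(?P<rr>RR))\.mat$"
)


def parse_mat_log_name(filename: str) -> Optional[MatLogName]:
    """Return the parsed parts, or None if the name does not match."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    if match.group("rr"):
        # RR logs carry no separator and no arbitrary text.
        return MatLogName(match.group("dataset"), "RR", "")
    return MatLogName(
        match.group("dataset"), match.group("kind"), match.group("suffix")
    )


# The file names below are made up for illustration.
examples = [
    "iris-E_run1.mat",
    "breast-cancer-fo_2021.mat",
    "iris-RR.mat",
    "iris-RR_extra.mat",
    "notes.txt",
]
parsed = {name: parse_mat_log_name(name) for name in examples}
```

Names that do not fit either pattern map to None, which makes it easy to list unexpected files in superdir before reading the logs.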
- read_txt_logs(superdir: str)
Directory names must follow the naming convention set in logs.py.
- Parameters:
superdir (str) – Path of directory which contains subdirectories, each being a different AHFS run.
- Returns:
If feature_dominance and acc_err are both True: a tuple containing dictionaries of all per-iteration selected features, per-iteration accuracy, per-iteration error, and feature count for each dataset. If only acc_err is True: a tuple containing dictionaries of per-iteration accuracy, per-iteration error, and feature count for each dataset. If only feature_dominance is True: a tuple containing dictionaries of all per-iteration selected features and feature count for each dataset.
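Because the length of the returned tuple depends on the constructor flags, positional unpacking is easy to get wrong. The sketch below attaches names to the tuple entries for the three documented cases. label_read_txt_logs_result is a hypothetical helper, the entry names are borrowed from the parameters of run, and the both-flags-False case is rejected because its return value is not documented.

```python
from typing import Any, Dict, Tuple


def label_read_txt_logs_result(
    result: Tuple[Any, ...], feature_dominance: bool, acc_err: bool
) -> Dict[str, Any]:
    """Hypothetical helper: map tuple entries to names based on the flags."""
    if feature_dominance and acc_err:
        names = ("all_selected", "all_acc", "all_err", "feat_count")
    elif acc_err:
        names = ("all_acc", "all_err", "feat_count")
    elif feature_dominance:
        names = ("all_selected", "feat_count")
    else:
        # The return value for this combination is not documented.
        raise ValueError("at least one of the two flags must be True")

    if len(result) != len(names):
        raise ValueError(
            f"expected {len(names)} entries for these flags, got {len(result)}"
        )
    return dict(zip(names, result))


# Placeholder tuple standing in for a real read_txt_logs result.
fake_result = (
    {"iris": [[0, 2], [2, 0]]},  # per-iteration selected features
    {"iris": [0.90, 0.88]},      # per-iteration accuracy
    {"iris": [0.10, 0.12]},      # per-iteration error
    {"iris": 2},                 # feature count
)
named = label_read_txt_logs_result(fake_result, feature_dominance=True, acc_err=True)
only_acc = label_read_txt_logs_result(fake_result[1:], feature_dominance=False, acc_err=True)
only_feat = label_read_txt_logs_result(
    (fake_result[0], fake_result[3]), feature_dominance=True, acc_err=False
)
```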
- run(feat_count: dict[str, int], save_path: str, all_selected: ndarray | None = None, all_acc: ndarray | None = None, all_err: ndarray | None = None) -> None
Plots all results obtained and saves figures at the specified path.
- Parameters:
feat_count (dict[str, int]) – Number of selected features for all datasets.
save_path (str) – Directory where all figures should be saved.
all_selected (np.ndarray | None) – Dictionary of per-iteration selected features for each dataset. Required if feature_dominance is True; an error is raised if it is None in that case. Default value is None.
all_acc (np.ndarray | None) – Dictionary of per-iteration accuracy for each dataset. Required if acc_err is True; an error is raised if it is None in that case. Default value is None.
all_err (np.ndarray | None) – Dictionary of per-iteration loss for each dataset. Required if acc_err is True; an error is raised if it is None in that case. Default value is None.
- Returns:
None
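The dependency between the constructor flags and the optional arguments of run can be summarized as a precondition check. This is only a sketch of the documented rule, not the code inside run: check_run_inputs is a hypothetical helper, and ValueError is an assumed exception type, since the documentation does not say which error is raised.

```python
from typing import Any, List, Optional


def check_run_inputs(
    feature_dominance: bool,
    acc_err: bool,
    all_selected: Optional[Any] = None,
    all_acc: Optional[Any] = None,
    all_err: Optional[Any] = None,
) -> List[str]:
    """Hypothetical helper: return the required argument names or raise."""
    required = []
    if feature_dominance:
        required.append(("all_selected", all_selected))
    if acc_err:
        required.append(("all_acc", all_acc))
        required.append(("all_err", all_err))

    missing = [name for name, value in required if value is None]
    if missing:
        # Assumed exception type; the documentation only says "an error".
        raise ValueError(f"missing required arguments: {missing}")
    return [name for name, _ in required]


# With only acc_err enabled, all_selected may stay None.
needed = check_run_inputs(
    feature_dominance=False,
    acc_err=True,
    all_acc={"iris": [0.90, 0.88]},  # placeholder values
    all_err={"iris": [0.10, 0.12]},  # placeholder values
)
```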