Postprocessing, multiple comparisons

class utils.postprocessing.aggregated_metrics(feature_dominance: bool = True, acc_err: bool = True, time: bool = True)

Bases: object

__init__(feature_dominance: bool = True, acc_err: bool = True, time: bool = True)

Aggregates feature order and priority, accuracy, and error across multiple AHFS runs read from logs, and saves the results as figures.

Parameters:
  • feature_dominance (bool) – If True, calculates feature dominance and the related score. Default value is True.

  • acc_err (bool) – If True, calculates the mean accuracy and mean error. Default value is True.

  • time (bool) – If True, aggregates evaluation and total runtime. Default value is True.

compare_two(set_a: dict[str, dict | str], set_b: dict[str, dict | str], feat_count: dict[str, int], save_path: str) → None

Compares two dictionaries of AHFS evaluations, plots the differences and saves them as figures. Only feature scoring and loss values are compared.

Parameters:
  • set_a (dict[str, dict | str]) – Dictionary containing the keys all_selected, all_err, and label. The label value is a string; the all_selected and all_err values should follow the format of the corresponding variables returned by read_mat_logs or read_txt_logs.

  • set_b (dict[str, dict | str]) – See set_a. The dataset names used as keys must be identical in both set dictionaries.

  • feat_count (dict[str, int]) – Number of selected features for each dataset.

  • save_path (str) – Directory where all figures should be saved.

Returns:

None
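The requirements on set_a and set_b can be checked before calling compare_two. The following is a minimal sketch, not part of the documented API: the helper name check_comparison_sets is hypothetical, and the lists of numbers are placeholder data standing in for the values returned by read_mat_logs or read_txt_logs.

```python
from typing import Any, Dict

REQUIRED_KEYS = {"all_selected", "all_err", "label"}


def check_comparison_sets(set_a: Dict[str, Any], set_b: Dict[str, Any]) -> None:
    """Hypothetical helper: raise ValueError if the sets cannot be compared."""
    for name, data in (("set_a", set_a), ("set_b", set_b)):
        # Each set needs the three keys named in the parameter description.
        missing = REQUIRED_KEYS - data.keys()
        if missing:
            raise ValueError(f"{name} is missing keys: {sorted(missing)}")
        if not isinstance(data["label"], str):
            raise ValueError(f"{name}['label'] must be a string")
    # Dataset names must match between the two sets.
    for key in ("all_selected", "all_err"):
        if set(set_a[key]) != set(set_b[key]):
            raise ValueError(f"dataset names differ between sets for {key!r}")


# Placeholder data: plain lists stand in for the per-iteration arrays.
set_a = {
    "all_selected": {"iris": [[0, 2], [2, 0]]},
    "all_err": {"iris": [[0.30, 0.12], [0.28, 0.10]]},
    "label": "baseline",
}
set_b = {
    "all_selected": {"iris": [[2, 0], [2, 1]]},
    "all_err": {"iris": [[0.25, 0.11], [0.27, 0.09]]},
    "label": "variant",
}
check_comparison_sets(set_a, set_b)  # passes silently when the sets are compatible
```

Failing early with a clear message is easier to debug than a key error raised partway through plotting.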

read_mat_logs(superdir: str)

Reads all .mat log files in superdir. Files must be named {dataset_name}-{log_type}_{arbitrary_text}.mat. Recognized log types are E, fo, and RR. For RR logs, there should be no separator or arbitrary text between RR and the file extension.

Parameters:

superdir (str) – A directory containing all .mat log files.

Returns:

A tuple of dictionaries of all per-iteration selected features, per-iteration accuracy, per-iteration error, and feature count for each dataset.

Return type:

tuple[dict[str, np.ndarray], dict[str, np.ndarray], dict[str, np.ndarray], dict[str, int]]
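File names can be checked against the naming convention before read_mat_logs is called. The sketch below is an illustration only and is not the parser used by the library: the regular expression and the helper name parse_mat_log_name are assumptions, and it assumes that RR files are named {dataset_name}-RR.mat with nothing between RR and the extension.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: {dataset_name}-{log_type}_{arbitrary_text}.mat for E and fo,
# and {dataset_name}-RR.mat for RR (one reading of the documented convention).
_LOG_NAME = re.compile(
    r"^(?P<dataset>.+)-(?:(?P<typed>E|fo)_(?P<text>.*)|(?P<rr>RR))\.mat$"
)


def parse_mat_log_name(filename: str) -> Optional[Tuple[str, str]]:
    """Hypothetical helper: return (dataset_name, log_type) or None."""
    match = _LOG_NAME.match(filename)
    if match is None:
        return None
    # Exactly one of the two alternatives matched, so one group is set.
    log_type = match.group("typed") or match.group("rr")
    return match.group("dataset"), log_type


for name in ("iris-E_run1.mat", "iris-fo_run1.mat", "iris-RR.mat", "notes.txt"):
    print(name, "->", parse_mat_log_name(name))
```

Names that return None would be candidates for renaming or for exclusion from superdir.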

read_txt_logs(superdir: str)

Directory names should follow the convention set in logs.py.

Parameters:

superdir (str) – Path of the directory containing one subdirectory per AHFS run.

Returns:

Depends on the flags passed to the constructor:
  • If feature_dominance and acc_err are both True: a tuple containing dictionaries of all per-iteration selected features, per-iteration accuracy, per-iteration error, and feature count for each dataset.

  • If only acc_err is True: a tuple containing dictionaries of per-iteration accuracy, per-iteration error, and feature count for each dataset.

  • If only feature_dominance is True: a tuple containing dictionaries of all per-iteration selected features and feature count for each dataset.
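Because the length of the returned tuple depends on the constructor flags, it can help to attach names to its elements before using them. The following is a minimal sketch under that reading of the return description; the helper name label_txt_log_result is hypothetical, and the element names are chosen to match the parameter names of run.

```python
from typing import Any, Dict, Tuple


def label_txt_log_result(
    result: Tuple[Any, ...], feature_dominance: bool, acc_err: bool
) -> Dict[str, Any]:
    """Hypothetical helper: name the elements of a read_txt_logs result."""
    if feature_dominance and acc_err:
        names = ("all_selected", "all_acc", "all_err", "feat_count")
    elif acc_err:
        names = ("all_acc", "all_err", "feat_count")
    elif feature_dominance:
        names = ("all_selected", "feat_count")
    else:
        # The documentation does not describe this case.
        raise ValueError("return value is undocumented when both flags are False")
    if len(result) != len(names):
        raise ValueError(f"expected {len(names)} elements, got {len(result)}")
    return dict(zip(names, result))


# Placeholder tuple standing in for a result obtained with only feature_dominance set.
named = label_txt_log_result(
    ({"iris": [[0, 2], [2, 0]]}, {"iris": 4}),
    feature_dominance=True,
    acc_err=False,
)
print(sorted(named))
```

Since the keys match the parameter names of run, the resulting dictionary could presumably be passed on as keyword arguments together with save_path.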

run(feat_count: dict[str, int], save_path: str, all_selected: ndarray | None = None, all_acc: ndarray | None = None, all_err: ndarray | None = None) → None

Plots all results obtained and saves figures at the specified path.

Parameters:
  • feat_count (dict[str, int]) – Number of selected features for each dataset.

  • save_path (str) – Directory where all figures should be saved.

  • all_selected (np.ndarray | None) – Dictionary of per-iteration selected features for each dataset. If feature_dominance is True, an error is raised if this argument is None. Default value is None.

  • all_acc (np.ndarray | None) – Dictionary of per-iteration accuracy for each dataset. If acc_err is True, an error is raised if this argument is None. Default value is None.

  • all_err (np.ndarray | None) – Dictionary of per-iteration loss for each dataset. If acc_err is True, an error is raised if this argument is None. Default value is None.

Returns:

None