
Programming Assignment I

Select one of the assignments below. Assignments marked with can be extended as a continuation in Programming Assignment II. Projects marked with can (with some additional work and depending on the results) be published as scientific papers.

The main goal is to use the tobii-pytracker software and the MVTec dataset to design and conduct a pilot study on how XAI methods and humans “analyze” images in the task of anomaly detection. Note: you can use the eye-tracking device emulation that is implemented in the tobii-pytracker software, but also try real hardware if you want.

In particular, this assumes:

  1. Obtain the MVTec dataset and plug it into tobii-pytracker.
  2. Train an anomaly detector using the PatchCore algorithm: see how to do this – it's a one-liner.
  3. Get human eye-tracking results and compare them with the heatmaps that can be obtained from anomalib. Calculate similarities.
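For step 3, a minimal similarity sketch, assuming both the human fixation map and the anomalib heatmap are available as 2-D arrays of the same shape (the names `human` and `model` are illustrative, not part of either tool's API). Pearson correlation of the raw maps plus IoU of the thresholded regions is a simple starting point:

```python
import numpy as np

def heatmap_similarity(human, model, thresh=0.5):
    """Compare a human fixation heatmap with a model anomaly heatmap.
    Returns (Pearson correlation of the raw maps,
             IoU of the regions above `thresh` of each map's maximum)."""
    corr = np.corrcoef(human.ravel(), model.ravel())[0, 1]
    hb = human >= thresh * human.max()
    mb = model >= thresh * model.max()
    union = np.logical_or(hb, mb).sum()
    iou = np.logical_and(hb, mb).sum() / union if union else 0.0
    return corr, iou

# Toy example: two slightly shifted Gaussian blobs
yy, xx = np.mgrid[0:32, 0:32]
human = np.exp(-((xx - 10) ** 2 + (yy - 10) ** 2) / 30.0)
model = np.exp(-((xx - 12) ** 2 + (yy - 11) ** 2) / 30.0)
corr, iou = heatmap_similarity(human, model)
```

More established saliency metrics (NSS, KL divergence, AUC-Judd) can be swapped in once the pipeline runs.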

Additional extensions:

  1. Use more models than just PatchCore, and model-agnostic XAI methods like SHAP
  2. Implement a CustomModel for tobii-pytracker that will create bounding boxes or convex hulls for the detected anomalies at runtime (see the CustomModel section).
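For the bounding-box extension, the geometric part can be prototyped independently of tobii-pytracker's CustomModel interface (whose exact API is not shown here). A possible numpy-only sketch that derives one box per 4-connected anomalous region of a binary mask:

```python
import numpy as np
from collections import deque

def anomaly_bounding_boxes(mask):
    """Return one (rmin, cmin, rmax, cmax) box per 4-connected
    anomalous region in a binary mask."""
    mask = np.asarray(mask, dtype=bool)
    seen = np.zeros_like(mask)
    rows, cols = mask.shape
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r, c] and not seen[r, c]:
                # BFS over one connected component, tracking its extent
                q = deque([(r, c)])
                seen[r, c] = True
                rmin = rmax = r
                cmin = cmax = c
                while q:
                    y, x = q.popleft()
                    rmin, rmax = min(rmin, y), max(rmax, y)
                    cmin, cmax = min(cmin, x), max(cmax, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            q.append((ny, nx))
                boxes.append((rmin, cmin, rmax, cmax))
    return boxes

mask = np.zeros((8, 8), dtype=bool)
mask[1:3, 1:4] = True   # first anomaly
mask[5:7, 5:7] = True   # second anomaly
boxes = anomaly_bounding_boxes(mask)
```

The same per-component pixel lists could be fed to a convex-hull routine instead of a box.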

The goal is to take the ACFX software and test it both from the perspective of the online app (the Streamlit app) and the SDK.

Any issue, bug, or inconsistency with the documentation should be reported to the issue tracker.

Possible extensions to a paper/master thesis: how to apply these methods to cases/models other than tabular, for instance graph neural networks or images?

Testing the WinCLIP model for explainable anomaly detection. See if the model can be used to obtain textual explanations of detected anomalies.

Find ready-to-use frameworks that allow measuring predictive multiplicity – the extent to which two or more models with the same predictive accuracy can provide conflicting predictions in areas not covered by the training set.

You may start by looking at this paper and its references: RashomonGB: Analyzing the Rashomon Effect and Mitigating Predictive Multiplicity in Gradient Boosting
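Before reaching for a framework, the core quantity is easy to state. A toy numpy sketch (the two threshold "models" are invented for illustration): two classifiers with identical training accuracy that disagree completely on a region the training set does not cover.

```python
import numpy as np

def discrepancy(preds_a, preds_b):
    """Fraction of points on which two classifiers disagree."""
    return float(np.mean(preds_a != preds_b))

rng = np.random.default_rng(0)
# 1-D training data: class = sign of x, with a gap around 0
x_train = np.concatenate([rng.uniform(-2, -1, 50), rng.uniform(1, 2, 50)])
y_train = (x_train > 0).astype(int)

# Two threshold "models" with identical (perfect) training accuracy
model_a = lambda x: (x > -0.9).astype(int)
model_b = lambda x: (x > 0.9).astype(int)
acc_a = np.mean(model_a(x_train) == y_train)
acc_b = np.mean(model_b(x_train) == y_train)

# Inside the uncovered gap the two models give conflicting predictions
x_gap = rng.uniform(-0.9, 0.9, 1000)
gap_disc = discrepancy(model_a(x_gap), model_b(x_gap))
```

Frameworks for the Rashomon effect generalize this pairwise discrepancy to whole sets of near-optimal models.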

How can LNNs be used to obtain explanations of their predictions? For instance, how can the activated predicates be extracted to output the rule that resulted in the final decision?
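A library-free toy sketch of the idea (not the API of any actual LNN implementation): given per-predicate truth values produced by the network, keep those above the truth threshold and join them into the rule behind the decision.

```python
def extract_rule(predicates, activations, threshold=0.5, conclusion="anomaly"):
    """Keep the predicates whose truth value clears the threshold and
    join them into a human-readable rule for the final decision."""
    fired = [p for p, a in zip(predicates, activations) if a >= threshold]
    return " AND ".join(fired) + " -> " + conclusion

predicates = ["temp_high", "pressure_low", "valve_open"]
activations = [0.9, 0.2, 0.7]   # hypothetical truth values from the network
rule = extract_rule(predicates, activations)
```

In a real LNN the threshold comes from the network's truth semantics rather than a fixed 0.5.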

ProtoTSNet is an inherently interpretable DNN for time-series classification. Inspect its variant whose goal is to guide the training procedure with manually created prototypes.

What could be done to improve its predictive power?

Try to implement a TSProto variant that also works for images: use segmentation to detect segments, cluster the segments to detect prototypes, and build a decision tree that explains a decision using these visual prototypes.

Try to run the TSProto tutorials in Colab (this will require some code tweaking/requirements adjustments to work with numpy 2.0.0). Once done, create a pull request to include the changes in the current implementation of TSProto.

Benchmark 3 models on 3 datasets (e.g., Breast Cancer, Adult, Heart Disease), explain the best model using SHAP, LIME, and feature importance, and generate counterfactual explanations using one of these frameworks (ACFX, DICE, CLUE, CFNOW).

Extra: Evaluate the counterfactual explanations.

Train a classifier on a tabular dataset (e.g., Breast Cancer, Adult, Heart Disease), generate counterfactual explanations for selected instances without any constraints using one of these frameworks (ACFX, DICE, CLUE, CFNOW), then regenerate counterfactuals while forbidding changes to sensitive or immutable features (e.g., age, gender, race, marital status, education). Compare both sets of explanations in a before/after table. Describe how the constraints impact the plausibility, fairness, and actionability of the results. Example: “A bank cannot change a customer’s age or marital status, but can suggest financial improvements.”

Extra: Quantify how many counterfactuals become invalid or infeasible when constraints are applied.
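The constrained-versus-unconstrained comparison can be illustrated without any of the frameworks above. A toy greedy search against a linear classifier (the search strategy and all names are illustrative, not how ACFX/DICE/CLUE/CFNOW work internally):

```python
import numpy as np

def counterfactual(x, weights, bias, immutable=(), step=0.1, max_iter=200):
    """Greedy counterfactual search for a linear scorer w.x + b:
    repeatedly nudge the most influential *mutable* feature in the
    direction that raises the score, until the class flips."""
    x = np.array(x, dtype=float)          # work on a copy
    frozen = set(immutable)
    mutable = [i for i in range(len(x)) if i not in frozen]
    for _ in range(max_iter):
        if x @ weights + bias > 0:        # flipped to the positive class
            return x
        i = max(mutable, key=lambda j: abs(weights[j]))
        x[i] += step * np.sign(weights[i])
    return None                           # no counterfactual found

w = np.array([0.5, 2.0, -1.0])   # feature 1 is the most influential
b = -1.0
x0 = np.array([1.0, 0.0, 0.0])   # initial score = -0.5 (negative class)

cf_free = counterfactual(x0, w, b)                        # anything may change
cf_constrained = counterfactual(x0, w, b, immutable=[1])  # feature 1 frozen
```

With feature 1 frozen, the search must make a larger change to feature 2 instead; comparing `cf_free` and `cf_constrained` is exactly the before/after table the assignment asks for, and a `None` result is one way a counterfactual becomes infeasible under constraints.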

Train a model on a dataset with correlated features (e.g., Breast Cancer, Adult, Heart Disease). Visualize the correlation structure, then generate explanations using two XAI methods (e.g., SHAP and LIME). Identify at least one case where the model attributes importance to a feature that is likely acting as a proxy for another correlated variable. Show before/after visualizations and briefly explain whether the explanation is genuinely meaningful or simply an artifact of feature correlation.

Extra: Propose and implement one strategy to reduce misleading explanations (e.g., feature grouping, PCA, domain-level feature merging…).
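A minimal numpy illustration of the proxy effect, using univariate correlation as a stand-in for a model-based importance score (SHAP/LIME would be used in the actual assignment): the target depends only on `cause`, yet the near-duplicate `proxy` receives essentially the same importance.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
cause = rng.normal(size=n)                  # the true driver of the target
proxy = cause + 0.05 * rng.normal(size=n)   # near-duplicate of `cause`
y = 3.0 * cause + 0.1 * rng.normal(size=n)  # y depends ONLY on `cause`

X = np.column_stack([cause, proxy])

# Correlation structure: the two features are almost interchangeable
corr = np.corrcoef(cause, proxy)[0, 1]

# Stand-in importance: absolute correlation of each feature with y.
# The proxy scores almost as high as the real cause.
marginal_importance = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(2)]
```

Whether an attribution like this is "genuinely meaningful" is exactly the question the assignment asks: the proxy is predictive, but not causal.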

  • Last modified: 3 days ago
  • by admin