WSHOP project topics -- winter 2022

[KKT] Loki on triplestore

Student:
Namespace in the wiki: FIXME
The goal of the project: Remove Loki's dependency on SWI-Prolog. Review current graph database engines (triplestores), select the most promising one, and move the whole knowledge base to the selected triplestore.
Technology: PHP, Semantic Web
Description: Semantic wiki Loki is a DokuWiki system with a set of plugins for embedding knowledge in wiki pages, from which it can then be easily extracted and processed. Currently, all knowledge processing is done with plain files and SWI-Prolog. This is an interesting approach but not an optimal one, as many triplestores (dedicated graph database engines) are available.
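
To make the target data model concrete, the sketch below keeps knowledge as (subject, predicate, object) triples and answers wildcard pattern queries, which is the basic operation a triplestore offers. It is a toy in-memory illustration written in Python rather than the project's PHP; the class `TinyTripleStore`, the sample facts and the predicate names are hypothetical and are not taken from Loki.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TinyTripleStore:
    """Toy in-memory store of (subject, predicate, object) triples."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, s: str, p: str, o: str) -> None:
        self._triples.add((s, p, o))

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> Iterator[Triple]:
        """Yield stored triples matching the pattern; None is a wildcard."""
        for triple in sorted(self._triples):
            if all(q is None or q == v for q, v in zip((s, p, o), triple)):
                yield triple


# Hypothetical facts, as they might be extracted from wiki pages.
store = TinyTripleStore()
store.add("wiki:Krakow", "rdf:type", "wiki:City")
store.add("wiki:Krakow", "wiki:locatedIn", "wiki:Poland")
store.add("wiki:Warsaw", "rdf:type", "wiki:City")

# "Which subjects are cities?" expressed as a triple pattern.
cities = [s for s, _, _ in store.match(p="rdf:type", o="wiki:City")]
```

A real triplestore adds persistence, indexing and a query language such as SPARQL on top of this same pattern-matching idea.
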
Links:

[KKT] How do semantic catalogues of manuscript collections work?

Student:
Namespace in the wiki: FIXME
The goal of the project: Exploration of existing semantic catalogues of manuscript collections
Technology: Semantic Web (RDF, SPARQL, ontologies), scientific papers reading/mail writing
Description: Prepare a survey of the systems used for the storage and management of manuscripts (use the links below as a starting point). Then compare the systems by presenting their content, assumptions and purpose, the semantic model underpinning each project, and the practical possibilities for inferring, filtering and displaying collection items.
The details of a catalogue will probably not be available directly on its website, so you will need to look for scientific papers describing the systems; if these still do not provide an answer, contact the people responsible for the projects.
Links:
- Kalliope [DE]: https://kalliope.staatsbibliothek-berlin.de/en/index.html
- Fibula [PL]: http://info.filg.uj.edu.pl/fibula/en
- Goethe- und Schiller-Archiv [DE]: https://ores.klassik-stiftung.de/ords/f?p=401:1::::::
- Editionenportal [DE; still beta]: http://editionenportal.de/
- Wittgenstein Ontology Explorer [NO]: http://wab.uib.no/sfb/
- Wittgenstein Source [NO]: http://www.wittgensteinsource.org/

[KKT] FACE APIs comparison

Student: Kacper Panczykowski
Namespace in the wiki: faceapis
The goal of the project: Comparison of the effectiveness of off-the-shelf APIs for emotion recognition in non-trivial images
Technology: Any scripting/programming language (to call the APIs)
Description: Facial expression recognition tools are trained and evaluated on benchmark datasets that contain many expressions produced 'on request' and photographed en face. This does not match reality, where expressions are not so strong and the face is not always facing the camera. The project should: (a) identify a catalogue of situations that may frequently arise when interacting with camera-based systems (e.g., tilt/turn of the head, varying quality and resolution of the image), (b) prepare a database of images showing facial expressions in a natural (non-forced) way in different situations, (c) identify existing APIs for recognising emotions from facial expressions, (d) evaluate the found APIs on the prepared set, (e) summarise the results by indicating the strengths and weaknesses (supported situations) of each API.
Links:
- AffectNet: https://paperswithcode.com/dataset/affectnet (a benchmark dataset)
- Our previous work is briefly summarized in the paper: Evaluation of Selected APIs for Emotion Recognition from Facial Expressions
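
Steps (d) and (e) amount to scoring every API separately for each situation. A minimal sketch of that bookkeeping is given below, assuming each API returns a single emotion label per image; the API names, situation names and records are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One record per (API, image): api name, situation, true label, predicted label.
Record = Tuple[str, str, str, str]


def accuracy_by_situation(records: List[Record]) -> Dict[Tuple[str, str], float]:
    """Return the accuracy for every (api, situation) pair in the records."""
    hits: Dict[Tuple[str, str], int] = defaultdict(int)
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for api, situation, truth, predicted in records:
        totals[(api, situation)] += 1
        hits[(api, situation)] += int(truth == predicted)
    return {key: hits[key] / totals[key] for key in totals}


# Hypothetical evaluation results for two APIs in two situations.
records = [
    ("api_a", "frontal", "happy", "happy"),
    ("api_a", "frontal", "sad", "sad"),
    ("api_a", "head_turned", "happy", "neutral"),
    ("api_a", "head_turned", "sad", "sad"),
    ("api_b", "frontal", "happy", "happy"),
    ("api_b", "frontal", "sad", "neutral"),
    ("api_b", "head_turned", "happy", "happy"),
    ("api_b", "head_turned", "sad", "sad"),
]
scores = accuracy_by_situation(records)
```

Reading the scores per situation rather than as one global number is what exposes which situations each API supports.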

[KKT] Meta-classification for FACE APIs

Student: Łukasz Wójcik
Namespace in the wiki: metafaceapis
The goal of the project: Preparation of a meta-classifier for the results returned by face recognition APIs
Technology: Python, machine learning
Description: There are off-the-shelf tools for recognising emotions from facial images (e.g. provided by Microsoft as part of Azure Cognitive Services). Unfortunately, because they are trained on benchmark datasets, they do not work properly 'in the real world', where no one sitting in front of a computer has such a broad smile or such a definite expression of sadness. The aim of the project is to explore whether these existing models/services can be made more effective through appropriate pre-processing of the input or post-processing of the output.
Links:
- BIRAFFE2 Dataset: https://doi.org/10.5281/zenodo.3865859
- Dataset description: https://www.nature.com/articles/s41597-022-01402-6
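
One simple form of post-processing is to combine the labels of several APIs, trusting each API in proportion to its accuracy on a labelled validation set. The sketch below shows such weighted voting as one possible baseline, not as the intended solution; the API names and labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def api_weights(validation: Dict[str, List[str]], truth: List[str]) -> Dict[str, float]:
    """Weight each API by its accuracy on a labelled validation set."""
    return {
        api: sum(p == t for p, t in zip(preds, truth)) / len(truth)
        for api, preds in validation.items()
    }


def weighted_vote(predictions: Dict[str, str], weights: Dict[str, float]) -> str:
    """Combine one label per API into a single label by weighted voting."""
    scores: Dict[str, float] = defaultdict(float)
    for api, label in predictions.items():
        scores[label] += weights[api]
    # Ties are broken alphabetically to keep the result deterministic.
    return min(scores, key=lambda label: (-scores[label], label))


# Hypothetical validation labels and API outputs.
truth = ["happy", "sad", "neutral", "happy"]
validation = {
    "api_a": ["happy", "sad", "neutral", "happy"],
    "api_b": ["happy", "neutral", "neutral", "neutral"],
    "api_c": ["neutral", "neutral", "neutral", "neutral"],
}
weights = api_weights(validation, truth)

# The most reliable API outvotes the two weaker ones.
final = weighted_vote({"api_a": "sad", "api_b": "neutral", "api_c": "neutral"}, weights)
```

A trained meta-classifier would replace the fixed voting rule with a model that takes the API outputs as input features.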

[SBK] Local uncertain explanations on Images

Student:
Namespace in the wiki: FIXME
The goal of the project: Add image modality to the explanation mechanism based on LUX
Technology: Python, Explainable AI, Deep Neural Networks
Description: Explainable AI aims to provide human-readable explanations of model decisions (see more: Interpretable Machine Learning). Local Uncertain Explanations (LUX) is a mechanism that generates rules that explain model decisions. It currently works only for tabular data, and the goal of the project is to extend it to the image modality. In particular, the following milestones are expected to be accomplished:
- Use Zennit-CRP or Grad-CAM to obtain concepts from an image that will serve as the conditional parts of a rule (you can use the VGG dataset that is given in the tutorial)
- Extract these concepts from images and, based on the concept IDs, train LUX to explain different classes
- Present the explanations in a visual rule-based form that may allow biases in the dataset to be spotted (such as an error in predicting a garbage truck based on a stop sign that most often appeared in the background)

Links:
- LUX: https://github.com/sbobek/lux
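
Since LUX works on tabular data, one way to feed it image concepts is to turn the concept IDs found in each image into a binary presence table. The sketch below shows only that conversion and makes no calls to LUX, Zennit-CRP or Grad-CAM; the image names and concept IDs are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def concepts_to_table(image_concepts: Dict[str, Set[int]]) -> Tuple[List[int], List[List[int]]]:
    """Return (sorted concept ids, one 0/1 row per image in insertion order)."""
    columns = sorted(set().union(*image_concepts.values()))
    rows = [[int(c in found) for c in columns] for found in image_concepts.values()]
    return columns, rows


# Hypothetical concept ids detected in three images.
detected = {
    "img_001": {3, 17},
    "img_002": {17, 42},
    "img_003": {3},
}
columns, rows = concepts_to_table(detected)
```

Each column then plays the role of a tabular feature, so a rule condition such as "concept 17 is present" becomes an ordinary condition on a 0/1 column.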

[SBK] Lux on Time-series

Student:
Namespace in the wiki: FIXME
The goal of the project: Add time-series modality to the explanation mechanism based on LUX
Technology: Python, Explainable AI, Deep Neural Networks, Anomaly Detection
Description: The goal of the project is to explain a deep neural classifier in a rule-based manner, similarly to the previous project. We will be working with real data from a scientific project that our team is working on (see XPM). The following milestones will be required to accomplish the project:
- Use the autoencoder delivered by our team (later in the semester, use the autoencoder developed by another student team) to detect anomalies (the dataset that we will be working with: Metro de Porto)
- Build a classifier (based on a DNN) to predict the anomalies tagged by the AE
- Use Grad-CAM to explain the classifier (obtain segments that are meaningful for detecting an anomaly)
- Train LUX on the segments defined by the Grad-CAM heat-maps

Links:
- LUX: https://github.com/sbobek/lux
- Dataset: Metro de Porto
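
The step from heat-maps to segments can be sketched as thresholding a one-dimensional relevance curve and keeping the contiguous runs above the threshold. The code below assumes the heat-map is already available as a list of per-time-step values; the function name, the sample values and the threshold are illustrative choices, not part of Grad-CAM or LUX.

```python
from typing import List, Tuple


def salient_segments(heat: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, where heat >= threshold."""
    segments = []
    start = None
    for i, value in enumerate(heat):
        if value >= threshold and start is None:
            start = i  # a new salient run begins here
        elif value < threshold and start is not None:
            segments.append((start, i))  # the run ended just before i
            start = None
    if start is not None:
        segments.append((start, len(heat)))  # run reaches the end of the series
    return segments


# Hypothetical heat-map over eight time steps.
heat = [0.1, 0.2, 0.8, 0.9, 0.3, 0.1, 0.7, 0.6]
segments = salient_segments(heat, threshold=0.5)
```

Summary statistics computed over each returned segment could then serve as the tabular features on which LUX is trained.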

[SBK] Lux comparison

Student:
Namespace in the wiki: FIXME
The goal of the project: Comparison of LUX with respect to various XAI metrics
Technology: Python, Explainable AI
Description: The goal of the project is to provide a comparative study of the performance of LUX with respect to other explainers in terms of stability, consistency, etc. The following milestone is required to accomplish the project:
- Use InXAI, or another framework that allows the quality of explanations to be measured, and perform a study with multiple variations of hyperparameters on LUX, Anchor, LORE, SHAP, LIME, and other methods that can be easily run from: Survey on XAI. You can start from: this notebook

Links:
- LUX: https://github.com/sbobek/lux
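
As a rough intuition for a stability-style metric, the sketch below measures how similar the feature-importance vector of an instance is to the vectors obtained for its close neighbours. This is a simplified illustration under that assumption and is not the definition used by InXAI; the importance values are made up.

```python
import math
from typing import List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def stability(reference: List[float], neighbours: List[List[float]]) -> float:
    """Mean similarity between one explanation and those of nearby instances."""
    return sum(cosine(reference, n) for n in neighbours) / len(neighbours)


# Hypothetical importances: one neighbour agrees fully, the other not at all.
score = stability([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

A value close to 1 would suggest that similar inputs receive similar explanations; comparing explainers means computing the same score for each of them on the same instances.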

[SBK] DeepProbLog counterfactual explanations

Student: Adrian Domagała
Namespace in the wiki: problogcounterfactuals
The goal of the project: Create counterfactual explanations of deep neural networks using probabilistic declarative programming
Technology: DeepProbLog, Python
Description: The goal of the project is to create counterfactual explanations, i.e. answers to the question: what do I need to change in the input instance to change the prediction of the classifier to the desired output? For instance, if the system predicts that I cannot get a loan, I want to know what I should do (how to change my customer profile) to get a loan. There are several methods for that, but they are most often “blind” searchers. Therefore, you may get a suggestion to change your gender or your date of birth. To prevent that, we want to control the actions that the counterfactual search can perform in order to find the best one. The goal is to use probabilistic programming, where we can define a meta-procedure for generating counterfactuals and allow the system to induce the concrete procedure itself (similar to inductive programming).

Links:
- DeepProbLog
- Dataset: Let's start with a simple one (like Titanic, MNIST, etc.)
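
The difference between a blind search and a search restricted to allowed actions can be illustrated with the loan example. The sketch below is plain Python brute force, not DeepProbLog: the classifier `approve_loan`, the profile and the action lists are hypothetical stand-ins, and features without listed actions (gender, birth year) can never be changed.

```python
from itertools import product
from typing import Callable, Dict, List, Optional


def approve_loan(profile: Dict) -> bool:
    """Hypothetical stand-in for a trained classifier."""
    return profile["income"] >= 5000 and profile["debts"] <= 1


def find_counterfactual(profile: Dict, actions: Dict[str, List],
                        classifier: Callable[[Dict], bool]) -> Optional[Dict]:
    """Search only over allowed feature values; return the change set that
    flips the decision while modifying the fewest features."""
    names = list(actions)
    best = None
    for values in product(*(actions[n] for n in names)):
        candidate = dict(profile, **dict(zip(names, values)))
        if not classifier(candidate):
            continue
        cost = sum(candidate[n] != profile[n] for n in names)
        if best is None or cost < best[0]:
            best = (cost, candidate)
    return None if best is None else best[1]


profile = {"income": 4000, "debts": 1, "gender": "F", "birth_year": 1990}
# Only income and debts are actionable.
actions = {"income": [4000, 5000, 6000], "debts": [0, 1]}
counterfactual = find_counterfactual(profile, actions, approve_loan)
```

In the project, the hand-written action lists would be replaced by a meta-procedure from which the system induces which changes are admissible.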

[SBK] OpenML dataset creation script for Meta-Learning

Student:
Namespace in the wiki: FIXME
The goal of the project: Prepare a script that will build a meta-learning dataset out of OpenML logs
Technology: Python, OpenML API
Description: The main goal of the project is to create a script that will fetch all of the runs/pipelines and datasets from the OpenML platform and create a dataset out of them. The challenge is to transform pipeline definitions, which are code snippets, into logical components of a machine-learning pipeline (including deep neural networks). Such a dataset will serve as a learn-to-learn dataset for meta-learning solutions.

Links:
- OpenML
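
One possible starting point for turning a code snippet into logical components is to parse it with Python's standard `ast` module and list the callables it constructs, without ever executing the snippet. The sketch below assumes the pipeline definition is a Python expression; the sample snippet is made up and the OpenML API is not called.

```python
import ast
from typing import List


def pipeline_components(snippet: str) -> List[str]:
    """Return the names of all callables invoked in the snippet, in source order."""
    calls = [n for n in ast.walk(ast.parse(snippet)) if isinstance(n, ast.Call)]
    calls.sort(key=lambda n: (n.lineno, n.col_offset))
    names = []
    for call in calls:
        func = call.func
        if isinstance(func, ast.Name):         # e.g. StandardScaler()
            names.append(func.id)
        elif isinstance(func, ast.Attribute):  # e.g. preprocessing.StandardScaler()
            names.append(func.attr)
    return names


# Hypothetical pipeline definition; it is parsed, never executed.
snippet = (
    "Pipeline([('scale', StandardScaler()), "
    "('clf', RandomForestClassifier(n_estimators=100))])"
)
components = pipeline_components(snippet)
```

Mapping the extracted names onto categories such as preprocessing or classifier, and reading hyperparameters from the call keywords, would be the next steps.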

[SBK] Autoencoders for anomaly detection

Student: Illia Andrieiev
Namespace in the wiki: anomalydetection
The goal of the project: Build an auto-encoder for time series that will successfully detect anomalies
Technology: Python, PyTorch, TSAI
Description: The goal of the project is to build an auto-encoder and test it on three real datasets. The aim is to detect failures in the machinery by analyzing the reconstruction error of the auto-encoder (when a failure occurs, the reconstruction error is high due to the abnormal characteristics of the failure/anomaly).
Links:
- Dataset: Metro de Porto
- More datasets available later (non-public resources)
- Tutorial: Google colab and video: https://youtu.be/yOZftfaPI84
- More resources on data-driven predictive maintenance:
  - Lecture 1, by Moamar Sayed Mouchaweh: https://www.youtube.com/watch?v=P9w6PZlRRNw
  - Lecture 2, by Joao Gama:
    - https://youtu.be/cW_4vrP5JyE (18m)
    - https://youtu.be/9I5H7YkC5Kk (42m)
    - https://youtu.be/rCeg4fg6iqI (30m)
  - Lecture 2, part one by Rita P. Ribeiro:
    - https://youtu.be/BkXwbsJ-IIo
  - and part two by Slawomir Nowaczyk:
    - https://youtu.be/0VER09dTvMs
  - Lecture 4 by Adrien Bécue:
    - https://youtu.be/SQt0Y5mh0JA
  - Lecture 5 by Olga Fink:
    - https://youtu.be/EwQe1GTVk7Q
  - Lecture 6, part one by Slawomir Nowaczyk:
    - https://youtu.be/UEwsKo2GDKU
    - https://youtu.be/4nLRtf7tewM
  - and part two by Sepideh Pashami:
    - https://youtu.be/sDMWgMY3zMI
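
The detection step based on reconstruction error can be sketched independently of the auto-encoder itself: fit a threshold on errors measured on normal data and flag everything above it. The rule "mean plus k standard deviations" and all numbers below are assumptions chosen for illustration; the errors would in practice come from the trained model.

```python
from statistics import mean, stdev
from typing import List


def fit_threshold(normal_errors: List[float], k: float = 3.0) -> float:
    """Threshold = mean + k * standard deviation of errors on normal data."""
    return mean(normal_errors) + k * stdev(normal_errors)


def flag_anomalies(errors: List[float], threshold: float) -> List[int]:
    """Return the indices of windows whose error exceeds the threshold."""
    return [i for i, e in enumerate(errors) if e > threshold]


# Hypothetical reconstruction errors on failure-free data.
normal_errors = [0.10, 0.12, 0.09, 0.11, 0.10, 0.08]
threshold = fit_threshold(normal_errors)

# Hypothetical errors on new data; two windows reconstruct poorly.
anomalies = flag_anomalies([0.10, 0.11, 0.95, 0.09, 0.60], threshold)
```

The choice of k trades missed failures against false alarms and is worth tuning separately on each dataset.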

[SBK] InXAI for images

Student:
Namespace in the wiki: FIXME
The goal of the project: Contribute to the InXAI framework by extending it with image modality
Technology: Python
Description: The goal of the project is to extend the InXAI framework to include the image modality. In particular, the milestones required to accomplish the project are as follows:
- Implement a segmentation mechanism for images
- Implement a perturbation mechanism for segmented images (e.g. replacing segments with the average value of pixels, etc.)
- Obtain LIME/SHAP importance for the segments of the image
- Calculate metrics with InXAI (this part is already implemented, but needs to be tested with Segmenters and Perturbers for images)

Links:
- InXAI
- You can start with: some-not-entirely-working-preliminary-work
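
The first two milestones can be sketched on a small grayscale image stored as nested lists. The code below uses a regular grid as the simplest possible segmentation and replaces one segment with the mean of its own pixels; both choices and the function names are assumptions for illustration and do not reflect the InXAI interfaces.

```python
from typing import List


def grid_segments(height: int, width: int, cell: int) -> List[List[int]]:
    """Assign every pixel a segment id on a regular grid of cell x cell blocks."""
    per_row = (width + cell - 1) // cell  # number of blocks in one grid row
    return [[(r // cell) * per_row + (c // cell) for c in range(width)]
            for r in range(height)]


def perturb_segment(image: List[List[float]], segments: List[List[int]],
                    segment_id: int) -> List[List[float]]:
    """Return a copy of the image with one segment set to its mean pixel value."""
    pixels = [image[r][c]
              for r, row in enumerate(segments)
              for c, s in enumerate(row) if s == segment_id]
    fill = sum(pixels) / len(pixels)
    return [[fill if segments[r][c] == segment_id else image[r][c]
             for c in range(len(image[0]))]
            for r in range(len(image))]


# Hypothetical 2 x 4 grayscale image split into two 2 x 2 segments.
image = [[0, 2, 10, 10],
         [4, 6, 10, 10]]
segments = grid_segments(height=2, width=4, cell=2)
perturbed = perturb_segment(image, segments, segment_id=0)
```

Comparing the model output on `image` and on `perturbed` indicates how much the model relies on the perturbed segment.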

[JOB] Automatic phonetic transcription

Student: Sandra Rudnicka
Namespace in the wiki: phonetics
The goal of the project: Produce a language-agnostic or multilingual model for narrow phonetic transcription of human speech into the International Phonetic Alphabet
Technology: Python, Wav2Vec2
Description: Find labelled datasets (phonetically transcribed speech; segmented or not), unify the phonetic transcription, (segment speech into phones), check/train/improve existing models, implement retraining for expanded IPA symbol sets, collect and analyse historical and modern population-scale speech samples.
Links:
- papers introducing the model: Wav2Vec2, Wav2Vec2Phoneme
- competing (but unmaintained) project: Persephone
- datasets: TIMIT, dictionaries (licence agreements are being negotiated)
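
Checking existing models requires a score for comparing a predicted phone sequence with a reference one. A common choice is the phone error rate, i.e. the edit distance divided by the reference length; the sketch below also applies Unicode NFC normalisation first, as one small part of unifying transcriptions. The sample sequences are made up, and no speech model is involved.

```python
import unicodedata
from typing import List


def normalise_ipa(phones: List[str]) -> List[str]:
    """Put every phone into Unicode NFC so equal symbols compare equal."""
    return [unicodedata.normalize("NFC", p) for p in phones]


def phone_error_rate(reference: List[str], hypothesis: List[str]) -> float:
    """Levenshtein distance between phone sequences / reference length."""
    ref, hyp = normalise_ipa(reference), normalise_ipa(hypothesis)
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, h in enumerate(hyp, start=1):
            current.append(min(previous[j] + 1,              # deletion
                               current[j - 1] + 1,           # insertion
                               previous[j - 1] + (r != h)))  # substitution
        previous = current
    return previous[-1] / len(ref)


# Hypothetical reference and model output: one substituted vowel.
per = phone_error_rate(["k", "æ", "t"], ["k", "a", "t"])
```

Without normalisation, a phone written with a combining diacritic and the same phone written as a precomposed character would be counted as an error.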

[KKT] Template

Student:
Namespace in the wiki: FIXME
The goal of the project:
Technology:
Description:
Links:
