--- courses:wshop:topics:tematy2023zima [2023/10/06 10:08] – [[JOB] Template] jeremi
+++ courses:wshop:topics:tematy2023zima [2023/10/13 12:37] (current) – [[KKT] Support in BIRAFFE3 experiment] kkt
@@ Line 3: / Line 3: @@
 ==== [KKT] FACE APIs comparison - follow-up study ====
-  * **Student:** FIXME
+  * **Student:** Michał Przysucha
-  * **Namespace in the wiki:** [[..:projects:2023:FIXME:]]
+  * **Namespace in the wiki:** [[..:projects:2023:faceapis-follow:]]
   * **The goal of the project:** Comparison of the effectiveness of off-the-shelf APIs and pre-trained models for emotion recognition in non-trivial images
   * **Technology:** Python, data analysis
@@ Line 15: / Line 15: @@
 ==== [KKT] Affect changes as probability blobs (with RegFlow) in Valence x Arousal space ====
-  * **Student:** FIXME
+  * **Student:** Konrad Micek
-  * **Namespace in the wiki:** [[..:projects:2023:FIXME:]]
+  * **Namespace in the wiki:** [[..:projects:2023:regflow:]]
   * **The goal of the project:** Adapt the RegFlow method to a 2-D emotion prediction task
   * **Technology:** Python, data analysis
@@ Line 37: / Line 37: @@
 ==== [KKT] Support in BIRAFFE3 experiment ====
-  * **Student:** FIXME
+  * **Student:** Honorata Zych
-  * **Namespace in the wiki:** [[..:projects:2023:FIXME:]]
+  * **Namespace in the wiki:** [[..:projects:2023:bir3support:]]
   * **The goal of the project:** Support the preparation of the BIRAFFE3 experiment (pilot data analysis), then assist with the actual experiment
   * **Technology:** Python, data analysis
@@ Line 47: / Line 47: @@
 ==== [KKT] Emotion recognition for everyday life - evaluation of the state-of-the-art ====
-  * **Student:** FIXME
+  * **Student:** Anastasiya Yurenia
-  * **Namespace in the wiki:** [[..:projects:2023:FIXME:]]
+  * **Namespace in the wiki:** [[..:projects:2023:emognition:]]
   * **The goal of the project:** Replicate and evaluate methods and tools proposed for emotion recognition by [[https://emognition.com/|Emognition]] team from PWr
   * **Technology:** reading :), Python, data analysis, machine learning
@@ Line 59: / Line 59: @@
-==== [KKT] Loki on the triplestore ====
+==== [KKT] Loki on a triplestore ====
-  * **Student:** FIXME
+  * **Student:** Dominik Tyszownicki
-  * **Namespace in the wiki:** [[..:projects:2023:FIXME:]]
+  * **Namespace in the wiki:** [[..:projects:2023:loki:]]
   * **The goal of the project:** Move Loki away from SWI-Prolog. Review current graph database engines (triplestores), select the most promising one and migrate all the knowledge to the selected triplestore.
   * **Technology:** PHP, Semantic Web
@@ Line 195: / Line 195: @@
-==== [JKO] Template ====
+==== [JKO] Universal speech recognition with differentiable allophone graphs ====
   * **Student:** FIXME
-  * **Namespace in the wiki:** [[..:projects:2023:FIXME:]]
+  * **Namespace in the wiki:** [[..:projects:2023:UPhoRMPS:]]
-  * **The goal of the project:** FIXME
+  * **The goal of the project:** Implement, in the k2 framework, a model that learns phone-to-phoneme mappings.
-  * **Technology:** FIXME
+  * **Technology:** Python, [[https://github.com/k2-fsa/k2|k2]] (Automatic Speech Recognition)
-  * **Description:** FIXME
+  * **Description:** The ultimate goal (larger than this project) is to create a speech recognition (SR) model for low-resource languages. In this project, we would like to implement a method for learning the phonetic structure of a language (the allowed co-occurrences of variants of speech sounds). In speech processing, such constraints have commonly been encoded via weighted finite-state transducers (WFSTs). Recently, people realised that WFSTs can be treated as differentiable and so can be incorporated into neural SR models, which is exactly the purpose k2 serves. The goal is to learn this ASR framework, reimplement the model by Yan et al. in it, and start experimenting with it on various datasets that we have access to (such as accented English, Polish dialects or languages you haven't even heard of).
   * **Links:**
-    * FIXME
+    * [[https://www.isca-speech.org/archive/interspeech_2021/yan21b_interspeech.html|Yan, B., Dalmia, S., Mortensen, D. R., Metze, F. & Watanabe, S. Differentiable Allophone Graphs for Language-Universal Speech Recognition. in Interspeech 2021 2471–2475 (ISCA, 2021).]]
+    * [[https://www.isca-speech.org/archive/interspeech_2022/laptev22_interspeech.html|Laptev, A., Majumdar, S. & Ginsburg, B. CTC Variations Through New WFST Topologies. in Interspeech 2022 1041–1045 (ISCA, 2022).]]
-==== [JOB] Template ====
+==== [JKO] Domain Adversarial Training for speech model tuning ====
   * **Student:** FIXME
-  * **Namespace in the wiki:** [[..:projects:2023:FIXME:]]
+  * **Namespace in the wiki:** [[..:projects:2023:UPhoRMPS:]]
-  * **The goal of the project:** FIXME
+  * **The goal of the project:** Implement Domain Adversarial Training (DAT) in the k2 framework.
-  * **Technology:** FIXME
+  * **Technology:** Python, [[https://github.com/k2-fsa/k2|k2]] (for Automatic Speech Processing)
-  * **Description:** FIXME
+  * **Description:** The ultimate goal (larger than this project) is to create a speech recognition (SR) model for low-resource languages. k2 is a novel framework that reformulates the training of neural SR models with weighted finite-state transducers (WFSTs). In this project, we would like to incorporate DAT (a neural network training scheme that learns to disregard differences between some predefined domains) into this framework in order to make SR models more robust to highly differing source and target data (e.g., accents). The implementation should then be tested thoroughly on the Speech Accent Archive dataset that we have obtained.
   * **Links:**
-    * FIXME
+    * [[https://arxiv.org/abs/1806.02786|Sun, S., Yeh, C.-F., Hwang, M.-Y., Ostendorf, M. & Xie, L. Domain Adversarial Training for Accented Speech Recognition. in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 4854–4858 (2018).]]
+    * [[https://accent.gmu.edu/|The speech accent archive]]
+==== [JKO] Physics informed neural networks (PINN) for object tracking in sports ====
+  * **Student:** FIXME
+  * **Namespace in the wiki:** [[..:projects:2023:FIXME:]]
+  * **The goal of the project:** Review the literature and adapt a chosen PINN to single-object tracking.
+  * **Technology:** Python, Julia
+  * **Description:** The ultimate goal (larger than this project) is to create a model that infers the movement of table tennis players from single-camera videos in order to analyse their gameplay. One has to merge (a) pose estimation models (including detailed hand position estimation) with (b) ball trajectory tracking, and inform (a) with the physical parameters inferred from (b). The challenges include a low sampling rate (only a couple of frames per ball shot), blur, and camera angles that make depth estimation hard. There are a number of existing implementations of neural networks that explicitly incorporate physical equations/quantities. The goal is to find a suitable one, scrape a small amount of data (we can actually record high-quality data ourselves), and try it out. There may be both Julia and Python alternatives to consider.
+  * **Links (as a starting point):**
+    * [[https://arxiv.org/abs/1907.07587|A Differentiable Programming System to Bridge Machine Learning and Scientific Computing.]]
+    * [[https://arxiv.org/abs/2211.07377|Physics-Guided, Physics-Informed, and Physics-Encoded Neural Networks in Scientific Computing]]
 ==== [FIXME] Template ====