
WSHOP project topics -- spring 2025/2026

  • Possibility of extending it to a master's thesis
  • Quick project
  • Linked to an international scientific project
  • Linked to a JU-internal scientific project
  • Student: FIXME
  • Namespace in the wiki: FIXME
  • The goal of the project: FIXME
  • Technology: FIXME
  • Description: FIXME
  • Links:
    • FIXME
  • Student: FIXME
  • Namespace in the wiki: FIXME
  • The goal of the project: How to model and process various (sometimes even contradictory) opinions on the same topic in knowledge graphs
  • Technology: literature studies, prototypes evaluation, RDF/Semantic Web
  • Description: Classically, knowledge bases strive to represent one universal truth (the same holds for machine learning models). Consequently, knowledge base methods usually revolve around concepts such as “consistency”, “conflict resolution”, and “lack of redundancy”. Reality is different: there are often several parallel opinions on the same facts or artifacts. For example, a painting exhibited in the Louvre may be understood quite differently by a person educated in Western European culture and by a Japanese person. On the one hand, they may refer to different cultural symbols in their interpretations; on the other hand, they may not understand something if the creator came from outside their culture, and may need additional contextual information. An analogous situation occurs in filter bubbles, where people “locked” in a bubble share a body of knowledge that may be specific to them and that one needs to know in order to understand the information they convey. The goal of the project is to study the available literature on such approaches, prepare a catalog of situations/problems in which such polyvocality can occur, and evaluate existing prototypes (if any) or prepare our own prototypes of such knowledge bases in the form of knowledge graphs.
  • Links:
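One possible way to prototype the polyvocality idea described above is to attach a perspective (source) to every statement, in the spirit of RDF named graphs, instead of forcing a single consistent value. The sketch below uses plain Python rather than an RDF library; the `Claim` type, the `conflicting_views` helper, and the sample data are hypothetical illustrations, not part of any existing tool.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Claim(NamedTuple):
    """A statement plus the perspective (community, source) that asserts it."""
    subject: str
    predicate: str
    obj: str
    perspective: str


def conflicting_views(
    claims: Iterable[Claim],
) -> Dict[Tuple[str, str], Dict[str, List[str]]]:
    """Return the facts for which different perspectives assert different values."""
    by_fact = defaultdict(lambda: defaultdict(set))
    for claim in claims:
        by_fact[(claim.subject, claim.predicate)][claim.obj].add(claim.perspective)
    return {
        fact: {value: sorted(views) for value, views in values.items()}
        for fact, values in by_fact.items()
        if len(values) > 1  # more than one distinct value: opinions differ
    }


# Hypothetical sample data: two viewers agree on the location but not on the meaning.
claims = [
    Claim("painting:1", "symbolises", "mourning", "viewer:A"),
    Claim("painting:1", "symbolises", "purity", "viewer:B"),
    Claim("painting:1", "exhibitedIn", "Louvre", "viewer:A"),
    Claim("painting:1", "exhibitedIn", "Louvre", "viewer:B"),
]
print(conflicting_views(claims))
```

Keeping both values side by side, each tagged with who holds it, is the opposite of classical conflict resolution, which would try to pick one of them.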
  • Student: Cezary Zięba, Igor Tyszer
  • Namespace in the wiki: FIXME
  • The goal of the project: Create a date unification module
  • Technology: RDF, Python is preferred
  • Description: The date in source documents or databases in the area of cultural heritage can take different forms: July 1, 1818, 05.03.1783, “Wednesday after St. Martin” of the year 1654. In the case of older documents, calendar reforms are additionally involved. The objectives of the project are twofold: (1) to determine whether there are ready-made tools / standards developed for this purpose, (2) to prepare (either from scratch or based on the solutions found) a tool for date unification and evaluate it on the dataset provided by the project supervisor.
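To make the date unification task concrete, here is a minimal sketch that maps two of the forms mentioned above to ISO 8601 strings. It rests on assumptions that the project would have to verify: `05.03.1783` is read as day.month.year, month names are English, and no calendar conversion is attempted. The function name `unify_date` is hypothetical. Feast-based dates such as “Wednesday after St. Martin” are deliberately left unresolved and return `None`.

```python
import re
from datetime import date
from typing import Optional

MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}


def unify_date(text: str) -> Optional[str]:
    """Return an ISO 8601 date string, or None if the form is not recognised."""
    text = text.strip()
    try:
        # Form "July 1, 1818": English month name, day, four-digit year.
        match = re.fullmatch(r"([A-Za-z]+)\s+(\d{1,2}),\s*(\d{4})", text)
        if match and match.group(1).lower() in MONTHS:
            month = MONTHS[match.group(1).lower()]
            return date(int(match.group(3)), month, int(match.group(2))).isoformat()
        # Form "05.03.1783": assumed to mean day.month.year.
        match = re.fullmatch(r"(\d{1,2})\.(\d{1,2})\.(\d{4})", text)
        if match:
            day, month, year = (int(part) for part in match.groups())
            return date(year, month, day).isoformat()
    except ValueError:
        # Syntactically matching but impossible dates, e.g. "31.02.1783".
        return None
    return None


for sample in ["July 1, 1818", "05.03.1783", "Wednesday after St. Martin"]:
    print(sample, "->", unify_date(sample))
```

Returning `None` instead of guessing keeps unresolved cases visible, which matters when the tool is evaluated on a real dataset.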
  • Student: Jan Zoń, FIXME
  • Namespace in the wiki: chenames
  • The goal of the project: Create an algorithm that, given a string (a person's name) and a collection of strings (names of persons), finds a given number of members of the collection that most closely resemble the given name
  • Technology: Python, Keras/PyTorch, LSTM, phonemizer (?)
  • Description: A task like this can be approached using Levenshtein distance as a measure of resemblance. However, the goal is to develop a more sophisticated solution that either accounts for the natural tendencies of languages to make specific substitutions, omissions, and extensions, or uses more advanced methods such as LSTM networks. An important part of the task will be to review state-of-the-art methods.
  • We will evaluate the solution on real-world cases of entity matching across different databases as part of the CHExRISH Flagship Project.
  • Links:
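The Levenshtein baseline mentioned in the name-matching description above can be sketched in a few lines. This is only the simple starting point, not the more sophisticated solution the project aims for; the helper names `levenshtein` and `most_similar`, the case-insensitive comparison, and the sample names are assumptions made for illustration.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def most_similar(query: str, candidates: List[str], k: int) -> List[str]:
    """Return the k candidates closest to the query, ties broken alphabetically."""
    ranked = sorted(
        candidates,
        key=lambda name: (levenshtein(query.lower(), name.lower()), name),
    )
    return ranked[:k]


# Hypothetical sample names.
names = ["Maria Curie", "Johann Kowalski", "Jan Kowalsky"]
print(most_similar("Jan Kowalski", names, k=2))
```

A learned or phonetic similarity could later replace `levenshtein` as the sort key without changing the ranking function.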
  • Student: Maciej Szymański, Dominika Głowacka
  • Namespace in the wiki: FIXME
  • The goal of the project: Create an incunabula owners catalogue and integrate it with the CHExRISH ontology
  • Technology: API programming, RDF/Semantic Web
  • Description: (1) export the metadata from the Jagiellonian Library catalogue, (2) create a knowledge graph schema according to good practices in the domain (to check as a part of the project), (3) transfer all exported metadata into such a knowledge graph, (4) validate with the domain experts, (5) repeat, if needed, (6) integrate the final graph with the CHExRISH ontology
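Steps (1) to (3) above could be prototyped roughly as follows. This is a minimal sketch, assuming the exported metadata arrive as flat key/value records; it emits N-Triples lines using only the standard library. The `example.org` namespaces, the property names, and the sample record are hypothetical placeholders, not the CHExRISH ontology or the actual Jagiellonian Library export format.

```python
from typing import Dict, List

# Hypothetical namespaces; a real schema would reuse established vocabularies.
RESOURCE_NS = "http://example.org/incunabula/"
PROPERTY_NS = "http://example.org/schema/"


def escape_literal(value: str) -> str:
    """Escape the characters that must not appear raw in an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record_id: str, metadata: Dict[str, str]) -> List[str]:
    """Turn one exported record into N-Triples lines (one per metadata field)."""
    subject = f"<{RESOURCE_NS}{record_id}>"
    return [
        f'{subject} <{PROPERTY_NS}{key}> "{escape_literal(metadata[key])}" .'
        for key in sorted(metadata)  # sorted for reproducible output
    ]


# Hypothetical sample record.
record = {"title": 'Biblia "Latina"', "owner": "Jan"}
for line in record_to_ntriples("inc-1", record):
    print(line)
```

Plain string literals are the crudest possible mapping; during schema design many of these fields (owners in particular) would likely become resources of their own.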
  • Student: Hubert Musiał, Przemysław Zagraniczny
  • Namespace in the wiki: ladies
  • The goal of the project: Create a knowledge base for the exhibition from the JU Museum
  • Technology: RDF/Semantic Web
  • Description: (1) export the data and metadata from the Jagiellonian University Museum, (2) create a knowledge graph schema according to good practices in the domain (to be identified as part of the project), keeping in mind interoperability with the CHExRISH ontology, (3) transfer all exported metadata into such a knowledge graph, (4) design a simple UI (probably built with existing tools, like Omeka S or Sante) that shows the data (artifacts), the metadata (in the knowledge graph), and possibly some free text from the exhibition catalogue, (5) validate with the domain experts, (6) repeat, if needed
  • Links:
  • Student: Jan Zioło, FIXME
  • Namespace in the wiki: FIXME
  • The goal of the project: To develop a prototype of AI-based methods/tools for reconstructing book collections from source documents
  • Technology: Any useful technology, but Python is preferred
  • Description: The method is described in detail in a PhD dissertation, which will be shared with the project team (along with access to the author of the dissertation, who will clarify all doubts and share all data). As part of the project, it is necessary to consider which steps can currently be automated and with which AI methods/techniques, and then to conduct a pilot implementation that repeats the steps performed manually so far (based on the examples in the aforementioned dissertation).
  • Student: FIXME
  • Namespace in the wiki: FIXME
  • The goal of the project: Plan, perform and evaluate selected scenarios of social network analysis in the cultural heritage domain
  • Technology: Any, but Python is preferred
  • Description: During the project, we will consider what interesting and non-trivial insights can be drawn using social network analysis methods from graphs describing cultural heritage. The project will consist of both conceptual work (literature review, brainstorming of ideas) and implementation work (prototype analyses for selected scenarios). The project will be carried out in cooperation with CHExRISH project team members and a foreign expert.
  • Links:
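As one elementary example of the kind of analysis meant in the social network analysis topic above, degree centrality ranks the nodes of a graph by the share of other nodes they are directly connected to. The sketch below is standard-library Python under simple assumptions: the graph is undirected and given as an edge list. The function name and the sample persons are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def degree_centrality(
    edges: Iterable[Tuple[Hashable, Hashable]],
) -> Dict[Hashable, float]:
    """Degree of each node divided by the number of other nodes in the graph."""
    neighbours = defaultdict(set)
    for source, target in edges:
        if source == target:
            continue  # ignore self-loops in this simple sketch
        neighbours[source].add(target)
        neighbours[target].add(source)
    node_count = len(neighbours)
    if node_count < 2:
        return {node: 0.0 for node in neighbours}
    return {
        node: len(adjacent) / (node_count - 1)
        for node, adjacent in neighbours.items()
    }


# Hypothetical correspondence network: who exchanged letters with whom.
letters = [("Anna", "Jan"), ("Anna", "Piotr"), ("Anna", "Zofia"), ("Jan", "Piotr")]
for person, score in sorted(degree_centrality(letters).items()):
    print(person, round(score, 2))
```

Non-trivial insights would more likely come from richer measures (betweenness, community structure, temporal change), but the same edge-list representation can serve as their input.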
  • Student: Kamil Mróz, Paweł Gębala, FIXME
  • Namespace in the wiki: FIXME
  • The goal of the project: Be part of the team running the BIRAFFE3 experiment
  • Technology: Python, data science
  • Description: The project has three phases. In the first phase, you will take part in final fixes during the pilot study (March-April 2025). In the second phase, you will help conduct the actual experiment (April-June 2025; scope of work to be determined). Finally, in the third phase, you will perform preliminary analyses of the data collected in the experiment (May-June 2025; scope of work to be determined). If the work is done with appropriate involvement, you could also become a co-author of a publication on BIRAFFE3 in Nature Scientific Data.
  • Links:
  • Student: Mateusz Matias, Tymoteusz Boba
  • Namespace in the wiki: VKchallenge
  • The goal of the project: Tune and prepare existing code for submission in an open challenge
  • Technology: Python, spaCy, LightGBM, scikit-learn
  • Description: “Subtask 1: Given a (potentially obfuscated) text, decide whether it was written by a human or an AI. Subtask 2: Given a document collaboratively authored by human and AI, classify the extent to which the model assisted. … Participants will submit their systems as Docker images through the Tira platform. It is not expected that submitted systems are actually trained on Tira, but they must be standalone and runnable on the platform without requiring contact to the outside world (evaluation runs will be sandboxed). The submitted software must be executable inside the container via a command line call. … Important dates: May 23, 2025.”
  • Links:
  • Student: FIXME
  • Namespace in the wiki: openmlds
  • The goal of the project: Prepare a script that will build a meta-learning dataset out of OpenML logs
  • Technology: Python, OpenML API
  • Description: The main goal of the project is to create a script that will fetch all of the runs/pipelines and datasets from the OpenML platform and create a dataset out of them. The challenge is to transform pipeline definitions, which are code snippets, into logical components of a machine-learning pipeline (including deep neural networks). Such a dataset will serve as a learn-to-learn dataset for meta-learning solutions.

  • The goal of the project: Prepare a Google Colab notebook demonstrating and comparing explanations for generative models
  • Technology: Python
  • Description: The main goal is to start from Diffusers-Interpret and test its capabilities on various datasets and diffusers. Prepare a notebook in a tutorial-like style, with code that generates explanations for selected models and datasets, and comment on the results.
  • Links:
  • Student: Wacławik Paweł
  • Namespace in the wiki: FIXME
  • The goal of the project: Creating explainable hyperparameter optimization for the process-level optimization task
  • Technology: Python, Keras/PyTorch, SHAP
  • Description: AutoML hyperparameter optimization is the process of automatically tuning the hyperparameters of machine learning models to improve performance without manual intervention. It leverages techniques like grid search, random search, and Bayesian optimization to efficiently explore the hyperparameter space. By automating this process, AutoML reduces the time and expertise needed to find optimal model configurations, making machine learning more accessible and effective. In this project we aim to test existing solutions. Our goal is to treat as hyperparameters not only the model parameters, but also additional components of the pipeline, such as the size of the dataset, the number of labelled instances, time constraints, etc.
  • Links:
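The idea from the hyperparameter optimization topic above, treating pipeline-level settings as hyperparameters next to model parameters, can be illustrated with plain random search. This is a toy sketch under stated assumptions: the search space, the parameter names, and `toy_score` are hypothetical stand-ins, and `toy_score` does not train anything. In the project it would be replaced by training and evaluating a real pipeline.

```python
import random
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical search space mixing a model parameter with pipeline-level ones.
SEARCH_SPACE: Dict[str, List[Any]] = {
    "learning_rate": [0.001, 0.01, 0.1],  # model hyperparameter
    "train_fraction": [0.25, 0.5, 1.0],   # share of the dataset used
    "time_budget_s": [10, 60],            # time constraint
}


def toy_score(config: Dict[str, Any]) -> float:
    """Stand-in objective: rewards more data, penalises bad rates and long runs."""
    return (
        config["train_fraction"]
        - 10 * abs(config["learning_rate"] - 0.01)
        - config["time_budget_s"] / 1000
    )


def random_search(
    space: Dict[str, List[Any]],
    objective: Callable[[Dict[str, Any]], float],
    n_trials: int,
    seed: int = 0,
) -> Tuple[Dict[str, Any], float]:
    """Sample n_trials random configurations and keep the best-scoring one."""
    rng = random.Random(seed)  # seeded for reproducibility
    best_config: Dict[str, Any] = {}
    best_score = float("-inf")
    for _ in range(n_trials):
        config = {name: rng.choice(values) for name, values in space.items()}
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


best_config, best_score = random_search(SEARCH_SPACE, toy_score, n_trials=20)
print(best_config, round(best_score, 3))
```

Because every trial is a plain dictionary with a score, the trial history could later be passed to an explanation method such as SHAP to see which settings drive performance.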
  • Student: Kachnic Bartłomiej, Wójcik Maciej
  • Namespace in the wiki: FIXME
  • The goal of the project: Prepare a notebook that will test a semi-supervised anomaly detection method on the MVTec dataset
  • Technology: Python, Keras/PyTorch, SHAP
  • Description: We have developed a method for semi-supervised anomaly detection in images and we need more tests performed on benchmark datasets. The main goal is to adapt the existing code to a new dataset and analyze the results.
  • Links:
    • The source code of the method to be tested will be made accessible on demand.
    • We will work with the MVTec dataset and the anomalib library.
  • Student: Chmura Jakub, Aleksandra Stępień
  • Namespace in the wiki: FIXME
  • The goal of the project: Prepare a notebook that will demonstrate the use of selected multimodal models for time-series classification or regression.
  • Technology: Python, Keras/PyTorch, SHAP
  • Description: The goal is to evaluate whether there are pretrained models for time series, similar to LLaVA, Phi-3 and others, that can link textual and time-series data in one embedding space.
  • Links: