Commit 1270d71

Update thesis.md
1 parent 32da09c commit 1270d71

1 file changed: +1 -1 lines changed

_pages/thesis.md

Lines changed: 1 addition & 1 deletion
@@ -465,7 +465,7 @@ lower scope in the domain of ATS are also possible).
 **Level: BSc or MSc.**

 - *Analyzing shortcut learning in VLMs across NLI and visual entailment:* Vision-language models (VLMs) achieve strong performance on many tasks, yet they can exhibit shortcut learning, where predictions rely on simple input patterns rather than on a full use of the available evidence. For LLMs, this behavior has been observed in NLI, which asks whether a hypothesis follows from a given premise. Prior work has shown that models can often solve NLI by relying mainly on cues in the hypothesis, without fully capturing the relationship between the premise and the hypothesis ([Poliak et al., 2018](https://aclanthology.org/S18-2023/), [Yuan et al., 2024](https://aclanthology.org/2024.emnlp-main.679/)). In visual entailment, the premise is an image rather than a text ([Xie et al., 2019](https://arxiv.org/abs/1901.06706), [Kayser et al., 2021](https://openaccess.thecvf.com/content/ICCV2021/papers/Kayser_E-ViL_A_Dataset_and_Benchmark_for_Natural_Language_Explanations_in_ICCV_2021_paper.pdf)). The goal of this project is to investigate whether similar shortcut behavior occurs in vision-language models when performing visual entailment, and to analyze which visual and textual information models rely on when making inferences. The scope can be adjusted for BSc or MSc, for example by varying the number of models, prompting strategies, or the depth and types of cues analyzed.
-**Level: MSc.**
+**Level: BSc or MSc.**

 - ~~*Data Mining and LLM-as-a-Judge to better understand LLM behavior:*~~ While the behavior of LLMs and their nuanced and complex output data is challenging to evaluate, data mining approaches can be leveraged to explain model behavior, to bring structure into evaluation and to gain new insights, e.g. on cultural biases or task failure [1]. In this thesis project, we want to take this approach further by evaluating the use of newly proposed data mining algorithms and/or the combination of LLM-as-a-Judge with data mining processes. The project offers the possibility to work on a technical evaluation of methods as well as develop and evaluate a new method. **References:** [1] [https://aclanthology.org/2025.acl-long.985/](https://aclanthology.org/2025.acl-long.985/)
 **Level: MSc.**
