Commit ece6930

Update thesis.md with new research project details
1 parent 2b33c76 commit ece6930

1 file changed: +7 -1 lines changed

_pages/thesis.md

Lines changed: 7 additions & 1 deletion
@@ -79,7 +79,13 @@ Reach out if you have questions, using the email above.
 V1: NLP for Dialects, Low-resource and Multilinguality
 </div>
 <div class="accordion-content">
-{{ "### **Selected Research Projects**
+{{ "
+This research vector covers methods and resources for processing dialectal, low-resource, and multilingual language data. It focuses on improving robustness, fairness, and coverage of NLP systems across languages and varieties, including cross-lingual transfer and data-efficient learning.
+### **Thesis projects**
+- *Computational Dialectology.* Language usage often differs based on sociodemographic background, where linguistic differences based on the geographical origin of the speaker are typically studied in the field of dialectology. While qualitative studies into dialectal differences have yielded valuable insights into language variation, such studies often rely on labor-intensive data collection, annotation, and analysis. As such, computational approaches to dialect differences have emerged as a possible method towards the large-scale study of dialects. For students interested in this project, multiple directions are possible, including (but not limited to): (a) interpretability of what features dialect models rely on for differentiation, (b) creation of (parallel) resources for dialect continua, (c) development of new methods to quantify dialectal or sociolinguistic variation, (d) adapting existing models to better accommodate dialect variation.
+References: [Bartelds & Wieling 2022](https://aclanthology.org/2022.naacl-main.273/), Bafna et al. 2025, Shim et al. 2026.
+Level: BSc or MSc.
+
 - *NLP for dialectal/non-standard language data*. You are welcome to propose
 thesis projects related to processing dialectal or non-standard language data.
 Selected example projects are given below. Please include the following information
