My primary research interests lie in vision-language models. I investigate different approaches to designing such models, with an emphasis on the following three aspects:
- Capability: design robust, generalizable vision-language models that are data-adaptive and task-versatile
- Efficiency: engineer computationally efficient vision-language models with good performance tradeoffs
- Interpretability: develop transparent vision-language models that explain their decision-making process
This is my personal website:
https://yusen-peng.github.io
Additionally, for more details about my research, you can find my CV here:
Yusen_Peng_CV.pdf
You can also find my current unofficial transcript here (I have maintained a 4.0 GPA so far):
Transcript.pdf