Yu Zhang AaronZ345

Hi there 👋

I am a PhD candidate at the College of Computer Science and Technology, Zhejiang University (浙江大学计算机学院).

I work on the Audio Research Team at Zhejiang University, under the supervision of Prof. Zhou Zhao (赵洲). Previously, I graduated from Chu Kochen Honors College, Zhejiang University (浙江大学竺可桢学院), with dual bachelor's degrees in Computer Science and Automation. I have also served as a visiting scholar at the University of Rochester with Prof. Zhiyao Duan and at the University of Massachusetts Amherst with Prof. Przemyslaw Grabowicz.

My research focuses primarily on Multi-Modal Generative AI, specifically Spatial Audio, Music, Singing, and Speech. I have published first-author papers at top international AI conferences, including NeurIPS, ACL, AAAI, and EMNLP. Currently, I am working on spatial audio generation with multimodal prompts and on streaming voice conversion.

I am actively seeking research collaborations. Please feel free to contact me via email at [email protected].

📎 Homepages

Personal Pages: https://aaronz345.github.io (updated recently🔥)
LinkedIn: www.linkedin.com/in/yuzhang34
Google Scholar: https://scholar.google.com/citations?user=kA9A6LsAAAAJ
DBLP: https://dblp.org/pid/50/671-126.html

💻 Research Papers

\* denotes co-first authors

💬 Speech Synthesis

Preprint MegaTTS 3: Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis, Ziyue Jiang, Yi Ren, Ruiqi Li, Shengpeng Ji, Zhenhui Ye, Chen Zhang, Bai Jionghao, Xiaoda Yang, Jialong Zuo, Yu Zhang, et al.
