Zhuoyan (Terry) Tao

USC · Applied Mathematics & Computer Science


Undergraduate researcher working in speech processing and the mathematics of computation. Drawn to problems where signal meets meaning. ✌︎


Kualoa Ranch · Oahu, Hawaii

About Me

Speech Processing · Audio ML · NLP

I'm Zhuoyan Tao — Terry to friends and colleagues. I study Computer Science and Mathematics at USC, with research spanning spoken language understanding, multilingual speech, and audio generation.

My work sits at the boundary of signal processing, machine learning, and linguistics — I'm drawn to the question of how meaning is carried in speech beyond just words.

In August 2026, I'll be joining CMU's Master of Intelligent Information Systems (MIIS) program at the Language Technologies Institute, part of the School of Computer Science. Outside of research, I enjoy exploring new places. ☀︎

Education: USC · B.S. CS & Math; incoming MIIS at CMU LTI (2026–)

Honors: Trustee Scholar (full-tuition merit scholarship), USC

Research: Speech processing & audio ML

Publishes as: Zhuoyan Tao

Research

USC SAIL Lab · IEEE SPL under review

Multilingual Diarization & Code-Switching

Spoken language diarization for Hindi, English, and Malayalam in code-switching conversations. Built a synthetic data pipeline combining rule-based and LLM-assisted methods for language change-point detection.

CMU WavLab · Interspeech 2026

Pragmatic Intent in Speech Translation

Leading modeling for the Interspeech 2026 speech-to-speech translation (S2ST) challenge, building HuBERT- and prosody-based systems that preserve stance, emotion, and dialog function in English–Spanish speech translation.

CMU WavLab · First-author in progress

Unified Spoken Language Evaluation

A chain-based multi-metric framework for evaluating spoken language quality — covering MOS, intelligibility, language correctness, and sub-utterance granularity in a single unified system.

CMU WavLab · ICML 2026

Non-Human Singing Voice Synthesis

3rd-author contribution to non-human singing synthesis. Also contributed to ESPnet: ~50× faster speaker-embedding extraction, new TTS recipes, and a Tacotron-1 CBHG fix.
