Experience
Applied Scientist Intern
- Proposed an efficient architecture for long-form speech recognition
- Achieved general in-context learning capabilities for speech, conditioned on provided contexts
Research Intern
- Leveraged a diffusion-based model and proposed a novel unfolding training procedure for speech enhancement tasks
- Significantly narrowed the performance gap between probabilistic diffusion models and conventional discriminative models
AIML - ASR Understanding Intern
- Embedding-Matching Acoustic-to-Word ASR
- Identified limitations of existing embedding-matching acoustic-to-word (A2W) ASR that previous studies had not pointed out
- Proposed generating multiple embeddings and using pronunciation-based embeddings, significantly improving the accuracy of embedding-matching A2W
Research Assistant
- Detection and Classification of Acoustic Scenes and Events (DCASE) 2020
- Outperformed the baseline score by a relative 19.45% on Task 4: Sound Event Detection and Separation in Domestic Environments
- Implemented models and designed a training procedure using a small labeled set (fewer than 1,600 examples) and over 10,000 unlabeled examples