About Me

Hi All! I’m Anna Seo Gyeong Choi. My Korean name is 최서경, but you can call me Anna!

I’m currently a 4th-year Ph.D. candidate in Information Science at Cornell University, working with Professor Allison Koenecke. I started my Ph.D. in the Linguistics Department, where I mostly worked with Professor Mats Rooth and Professor Sam Tilsen, and transferred to Information Science at the end of my 2nd year. I still keep ties with Linguistics by pursuing a minor degree. I’m also currently affiliated with the University of Pennsylvania as a Visiting Scholar at the Linguistic Data Consortium, working with Professor Sunghye Cho.

I work at the intersection of AI Ethics and Automatic Speech Recognition (ASR), with interests in Human-Computer Interaction, specifically accessibility for populations with atypical speech. I have two main research interests: (1) auditing Speech-to-Text algorithms and (2) using voice as a biomarker in AI for Healthcare. For the first, I work with diverse speech datasets to test ASR models for algorithmic fairness and to critique evaluation methods, from both a technical perspective and a social and legal one. For the second, I work with speech data from people with health impairments and analyze its acoustic features in an attempt to use voice as a biomarker. I focus on assessing the severity of a speaker’s condition, especially for neurodegenerative disorders such as Alzheimer’s Disease, Schizophrenia, and Aphasia, among many others. Combining the two, I’m interested in increasing accessibility for these populations, for example by working on transcription output styles.

<aside> 💡 2025 Summer: I will be working as a Speech Science Intern with Rev!

</aside>

<aside> 💡 2024 Summer: I did a summer internship at NAVER Cloud in Seoul, South Korea. I worked with the NSpeech team on both Speech-to-Text products and research. I was mainly involved in projects on pronunciation modeling and Korean dialect modeling for the NAVER CLOVA Speech product, and on impaired speech diagnostic modeling for the NAVER CLOVA CareCall product. I also worked on improving the in-house speech recognition model’s performance on dialectal speech. I mainly worked with toolkits including Kaldi, PyTorch, Tensor, etc. My main programming language was Python.

</aside>

Publications

(*,† denote equal contribution)

(All proceedings publications were peer-reviewed as full papers)

📄 Fairness of Automatic Speech Recognition: Looking Through a Philosophical Lens

Choi, A. S. G., & Choi, H.

(Forthcoming) Proceedings of AIES 2025

📄 Comparative Evaluation of Acoustic Feature Extraction Tools for Clinical Speech Analysis

Choi, A. S. G., Richardson, A., Partlan, R., Tang, S. X., & Cho, S.

(Forthcoming) Proceedings of Interspeech 2025

→ paper here

📄 Reasoning-Based Approach with Chain-of-Thought for Alzheimer’s Detection Using Speech and Large Language Models

Park, C., Choi, A. S. G., Cho, S., & Kim, C.