Select Publications
Preprints
2025, MLLM-based Speech Recognition: When and How is Multimodality Beneficial?, http://dx.doi.org/10.48550/arxiv.2507.19037
2025, Improving Named Entity Transcription with Contextual LLM-based Revision, http://dx.doi.org/10.48550/arxiv.2506.10779
2025, Multi-modal Speech Transformer Decoders: When Do Multiple Modalities Improve Accuracy?, http://dx.doi.org/10.48550/arxiv.2409.09221
2024, Discriminating retinal microvascular and neuronal differences related to migraines: Deep Learning based Crossectional Study, http://dx.doi.org/10.48550/arxiv.2408.07293
2024, Discrete Multimodal Transformers with a Pretrained Large Language Model for Mixed-Supervision Speech Processing, http://dx.doi.org/10.48550/arxiv.2406.06582