Select Publications
Book Chapters
, 2023, 'Convolutional neural networks and architectures', in Handbook of Face Recognition, pp. 37 - 65, http://dx.doi.org/10.1007/978-3-031-43567-6_2
Journal articles
, 2025, 'Aligning Speech to Languages to Enhance Code-Switching Speech Recognition', IEEE Transactions on Audio Speech and Language Processing, 33, pp. 4712 - 4725, http://dx.doi.org/10.1109/TASLPRO.2025.3629290
, 2025, 'Selective State Space Model for Monaural Speech Enhancement', IEEE Transactions on Consumer Electronics, 71, pp. 5414 - 5424, http://dx.doi.org/10.1109/TCE.2024.3523297
, 2025, 'Mamba in Speech: Towards an Alternative to Self-Attention', IEEE Transactions on Audio Speech and Language Processing, 33, pp. 1933 - 1948, http://dx.doi.org/10.1109/TASLPRO.2025.3566210
, 2024, 'Context-aware code generation with synchronous bidirectional decoder', Journal of Systems and Software, 214, http://dx.doi.org/10.1016/j.jss.2024.112066
, 2024, 'Exploring Recurrent Long-Term Temporal Fusion for Multi-View 3D Perception', IEEE Robotics and Automation Letters, 9, pp. 6544 - 6551, http://dx.doi.org/10.1109/LRA.2024.3401172
, 2024, 'Chain-of-Thought in Neural Code Generation: From and for Lightweight Language Models', IEEE Transactions on Software Engineering, 50, pp. 2437 - 2457, http://dx.doi.org/10.1109/TSE.2024.3440503
, 2024, 'CodeScore-R: An automated robustness metric for assessing the functional correctness of code synthesis', Jisuanji Yanjiu Yu Fazhan/Computer Research and Development, 61, pp. 291 - 306, http://dx.doi.org/10.7544/issn1000-1239.202330715
, 2023, 'A syntax-guided multi-task learning approach for Turducken-style code generation', Empirical Software Engineering, 28, http://dx.doi.org/10.1007/s10664-023-10372-1
, 2023, 'Solve High-Dimensional Reflected Partial Differential Equations by Neural Network Method', Mathematical and Computational Applications, 28, http://dx.doi.org/10.3390/mca28040079
, 2023, 'ExploitGen: Template-augmented exploit code generation based on CodeBERT', Journal of Systems and Software, 197, http://dx.doi.org/10.1016/j.jss.2022.111577
, 2023, 'Scale-Aware Automatic Augmentations for Object Detection With Dynamic Training', IEEE Transactions on Pattern Analysis and Machine Intelligence, 45, pp. 2367 - 2383, http://dx.doi.org/10.1109/TPAMI.2022.3166905
, 2023, 'Twin-S: a digital twin for skull base surgery', International Journal of Computer Assisted Radiology and Surgery, 18, pp. 1077 - 1084, http://dx.doi.org/10.1007/s11548-023-02863-9
, 2022, 'PointINS: Point-Based Instance Segmentation', IEEE Transactions on Pattern Analysis and Machine Intelligence, 44, pp. 6377 - 6392, http://dx.doi.org/10.1109/TPAMI.2021.3085295
, 2022, 'Weight-Dependent Gates for Network Pruning', IEEE Transactions on Circuits and Systems for Video Technology, 32, pp. 6941 - 6954, http://dx.doi.org/10.1109/TCSVT.2022.3175762
, 2021, 'Joint Multi-Dimension Pruning via Numerical Gradient Update', IEEE Transactions on Image Processing, 30, pp. 8034 - 8045, http://dx.doi.org/10.1109/TIP.2021.3112041
, 2017, 'Object detection networks on convolutional feature maps', IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, pp. 1476 - 1481, http://dx.doi.org/10.1109/TPAMI.2016.2601099
, 2016, 'Accelerating Very Deep Convolutional Networks for Classification and Detection', IEEE Transactions on Pattern Analysis and Machine Intelligence, 38, pp. 1943 - 1955, http://dx.doi.org/10.1109/TPAMI.2015.2502579
, 2015, 'Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition', IEEE Transactions on Pattern Analysis and Machine Intelligence, 37, pp. 1904 - 1916, http://dx.doi.org/10.1109/TPAMI.2015.2389824
Conference Papers
, 2025, 'Language Prompt for Autonomous Driving', in Proceedings of the AAAI Conference on Artificial Intelligence, pp. 8359 - 8367, http://dx.doi.org/10.1609/aaai.v39i8.32902
, 2025, 'SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control', in Proceedings of the AAAI Conference on Artificial Intelligence, pp. 3617 - 3625, http://dx.doi.org/10.1609/aaai.v39i4.32376
, 2025, 'Auto-Landmark: Acoustic Landmark Dataset and Open-Source Toolkit for Landmark Extraction', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 4263 - 4267, http://dx.doi.org/10.21437/Interspeech.2025-17
, 2025, 'Beyond Sequences: Two-dimensional Representation and Dependency Encoding for Code Generation', in Proceedings of the Annual Meeting of the Association for Computational Linguistics, pp. 6157 - 6172
, 2025, 'Defending LLMs Against Jailbreak Prompts Through Key Information Protection and Selective Compression', in IEEE International Conference on Software Quality Reliability and Security QRS, pp. 58 - 67, http://dx.doi.org/10.1109/QRS65678.2025.00017
, 2025, 'GLAD: A Streaming Scene Generator for Autonomous Driving', in 13th International Conference on Learning Representations ICLR 2025, pp. 101163 - 101180
, 2025, 'Merlin: Empowering Multimodal LLMs with Foresight Minds', in Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, pp. 425 - 443, http://dx.doi.org/10.1007/978-3-031-73235-5_24
, 2025, 'Multi-Class Dementia Detection Using Acoustic Features - ICASSP-2025 PROCESS Challenge', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, http://dx.doi.org/10.1109/ICASSP49660.2025.10889847
, 2025, 'Reconstructive Visual Instruction Tuning', in 13th International Conference on Learning Representations ICLR 2025, pp. 15001 - 15026
, 2025, 'Rethinking Mamba in Speech Processing by Self-Supervised Models', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, http://dx.doi.org/10.1109/ICASSP49660.2025.10889111
, 2025, 'Stream Query Denoising for Vectorized HD-Map Construction', in Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, pp. 203 - 220, http://dx.doi.org/10.1007/978-3-031-72655-2_12
, 2025, 'Vary: Scaling up the Vision Vocabulary for Large Vision-Language Model', in Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, pp. 408 - 424, http://dx.doi.org/10.1007/978-3-031-73235-5_23
, 2025, 'SpeechT-RAG: Reliable Depression Detection in LLMs with Retrieval-Augmented Generation Using Speech Timing Information', in Findings of the Association for Computational Linguistics: ACL 2025, Association for Computational Linguistics, pp. 10019 - 10030, http://dx.doi.org/10.18653/v1/2025.findings-acl.521
, 2024, 'OneChart: Purify the Chart Structural Extraction via One Auxiliary Token', in MM 2024 Proceedings of the 32nd ACM International Conference on Multimedia, pp. 147 - 155, http://dx.doi.org/10.1145/3664647.3681167
, 2024, 'Self-Supervised Visual Preference Alignment', in MM 2024 Proceedings of the 32nd ACM International Conference on Multimedia, pp. 291 - 300, http://dx.doi.org/10.1145/3664647.3680993
, 2024, 'Unidirectional Brain-Computer Interface: Artificial Neural Network Encoding Natural Images to fMRI Response in the Visual Cortex', in 2024 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2024, IEEE, Seoul, South Korea, pp. 1851 - 1855, presented at 49th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Seoul, South Korea, 14 April 2024 - 19 April 2024, http://dx.doi.org/10.1109/ICASSP48485.2024.10446366
, 2024, 'Compound Text-Guided Prompt Tuning via Image-Adaptive Cues', in Proceedings of the AAAI Conference on Artificial Intelligence, pp. 5061 - 5069, http://dx.doi.org/10.1609/aaai.v38i5.28311
, 2024, 'DDAE: Towards Deep Dynamic Vision BERT Pretraining', in Proceedings of the AAAI Conference on Artificial Intelligence, pp. 1037 - 1045, http://dx.doi.org/10.1609/aaai.v38i2.27864
, 2024, 'Far3D: Expanding the Horizon for Surround-View 3D Object Detection', in Proceedings of the AAAI Conference on Artificial Intelligence, pp. 2561 - 2569, http://dx.doi.org/10.1609/aaai.v38i3.28033
, 2024, 'Binaural Selective Attention Model for Target Speaker Extraction', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 4323 - 4327, http://dx.doi.org/10.21437/Interspeech.2024-683
, 2024, 'ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning', in IJCAI International Joint Conference on Artificial Intelligence, pp. 1743 - 1752
, 2024, 'DreamLLM: Synergistic Multimodal Comprehension and Creation', in 12th International Conference on Learning Representations ICLR 2024
, 2024, 'Enhancing Code-Switching Speech Recognition with Interactive Language Biases', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp. 10886 - 10890, http://dx.doi.org/10.1109/ICASSP48485.2024.10448335
, 2024, 'Panacea: Panoramic and Controllable Video Generation for Autonomous Driving', in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 6902 - 6912, http://dx.doi.org/10.1109/CVPR52733.2024.00659
, 2024, 'Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model', in EMNLP 2024: 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference, pp. 159 - 171, http://dx.doi.org/10.18653/v1/2024.emnlp-main.9
, 2024, 'Striking a Balance between Classical and Deep Learning Approaches in Natural Language Processing Pedagogy', in TeachNLP 2024: 6th Workshop on Teaching NLP, Proceedings of the Workshop, pp. 23 - 32
, 2024, 'When LLMs Meet Acoustic Landmarks: An Efficient Approach to Integrate Speech into Large Language Models for Depression Detection', in EMNLP 2024: 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference, pp. 146 - 158, http://dx.doi.org/10.18653/v1/2024.emnlp-main.8
, 2023, 'A New Approach to Extract Fetal Electrocardiogram Using Affine Combination of Adaptive Filters', in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp. 1 - 5, presented at ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 04 June 2023 - 10 June 2023, http://dx.doi.org/10.1109/icassp49357.2023.10095885
, 2023, 'Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining', in Proceedings of Machine Learning Research, pp. 28223 - 28243
, 2023, 'Cross Modal Transformer: Towards Fast and Robust 3D Object Detection', in Proceedings of the IEEE International Conference on Computer Vision, pp. 18222 - 18232, http://dx.doi.org/10.1109/ICCV51070.2023.01675
, 2023, 'Differentiable Architecture Search with Random Features', in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 16060 - 16069, http://dx.doi.org/10.1109/CVPR52729.2023.01541