Select Publications
Books
2005, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): Preface
2004, Preface
2000, Loop Tiling for Parallelism, Kluwer Academic Publishers, Boston
Book Chapters
2021, 'Automatic Synthesis of Data-Flow Analyzers', in Static Analysis, pp. 453 - 478, http://dx.doi.org/10.1007/978-3-030-88806-0_22
2021, 'Selective Context-Sensitivity for k-CFA with CFL-Reachability', in Static Analysis, pp. 261 - 285, http://dx.doi.org/10.1007/978-3-030-88806-0_13
2006, 'Code tiling: one size fits all', in Yang L; Guo M (ed.), High-performance computing: paradigm and infrastructure, Wiley & Sons, USA, pp. 219 - 240
2005, 'Code tiling: One size fits all', in High-Performance Computing: Paradigm and Infrastructure, pp. 219 - 240, http://dx.doi.org/10.1002/0471732710.ch11
1994, 'Adapting a sequential algorithm for a systolic design', in Transformational Approaches to Systolic Design, Chapman & Hall, pp. 179 - 204, https://www.amazon.com/Transformational-Approaches-Systolic-Distributed-Computing/dp/0412448300?ie=UTF8&*Version*=1&*entries*=0
1991, 'Specifying control signals for one-dimensional systolic arrays by uniform recurrence equations', in Algorithms and Parallel VLSI Architectures II, Elsevier, pp. 181 - 187, https://books.google.com.au/books/about/Algorithms_and_parallel_VLSI_architectur.html?id=nZJQAAAAMAAJ&redir_esc=y
Journal articles
2024, 'Boosting the Performance of Alias-Aware IFDS Analysis with CFL-Based Environment Transformers', Proceedings of the ACM on Programming Languages, 8, http://dx.doi.org/10.1145/3689804
2024, 'TIPS: Tracking Integer-Pointer Value Flows for C++ Member Function Pointers', Proceedings of the ACM on Software Engineering, 1, pp. 1609 - 1631, http://dx.doi.org/10.1145/3660779
2024, 'Iterative-Epoch Online Cycle Elimination for Context-Free Language Reachability', Proceedings of the ACM on Programming Languages, 8, http://dx.doi.org/10.1145/3649862
2024, 'A Smart Status Based Monitoring Algorithm for the Dynamic Analysis of Memory Safety', ACM Transactions on Software Engineering and Methodology, 33, http://dx.doi.org/10.1145/3637227
2024, 'Pearl: A Multi-Derivation Approach to Efficient CFL-Reachability Solving', IEEE Transactions on Software Engineering, 50, pp. 2379 - 2397, http://dx.doi.org/10.1109/TSE.2024.3437684
2023, 'Automatic Target Description File Generation', Journal of Computer Science and Technology, 38, pp. 1339 - 1355, http://dx.doi.org/10.1007/s11390-022-1919-x
2023, 'A Container-Usage-Pattern-Based Context Debloating Approach for Object-Sensitive Pointer Analysis', Proceedings of the ACM on Programming Languages, 7, http://dx.doi.org/10.1145/3622832
2023, 'Effective Stack Wear Leveling for NVM', IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42, pp. 3250 - 3263, http://dx.doi.org/10.1109/TCAD.2023.3240873
2023, 'VTensor: Using Virtual Tensors to Build a Layout-Oblivious AI Programming Framework', Journal of Computer Science and Technology, 38, pp. 1074 - 1097, http://dx.doi.org/10.1007/s11390-022-1457-6
2023, 'IFDS-based Context Debloating for Object-Sensitive Pointer Analysis', ACM Transactions on Software Engineering and Methodology, 32, http://dx.doi.org/10.1145/3579641
2023, 'A Source-Level Instrumentation Framework for the Dynamic Analysis of Memory Safety', IEEE Transactions on Software Engineering, 49, pp. 2107 - 2127, http://dx.doi.org/10.1109/TSE.2022.3210580
2023, 'Selecting Context-Sensitivity Modularly for Accelerating Object-Sensitive Pointer Analysis', IEEE Transactions on Software Engineering, 49, pp. 719 - 742, http://dx.doi.org/10.1109/TSE.2022.3162236
2022, 'A Flexible Yet Efficient DNN Pruning Approach for Crossbar-Based Processing-in-Memory Architectures', IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 41, pp. 3745 - 3756, http://dx.doi.org/10.1109/TCAD.2022.3197510
2022, 'ReaDy: A ReRAM-Based Processing-in-Memory Accelerator for Dynamic Graph Convolutional Networks', IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 41, pp. 3567 - 3578, http://dx.doi.org/10.1109/TCAD.2022.3199152
2022, 'Practical Software-Based Shadow Stacks on x86-64', ACM Transactions on Architecture and Code Optimization, 19, http://dx.doi.org/10.1145/3556977
2022, 'Buddy Stacks: Protecting Return Addresses with Efficient Thread-Local Storage and Runtime Re-Randomization', ACM Transactions on Software Engineering and Methodology, 31, pp. 1 - 37, http://dx.doi.org/10.1145/3494516
2022, 'Optimizing deep neural networks on intelligent edge accelerators via flexible-rate filter pruning', Journal of Systems Architecture, 124, http://dx.doi.org/10.1016/j.sysarc.2022.102431
2022, 'CloudRaid: Detecting Distributed Concurrency Bugs via Log Mining and Enhancement', IEEE Transactions on Software Engineering, 48, pp. 662 - 677, http://dx.doi.org/10.1109/TSE.2020.2999364
2021, 'Eagle: CFL-Reachability-Based Precision-Preserving Acceleration of Object-Sensitive Pointer Analysis with Partial Context Sensitivity', ACM Transactions on Software Engineering and Methodology, 30, http://dx.doi.org/10.1145/3450492
2021, 'Guest Editorial: Special Section on New Trends in Parallel and Distributed Computing for Human Sensible Applications', IEEE Transactions on Emerging Topics in Computing, 9, pp. 1640 - 1641, http://dx.doi.org/10.1109/TETC.2021.3113485
2021, 'Preface', Journal of Computer Science and Technology, 36, http://dx.doi.org/10.1007/s11390-021-0001-4
2020, 'Fusion-Catalyzed Pruning for Optimizing Deep Learning on Intelligent Edge Devices', IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 39, pp. 3614 - 3626, http://dx.doi.org/10.1109/TCAD.2020.3013050
2020, 'Value-Flow-Based Demand-Driven Pointer Analysis for C and C++', IEEE Transactions on Software Engineering, 46, pp. 812 - 835, http://dx.doi.org/10.1109/TSE.2018.2869336
2020, 'A Conflict-free Scheduler for High-performance Graph Processing on Multi-pipeline FPGAs', ACM Transactions on Architecture and Code Optimization, 17, http://dx.doi.org/10.1145/3390523
2019, 'DNNTune: Automatic benchmarking DNN models for mobile-cloud computing', ACM Transactions on Architecture and Code Optimization, 16, http://dx.doi.org/10.1145/3368305
2019, 'Precision-preserving yet fast object-sensitive pointer analysis with partial context sensitivity', Proceedings of the ACM on Programming Languages, 3, pp. 1 - 29, http://dx.doi.org/10.1145/3360574
2019, 'LCCFS: a lightweight distributed file system for cloud computing without journaling and metadata services', Science China Information Sciences, 62, http://dx.doi.org/10.1007/s11432-017-9295-4
2019, 'Understanding and analyzing Java reflection', ACM Transactions on Software Engineering and Methodology, 28, http://dx.doi.org/10.1145/3295739
2018, 'Parallel construction of interprocedural memory SSA form', Journal of Systems and Software, 146, pp. 186 - 195, http://dx.doi.org/10.1016/j.jss.2018.09.038
2018, 'SCP: Shared cache partitioning for high-performance GEMM', ACM Transactions on Architecture and Code Optimization, 15, http://dx.doi.org/10.1145/3274654
2018, 'Ripple: Reflection analysis for Android apps in incomplete information environments', Software - Practice and Experience, 48, pp. 1419 - 1437, http://dx.doi.org/10.1002/spe.2577
2018, 'Loop-Oriented pointer analysis for automatic SIMD vectorization', ACM Transactions on Embedded Computing Systems, 17, http://dx.doi.org/10.1145/3168364
2018, 'Poker: Permutation-based SIMD execution of intensive tree search by path encoding', ACM Transactions on Architecture and Code Optimization, 15, http://dx.doi.org/10.1145/3280850
2017, 'Enhanced Peripheral Nerve Regeneration by a High Surface Area to Volume Ratio of Nerve Conduits Fabricated from Hydroxyethyl Cellulose/Soy Protein Composite Sponges', ACS Omega, 2, pp. 7471 - 7481, http://dx.doi.org/10.1021/acsomega.7b01003
2017, 'AutoFix', ACM SIGAPP Applied Computing Review, 16, pp. 38 - 50, http://dx.doi.org/10.1145/3040575.3040579
2017, 'Fine grained, direct access file system support for storage class memory', Journal of Systems Architecture, 72, pp. 80 - 92, http://dx.doi.org/10.1016/j.sysarc.2016.07.003
2017, 'An Efficient WCET-Aware Instruction Scheduling and Register Allocation Approach for Clustered VLIW Processors', ACM Transactions on Embedded Computing Systems, 16, pp. 120:1 - 120:21, http://dx.doi.org/10.1145/3126524
2016, 'Reducing Static Energy in Supercomputer Interconnection Networks Using Topology-Aware Partitioning', IEEE Transactions on Computers, 65, pp. 2588 - 2602, http://dx.doi.org/10.1109/TC.2015.2493523
2016, 'Predicting Cross-Core Performance Interference on Multicore Processors with Regression Analysis', IEEE Transactions on Parallel and Distributed Systems, 27, pp. 1443 - 1456, http://dx.doi.org/10.1109/TPDS.2015.2442983
2016, 'A compiler approach for exploiting partial SIMD parallelism', ACM Transactions on Architecture and Code Optimization, 13, pp. 11:1 - 11:26, http://dx.doi.org/10.1145/2886101
2016, 'An Efficient GPU Implementation of Inclusion-Based Pointer Analysis', IEEE Transactions on Parallel and Distributed Systems, 27, pp. 353 - 366, http://dx.doi.org/10.1109/TPDS.2015.2397933