Select Publications
Journal articles
2013, 'Session details: Programming language and implementation', ACM SIGPLAN Notices, 48, http://dx.doi.org/10.1145/3262901
2013, 'Epipe: A low-cost fault-tolerance technique considering WCET constraints', Journal of Systems Architecture, 59, pp. 1383 - 1393, http://dx.doi.org/10.1016/j.sysarc.2013.06.003
2013, 'Instruction scheduling with k-successor tree for clustered VLIW processors', Design Automation for Embedded Systems, 17, pp. 439 - 458, http://dx.doi.org/10.1007/s10617-012-9103-0
2013, 'Programming for scientific computing on peta-scale heterogeneous parallel systems', Journal of Central South University, 20, pp. 1189 - 1203, http://dx.doi.org/10.1007/s11771-013-1602-z
2012, 'Phosphorylation of syndapin I F-BAR domain at two helix-capping motifs regulates membrane tubulation', Proceedings of the National Academy of Sciences of the United States of America, 109, pp. 3760 - 3765, http://dx.doi.org/10.1073/pnas.1108294109
2012, 'Optimally Maximizing Iteration-Level Loop Parallelism', IEEE Transactions on Parallel and Distributed Systems, 23, pp. 564 - 572, http://dx.doi.org/10.1109/TPDS.2011.171
2012, 'Optimizing modulo scheduling to achieve reuse and concurrency for stream processors', Journal of Supercomputing, 59, pp. 1229 - 1251, http://dx.doi.org/10.1007/s11227-010-0522-z
2012, 'Parallelizing SOR for GPGPUs Using Alternate Loop Tiling', Parallel Computing, 38, pp. 310 - 328, http://dx.doi.org/10.1016/j.parco.2012.03.004
2011, 'Leakage-aware modulo scheduling for embedded VLIW processors', Journal of Computer Science and Technology, 26, pp. 405 - 417, http://dx.doi.org/10.1007/s11390-011-1143-6
2009, 'Comparison of quantum dots immunofluorescence histochemistry and conventional immunohistochemistry for the detection of caveolin-1 and PCNA in the lung cancer tissue microarray', Journal of Molecular Histology, 40, pp. 261 - 268, http://dx.doi.org/10.1007/s10735-009-9237-y
2009, 'State of the art of Micro-CT applications in dental research', International Journal of Oral Science, 1, pp. 177 - 188, http://dx.doi.org/10.4248/IJOS09031
2009, 'Compiler-directed scratchpad memory management via graph coloring', ACM Transactions on Architecture and Code Optimization, 6
2009, 'Loop recreation for thread-level speculation on multicore processors', Software: Practice and Experience, pp. 45 - 72, http://dx.doi.org/10.1002/spe.947
2009, 'PARBLO: Page-Allocation-Based DRAM Row Buffer Locality Optimization', Journal of Computer Science and Technology, 24, pp. 1086 - 1097
2008, 'Factorization of singular integer matrices', Linear Algebra and its Applications, 428, pp. 1046 - 1055, http://dx.doi.org/10.1016/j.laa.2007.09.012
2008, 'Improving the parallelism of iterative methods by aggressive loop fusion', Journal of Supercomputing, 43, pp. 147 - 164, http://dx.doi.org/10.1007/s11227-007-0124-6
2008, 'Minimal placement of bank selection instructions for partitioned memory architectures', ACM Transactions on Embedded Computing Systems (TECS), 7, pp. 1 - 32, http://dx.doi.org/10.1145/1331331.1331336
2007, 'Scratchpad allocation for data aggregates in superperfect graphs', ACM SIGPLAN Notices, 42, pp. 207 - 216
2007, 'Data cache locking for tight timing calculations', ACM Transactions on Embedded Computing Systems (TECS), 7, pp. 1 - 38
2007, 'Interprocedural side-effect analysis for incomplete object-oriented software modules', Journal of Systems and Software, 80, pp. 92 - 105, http://dx.doi.org/10.1016/j.jss.2006.06.015
2007, 'Trace-based leakage energy optimisations at link time', Journal of Systems Architecture, 53, pp. 1 - 20, http://dx.doi.org/10.1016/j.sysarc.2006.05.002
2006, 'Message from HPSEC workshop co-chairs', Proceedings of the International Conference on Parallel Processing Workshops, http://dx.doi.org/10.1109/ICPPW.2006.46
2006, 'Special Section on Parallel/Distributed Computing and Networking', IEICE Transactions on Information and Systems, E89-D, pp. 387 - 388, http://dx.doi.org/10.1093/ietisy/e89-d.2.387
2006, 'A lifetime optimal algorithm for speculative PRE', ACM Transactions on Architecture and Code Optimization, 3, pp. 115 - 155
2006, 'Partial dead code elimination on predicated code regions', Journal of Systems Architecture, 36, pp. 1655 - 1685
2004, 'Efficient and Accurate Analytical Modeling of Whole-Program Data Cache Behaviour', IEEE Transactions on Computers, 53, pp. 547 - 566
2002, 'Eigenvectors-Based Parallelisation of Nested Loops with Affine Dependences', Parallel Algorithms and Applications, 17, pp. 227 - 248, http://dx.doi.org/10.1080/01495730108941442
2002, 'Space-Time Equations for Non-Unimodular Mappings', International Journal of Computer Mathematics, 79, pp. 555 - 572, http://dx.doi.org/10.1080/00207160210953
2002, 'Time-Minimal Tiling When Rise Is Larger Than Zero', Parallel Computing, pp. 915 - 936
2000, 'Generating efficient tiled code for distributed memory machines', Parallel Computing, 26, pp. 1369 - 1410, http://dx.doi.org/10.1016/S0167-8191(00)00040-5
1999, 'Partitioning and Scheduling Loops on NOWs', Computer Communications, pp. 1017 - 1033
1998, 'Reuse-Driven Tiling for Improving Data Locality', International Journal of Parallel Programming, 26, http://dx.doi.org/10.1023/A:1018734612524
1997, 'On tiling as a loop transformation', Parallel Processing Letters, 7, pp. 409 - 424, http://dx.doi.org/10.1142/S0129626497000401
1997, 'Communication-Minimal Tiling of Uniform Dependence Loops', Journal of Parallel and Distributed Computing, 42, pp. 42 - 59, http://dx.doi.org/10.1006/jpdc.1997.1310
1997, 'Unimodular transformations of non-perfectly nested loops', Parallel Computing, 22, pp. 1621 - 1645, http://dx.doi.org/10.1016/S0167-8191(96)00063-4
1996, 'Generalising the unimodular approach to restructure imperfectly nested loops', Parallel Processing Letters, 6, pp. 401 - 414, http://dx.doi.org/10.1142/S0129626496000388
1996, 'Transformations of nested loops with non-convex iteration spaces', Parallel Computing, 22, pp. 339 - 368, http://dx.doi.org/10.1016/0167-8191(95)00069-0
1995, 'Closed-form mapping conditions for the synthesis of linear processor arrays', Journal of VLSI signal processing systems for signal, image and video technology, 10, pp. 181 - 199, http://dx.doi.org/10.1007/BF02407035
1994, 'Automating non-unimodular loop transformations for massive parallelism', Parallel Computing, 20, pp. 711 - 728, http://dx.doi.org/10.1016/0167-8191(94)90002-7
1992, 'A systolic array for pyramidal algorithms', Journal of VLSI Signal Processing, 4, p. 89, http://dx.doi.org/10.1007/BF00930620
1992, 'On the loading, recovery and access of stationary data in systolic arrays', Lecture Notes in Computer Science, 634, pp. 259 - 264, https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:A1992KQ20400031&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=891bb5ab6ba270e68a29b250adbe88d1
1992, 'The synthesis of control signals for one-dimensional systolic arrays', Integration, the VLSI Journal, 14, pp. 1 - 32, http://dx.doi.org/10.1016/0167-9260(92)90008-M
1991, 'A systolic array for pyramidal algorithms', Journal of VLSI signal processing systems for signal, image and video technology, 3, pp. 237 - 257, http://dx.doi.org/10.1007/BF00925834
1991, 'Specifying control signals for systolic arrays by uniform recurrence equations', Parallel Processing Letters, 1, pp. 83 - 93, http://dx.doi.org/10.1142/S0129626491000033
1988, 'A new data structure for representing cell hierarchy in layout design', Computers & Graphics, 12, pp. 341 - 348, http://dx.doi.org/10.1016/0097-8493(88)90055-6