Biography
I am Ziyi Guan (管子义), a fourth-year PhD candidate at The University of Hong Kong (HKU), supervised by Dr. Ngai Wong and Prof. Graziano Chesi. I expect to graduate in September 2025. Before joining HKU, I received my Bachelor’s degree from the School of Microelectronics at the Southern University of Science and Technology in 2021, supervised by Prof. Hao Yu.
My main research interests lie in Large Language Model (LLM) compression techniques such as weight quantization and pruning. I am also interested in LLM agent techniques, especially APP/GUI-based agents and Retrieval-Augmented Generation. You can find my publications on my Google Scholar.
I have been working as a research intern at Huawei Hong Kong Research Center (HKRC) since November 2024.
I am actively seeking job opportunities starting in Fall 2025 in the field of Large Language Models (LLMs), particularly in model optimization, pruning, quantization, and hardware-efficient neural network design. If you have a relevant position or collaboration opportunity, please feel free to contact me.
My CVs are available here: English CV and Chinese CV.
You can contact me by Email or on WeChat: Easongzy
Selected Publications (* denotes equal contribution)
Yupeng Su, Ziyi Guan, Xiaoqun Liu, Tianlai Jin, Dongkuan Wu, Graziano Chesi, Ngai Wong, Hao Yu, “LLM-Barber: Block-Aware Rebuilder for Sparsity Mask in One-Shot for Large Language Models”, In Proceedings of DAC 2025: 62nd IEEE/ACM Design Automation Conference (DAC). (Under review) PDF
Dingbang Liu, Ziyi Guan, Qilong Chen, Jiaqi Yang, Kai Li, Mingqiang Huang, Changwen Chen, Ngai Wong, Hao Yu, “A Highly Energy-Efficient Binary BERT Model on Group Vector Systolic CIM Accelerator”, In Proceedings of DAC 2025: 62nd IEEE/ACM Design Automation Conference (DAC). (Under review)
Ziyi Guan, Hantao Huang, Yupeng Su, Hong Huang, Ngai Wong and Hao Yu, “APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models”, In Proceedings of DAC 2024: 61st IEEE/ACM Design Automation Conference. (DAC), San Francisco, CA, June 23-27, 2024. PDF
Ziyi Guan, Boyu Li, Yuan Ren, Muqun Niu, Hantao Huang, Graziano Chesi, Hao Yu and Ngai Wong, “An Isotropic Shift-Pointwise Network for Crossbar-Efficient Neural Network Design”, Design, Automation & Test in Europe Conference & Exhibition (DATE), Valencia, March 25, 2024. PDF
Shuwei Li, Ziyi Guan, Changhai Man, Ao Shen, Wei Mao, Shaobo Luo, Rumin Zhang, and Hao Yu, “A Fall Detection Network by 2D/3D Spatio-temporal Joint Models with Tensor Compression on Edge”, ACM Transactions on Embedded Computing Systems (TECS), vol. 21, no. 6, pp. 1–19, 2022. PDF
Ziyi Guan, Wenyong Zhou, Yuan Ren, Rui Xie, Hao Yu, and Ngai Wong, “A Hardware-Aware Neural Architecture Search Pareto Front Exploration for In-Memory Computing”, 2022 IEEE 16th International Conference on Solid-State Integrated Circuit Technology (ICSICT). IEEE, 2022, pp. 1–4. PDF
Ziyi Guan, Shuwei Li, Yuan Cheng, Changhai Man, Wei Mao, Ngai Wong, and Hao Yu, “A Video-based Fall Detection Network by Spatio-temporal Joint-point Model on Edge Devices”, Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 2021, pp. 422–427. PDF