Mingge Lu 卢铭阁
Master's Student in CS @ USTC
B.Eng. in AI, School of the Gifted Young @ USTC
I am a Master's student in Computer Science at the University of Science and Technology of China (USTC), advised by Prof. Guangzhong Sun and Prof. Jingwei Sun. I received my Bachelor's degree from the Talent Program in AI (Honors), School of the Gifted Young, USTC, in 2024.
My research interests lie broadly in designing efficient LLM inference systems, including model compression algorithms (sparsification, quantization) and high-performance sparse GPU kernels that leverage tensor core architectures.

Education
  • University of Science and Technology of China
    Master's Student, School of Computer Science and Technology
    Sep. 2024 - present
  • University of Science and Technology of China
    B.Eng. in Artificial Intelligence (Honors), School of the Gifted Young
    Sep. 2020 - Jul. 2024
Honors & Awards
  • Shenzhen Stock Exchange Scholarship
    2025
  • Master Student Academic Scholarship
    2024, 2025
  • Outstanding Student Scholarship Silver Award
    2022, 2023
  • Zhang Zongzhi Sci-Tech Scholarship
    2021
News
Sep 19, 2025: One paper accepted to NeurIPS 2025.
Selected Publications
Lua-LLM: Learning Unstructured-Sparsity Allocation for Large Language Models

Mingge Lu, Jingwei Sun, Junqing Lin, Zechun Zhou, Guangzhong Sun

Advances in Neural Information Processing Systems (NeurIPS) 2025

We propose Lua-LLM (Learning unstructured-sparsity allocation in LLMs), a learning-based global pruning framework that explores the optimal unstructured sparsity allocation. Unlike existing pruning methods, which primarily focus on allocating per-layer sparsity, Lua-LLM achieves flexible allocation for both layer-wise and intra-layer sparsity.
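As a rough illustration of the contrast the abstract draws (this is a toy magnitude-pruning sketch, not the Lua-LLM method), uniform per-layer pruning forces every layer to the same sparsity, while a single global threshold lets sparsity adapt across layers:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two toy "layers" with very different weight magnitude scales.
layers = [rng.normal(0, 1.0, 1000), rng.normal(0, 0.1, 1000)]
target_sparsity = 0.5

def prune_uniform(ws, s):
    # Uniform per-layer pruning: each layer keeps the same fraction.
    out = []
    for w in ws:
        thresh = np.quantile(np.abs(w), s)
        out.append(np.where(np.abs(w) >= thresh, w, 0.0))
    return out

def prune_global(ws, s):
    # Global pruning: one magnitude threshold over all layers, so the
    # resulting per-layer sparsity varies with each layer's weight scale.
    thresh = np.quantile(np.abs(np.concatenate(ws)), s)
    return [np.where(np.abs(w) >= thresh, w, 0.0) for w in ws]

uni = prune_uniform(layers, target_sparsity)
glo = prune_global(layers, target_sparsity)
for i, (u, g) in enumerate(zip(uni, glo)):
    print(f"layer {i}: uniform sparsity={np.mean(u == 0):.2f}, "
          f"global sparsity={np.mean(g == 0):.2f}")
```

Under the global threshold the small-magnitude layer is pruned far more aggressively than the other, which is the kind of non-uniform allocation a learned scheme can optimize rather than fix by a heuristic.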

Toward Efficient SpMV in Sparse LLMs via Block Extraction and Compressed Storage

Junqing Lin, Jingwei Sun, Mingge Lu, Guangzhong Sun

arXiv:2507.12205

This paper presents EC-SpMV, a GPU-optimized SpMV approach for accelerating sparse LLM inference. EC-SpMV introduces (1) a hierarchical block extraction algorithm that captures multiple granularities of block structures within sparse LLMs, and (2) a novel compressed sparse format (EC-CSR) that employs delta indexing to reduce storage overhead and enhance memory access efficiency.
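As background for the compressed storage the abstract mentions, here is a toy sketch of delta indexing on top of plain CSR (this is not EC-CSR itself; `delta_idx` and `spmv_delta` are illustrative names): within each row, column indices are stored as gaps from the previous index, which are small and can fit narrower integer types:

```python
import numpy as np

# Toy sparse matrix, built into CSR form: values, column indices, row pointers.
dense = np.array([
    [0.0, 2.0, 0.0, 1.0],
    [3.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 5.0],
])
values, col_idx, row_ptr = [], [], [0]
for row in dense:
    nz = np.nonzero(row)[0]
    values.extend(row[nz])
    col_idx.extend(nz)
    row_ptr.append(len(values))

# Delta-encode column indices per row: store each index as the gap from
# the previous one, so most entries are small.
delta_idx = []
for r in range(len(row_ptr) - 1):
    prev = 0
    for j in col_idx[row_ptr[r]:row_ptr[r + 1]]:
        delta_idx.append(j - prev)
        prev = j

def spmv_delta(values, delta_idx, row_ptr, x):
    # Scalar SpMV that reconstructs column indices on the fly.
    y = np.zeros(len(row_ptr) - 1)
    for r in range(len(y)):
        col = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            col += delta_idx[k]
            y[r] += values[k] * x[col]
    return y

x = np.ones(4)
print(spmv_delta(values, delta_idx, row_ptr, x))  # prints [3. 3. 9.]
```

The actual EC-CSR format additionally exploits block structure extracted from sparse LLM weights; this sketch only shows the generic delta-indexing idea on scalar CSR.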
