Mingge Lu 卢铭阁
Master's Student in CS @ USTC
B.Eng. in AI, School of the Gifted Young @ USTC
I am a Master's student in Computer Science at the University of Science and Technology of China (USTC), advised by Prof. Guangzhong Sun and Prof. Jingwei Sun. I received my Bachelor's degree from the Talent Program in AI (Honors), School of the Gifted Young, USTC, in 2024.
My research interests lie broadly in designing efficient LLM inference systems, including model compression algorithms (sparsification, quantization) and high-performance sparse GPU kernels that leverage tensor core architectures.

Education
  • University of Science and Technology of China
    Master's Student, School of Computer Science and Technology
    Sep. 2024 - present
  • University of Science and Technology of China
    B.Eng. in Artificial Intelligence (Honors), School of the Gifted Young
    Sep. 2020 - Jul. 2024
Honors & Awards
  • Shenzhen Stock Exchange Scholarship
    2025
  • Master Student Academic Scholarship
    2024, 2025
  • Outstanding Student Scholarship Silver Award
    2022, 2023
  • Zhang Zongzhi Sci-Tech Scholarship
    2021
News
Sep 19, 2025: One paper accepted to NeurIPS 2025.
Selected Publications
Lua-LLM: Learning Unstructured-Sparsity Allocation for Large Language Models

Mingge Lu, Jingwei Sun, Junqing Lin, Zechun Zhou, Guangzhong Sun

Advances in Neural Information Processing Systems (NeurIPS) 2025

We propose Lua-LLM (Learning unstructured-sparsity allocation in LLMs), a learning-based global pruning framework that explores the optimal unstructured sparsity allocation. Unlike existing pruning methods, which primarily focus on allocating per-layer sparsity, Lua-LLM achieves flexible allocation for both layer-wise and intra-layer sparsity.
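As a rough illustration of the contrast the abstract draws (this is a toy magnitude-pruning sketch, not the Lua-LLM method), uniform per-layer pruning forces every layer to the same sparsity, while a single global threshold lets sparsity adapt across layers:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two toy "layers" with very different weight magnitude scales.
layers = [rng.normal(0, 1.0, 1000), rng.normal(0, 0.1, 1000)]
target_sparsity = 0.5

def prune_uniform(ws, s):
    # Uniform per-layer pruning: each layer keeps the same fraction.
    out = []
    for w in ws:
        thresh = np.quantile(np.abs(w), s)
        out.append(np.where(np.abs(w) >= thresh, w, 0.0))
    return out

def prune_global(ws, s):
    # Global pruning: one magnitude threshold over all layers, so the
    # resulting per-layer sparsity varies with each layer's weight scale.
    thresh = np.quantile(np.abs(np.concatenate(ws)), s)
    return [np.where(np.abs(w) >= thresh, w, 0.0) for w in ws]

uni = prune_uniform(layers, target_sparsity)
glo = prune_global(layers, target_sparsity)
for i, (u, g) in enumerate(zip(uni, glo)):
    print(f"layer {i}: uniform sparsity={np.mean(u == 0):.2f}, "
          f"global sparsity={np.mean(g == 0):.2f}")
```

Under the global threshold the small-magnitude layer is pruned far more aggressively than the other, which is the kind of non-uniform allocation a learned scheme can optimize rather than fix by a heuristic.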

Toward Efficient SpMV in Sparse LLMs via Block Extraction and Compressed Storage

Junqing Lin, Jingwei Sun, Mingge Lu, Guangzhong Sun

arXiv:2507.12205

This paper presents EC-SpMV, a GPU-optimized SpMV approach for accelerating sparse LLM inference. EC-SpMV introduces (1) a hierarchical block extraction algorithm that captures multiple granularities of block structures within sparse LLMs, and (2) a novel compressed sparse format (EC-CSR) that employs delta indexing to reduce storage overhead and enhance memory access efficiency.
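As background for the compressed storage the abstract mentions, here is a toy sketch of delta indexing on top of plain CSR (this is not EC-CSR itself; `delta_idx` and `spmv_delta` are illustrative names): within each row, column indices are stored as gaps from the previous index, which are small and can fit narrower integer types:

```python
import numpy as np

# Toy sparse matrix, built into CSR form: values, column indices, row pointers.
dense = np.array([
    [0.0, 2.0, 0.0, 1.0],
    [3.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 5.0],
])
values, col_idx, row_ptr = [], [], [0]
for row in dense:
    nz = np.nonzero(row)[0]
    values.extend(row[nz])
    col_idx.extend(nz)
    row_ptr.append(len(values))

# Delta-encode column indices per row: store each index as the gap from
# the previous one, so most entries are small.
delta_idx = []
for r in range(len(row_ptr) - 1):
    prev = 0
    for j in col_idx[row_ptr[r]:row_ptr[r + 1]]:
        delta_idx.append(j - prev)
        prev = j

def spmv_delta(values, delta_idx, row_ptr, x):
    # Scalar SpMV that reconstructs column indices on the fly.
    y = np.zeros(len(row_ptr) - 1)
    for r in range(len(y)):
        col = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            col += delta_idx[k]
            y[r] += values[k] * x[col]
    return y

x = np.ones(4)
print(spmv_delta(values, delta_idx, row_ptr, x))  # prints [3. 3. 9.]
```

The actual EC-CSR format additionally exploits block structure extracted from sparse LLM weights; this sketch only shows the generic delta-indexing idea on scalar CSR.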
