Investigating Abstractive Summarization with Metric Reviews, Model Experiments, and a New Consistency Score
Ren, Yixuan, Computer Science - School of Engineering and Applied Science, University of Virginia
Ji, Yangfeng, EN-Comp Science Dept, University of Virginia
Despite their impressive performance in natural language generation tasks, Large Language Models (LLMs) still face critical challenges in text summarization. In particular, the performance of LLMs in abstractive text summarization and the limitations of existing evaluation frameworks warrant further investigation. In this work, we present a comprehensive analysis of summarization evaluation metrics, covering lexical overlap, semantic distance, factual consistency, and recent LLM-based methods. Employing these metrics as evaluation tools, we empirically assess the performance of summarization models across the LLaMA and Gemma model families, using datasets from diverse domains to examine the capabilities of current LLMs in abstractive text summarization tasks. To address the limitations of current metrics, we introduce the concept of self-consistency and propose a novel consistency score to assess the reliability of text summarization models.
MS (Master of Science)
Abstractive Text Summarization, Text Generation, Natural Language Processing
English
2025/04/16