A Comprehensive Comparison of Text Summarization Performance: A Multi-Faceted Evaluation of Large Language Models with Practical Considerations
Author: Computer Science & IT Conference Proceedings
Uploaded: 2026-01-31
Views: 12
Description:
#ai #artificialintelligence #nlp #computerscience #tech
A Comprehensive Comparison of Text Summarization Performance: A Multi-Faceted Evaluation of Large Language Models with Practical Considerations.
Anantharaman Janakiraman and Behnaz Ghoraani, Florida Atlantic University, USA
Abstract: Text summarization is crucial for mitigating information overload across domains. This research evaluates summarization performance across 17 large language models using seven diverse datasets at three output lengths (50, 100, 150 tokens). We employ a novel multi-dimensional framework assessing factual consistency, semantic similarity, lexical overlap, and human-like evaluation while considering both quality and efficiency factors. Key findings reveal significant differences between models, with specific models excelling in factual accuracy (deepseek-v3), human-like quality (claude-3-5-sonnet), processing efficiency (gemini-1.5-flash), and cost effectiveness (gemini-1.5-flash). Performance varies dramatically by dataset, with models struggling on technical domains but performing well on conversational content. We identified a critical tension between factual consistency (best at 50 tokens) and perceived quality (best at 150 tokens).
Keywords: Text Summarization, Large Language Models, Multi-dimensional Evaluation, Evaluation Metrics, Model Comparison
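Of the four evaluation dimensions named in the abstract, lexical overlap is the simplest to illustrate. The sketch below is not the authors' implementation: the abstract does not say which overlap metric was used, so this assumes a ROUGE-1-style unigram F1 over lowercased whitespace tokens. The function name `unigram_overlap_f1` and the example sentences are hypothetical.

```python
from collections import Counter


def unigram_overlap_f1(candidate: str, reference: str) -> float:
    """ROUGE-1-style F1 over lowercased whitespace tokens (illustrative only)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: a short summary scored against a longer reference.
reference = "the model summarizes long documents accurately"
candidate = "the model summarizes documents"
print(round(unigram_overlap_f1(candidate, reference), 3))  # 0.8
```

In this example every candidate token appears in the reference, so precision is 1.0, while recall is only 4/6 because the summary is shorter. That asymmetry is one reason an overlap score alone says little about factual consistency or perceived quality, which the abstract treats as separate dimensions.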
Abstract URL: https://csitcp.com/abstract/16/162csit07
Article full text: https://aircconline.com/csit/papers/v...
Volume URL: https://csitcp.com/volume/162
00:00 - Introduction
02:00 - Related Work
03:00 - Dataset