
Arabic Extractive Summarization Using Pre-Trained Models

Yasmin Einieh, A. Almansour, A. Jamal

2023 · DOI: 10.4197/comp.12-1.6
Journal of King Abdulaziz University-Computing and Information Technology Sciences

TLDR

This study aims to fill the gap in Arabic text summarization by experimenting with several models, including QARiB, AraELECTRA, and AraBERT-base, all trained on the KALIMA dataset.

Abstract

Automatic Text Summarization (ATS) is a crucial area of study in Natural Language Processing (NLP) because of the vast amount of information available online. Extractive summarization is one approach to generating summaries: it selects important sentences from the original document without altering their wording. Although many methods for Arabic text summarization exist, deep learning applications are still in their early stages, and available datasets are scarce. Compared with English, fewer experiments have been conducted on Arabic summarization, owing to the language's unique characteristics. This study aims to fill this gap by experimenting with several pre-trained models for summarizing Arabic text, including QARiB, AraELECTRA, and AraBERT-base, all trained on the KALIMA dataset. The AraBERT model performed exceptionally well, achieving scores of 0.44, 0.26, and 0.44 on the ROUGE-1, ROUGE-2, and ROUGE-L measures, respectively.
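For readers unfamiliar with the ROUGE measures reported above, the following is a minimal sketch of how ROUGE-1, ROUGE-2, and ROUGE-L F1 scores can be computed for one candidate summary against one reference. It is not the evaluation code used in the study. The function names (`rouge_n`, `rouge_l`), the whitespace tokenization, and the choice of F1 rather than recall are all assumptions made for illustration; Arabic text would usually need normalization and a proper tokenizer before scoring.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision (overlap/cand) and recall (overlap/ref)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 based on clipped n-gram overlap (assumes whitespace tokens)."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())  # Counter '&' keeps the minimum count
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    cand, ref = candidate.split(), reference.split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


if __name__ == "__main__":
    # Hypothetical English sentences, used only to keep the example readable.
    candidate = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print(rouge_n(candidate, reference, 1))
    print(rouge_n(candidate, reference, 2))
    print(rouge_l(candidate, reference))
```

In an extractive setting, the candidate would be the concatenation of the sentences the model selects and the reference would be the gold summary. Published scores are normally averaged over a test set and computed with an established ROUGE package, so figures from this sketch are not directly comparable to those in the abstract.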