Ikhtasir - A user selected compression ratio Arabic text summarization system

Aqil Azmi, Suha Al-Thanyyan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

12 Scopus citations

Abstract

Automatic text summarization is an active research field. The rapid growth of the Web, and the associated information overload, has injected new life into this research area. Automatic text summarization has been studied extensively for some languages, but Arabic is not one of them. In this paper we present an automatic extractive Arabic text summarization system in which the user can cap the size of the summary. The system does not require learning; it employs Rhetorical Structure Theory (RST) along with a sentence scoring scheme that scores each sentence individually. For output, sentences are selected with the objective of maximizing the overall score of a summary whose size is within the user-selected compression ratio. For evaluation, system-generated summaries of various lengths were compared against those produced by a human professional. Experiments on sample texts show that our system outperforms some existing systems, including those that require learning.
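
The selection step described in the abstract (maximize the total sentence score while keeping the summary within the chosen compression ratio) can be read as a 0/1 knapsack problem. The sketch below is only an illustration under assumptions that the abstract does not confirm: summary size is measured in words, the budget is the compression ratio times the source word count, and the optimization is solved by dynamic programming. The function name `select_sentences` and the example scores are hypothetical, and the RST-based scoring itself is not modeled.

```python
from typing import List, Sequence


def select_sentences(lengths: Sequence[int],
                     scores: Sequence[float],
                     ratio: float) -> List[int]:
    """Return indices of sentences that maximize total score within a word budget.

    Assumptions (not taken from the paper): lengths are positive word counts,
    scores are precomputed per sentence, and the budget is ratio * total words.
    """
    if len(lengths) != len(scores):
        raise ValueError("lengths and scores must have the same size")
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")

    n = len(lengths)
    budget = int(ratio * sum(lengths))

    # best[i][w]: highest score using the first i sentences and at most w words.
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        length, score = lengths[i - 1], scores[i - 1]
        for w in range(budget + 1):
            best[i][w] = best[i - 1][w]  # option 1: skip sentence i-1
            if length <= w:
                candidate = best[i - 1][w - length] + score
                if candidate > best[i][w]:
                    best[i][w] = candidate  # option 2: include sentence i-1

    # Walk the table backwards to recover which sentences were included.
    chosen: List[int] = []
    w = budget
    for i in range(n, 0, -1):
        if best[i][w] != best[i - 1][w]:
            chosen.append(i - 1)
            w -= lengths[i - 1]
    chosen.reverse()  # keep the original document order
    return chosen


if __name__ == "__main__":
    # Four sentences of 10, 20, 30 and 40 words with made-up scores.
    print(select_sentences([10, 20, 30, 40], [1.0, 5.0, 4.0, 6.0], 0.5))
```

With a ratio of 0.5 the budget in this toy example is 50 words, and the best-scoring subset that fits is the second and third sentences. A greedy pick by score alone would take the 40-word sentence first and end up with a lower total, which is why an exact search is used in this sketch.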

Original language: English
Title of host publication: 2009 International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2009
State: Published - 2009
Externally published: Yes
Event: 2009 International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2009 - Dalian, China
Duration: 24 Sep 2009 to 27 Sep 2009

Publication series

Name: 2009 International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2009

Conference

Conference: 2009 International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2009
Country/Territory: China
City: Dalian
Period: 24/09/09 to 27/09/09

Keywords

  • Algorithms
  • Arabic text summarization
  • Natural languages
  • Text processing
