Textual information in the form of digital documents quickly accumulates into huge amounts of data. The majority of these documents are unstructured: they consist of unrestricted text and have not been organized into traditional databases. Processing them is therefore a laborious task, mostly due to a lack of standards, and it has become extremely difficult to implement automatic text analysis tasks. Automatic Text Summarization (ATS), by condensing the text while maintaining relevant information, can help to process this ever-increasing, difficult-to-handle mass of information.
This book examines the motivations and different algorithms for ATS. The author presents the recent state of the art before describing the main problems of ATS, along with the difficulties encountered and the solutions proposed by the community. The book also covers recent advances in ATS, as well as current applications and trends. The approaches covered are statistical, linguistic and symbolic. Several examples are included to clarify the theoretical concepts.
Part 1. Foundations
1. Why Summarize Texts?
2. Automatic Text Summarization: Some Important Concepts.
3. Single-Document Summarization.
4. Guided Multi-Document Summarization.
Part 2. Emerging Systems
5. Multi and Cross-Lingual Summarization.
6. Source and Domain-Specific Summarization.
7. Text Abstracting.
8. Evaluating Document Summaries.
Juan-Manuel Torres-Moreno is Associate Professor at the Université d'Avignon et des Pays de Vaucluse (UAPV) in France and is head of the research team Natural Language Processing (NLP/TALNE) at the Laboratoire Informatique d’Avignon (LIA). His current research lies within the field of NLP where he is investigating techniques for ATS. His other research interests include sentence compression, information retrieval, machine learning and artificial consciousness.