Method of Annotating a Collection of Text Documents

Authors

  • Olesia Barkovska, Kharkiv National University of Radio Electronics, Ukraine
  • Vitalii Vodolazkyi, Kharkiv National University of Radio Electronics, Ukraine

DOI:

https://doi.org/10.30837/csitic52021232889

Keywords:

methods, summarization, text documents, parallelization, speedup, partition, efficiency

Abstract

This work analyzes methods for annotating (summarizing) text documents. The topic is relevant because a reader who encounters an information object in text form relies heavily on its annotation, which can cut the time needed to select the necessary sources severalfold. The paper considers SumBasic, an automatic annotation method based on a probabilistic approach, and proposes an approach to data decomposition in each of the separate modules that implement the method.
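To make the probabilistic idea concrete, the following is a minimal sketch of SumBasic as it is commonly described: estimate word probabilities from the collection, score each sentence by the average probability of its words, pick the best-scoring sentence that contains the currently most probable word, then square the probabilities of the words just covered. The abstract does not specify the tokenizer, the stopping rule, or how the data is decomposed, so `tokenize`, `count_words`, the `parts` parameter, and the sentence-count limit are all assumptions made for illustration and are not the authors' implementation.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase word tokens; a deliberately naive tokenizer (assumption)."""
    return re.findall(r"[a-z']+", sentence.lower())


def count_words(sentences, parts=2):
    """Count words chunk by chunk, then merge the partial counts.

    Each chunk is independent of the others, so the loop body could be
    handed to separate workers. This partitioning is a hypothetical
    illustration of data decomposition, not the scheme from the paper.
    """
    size = max(1, -(-len(sentences) // parts))  # ceiling division
    total = Counter()
    for start in range(0, len(sentences), size):
        chunk = sentences[start:start + size]
        total.update(w for s in chunk for w in tokenize(s))
    return total


def sumbasic(sentences, max_sentences=2, parts=2):
    """Return up to max_sentences sentences, kept in their original order."""
    counts = count_words(sentences, parts)
    n = sum(counts.values())
    prob = {w: c / n for w, c in counts.items()}
    tokens = [tokenize(s) for s in sentences]

    def score(i):
        # Sentence weight: average probability of its words.
        return sum(prob[w] for w in tokens[i]) / len(tokens[i])

    chosen = []
    candidates = [i for i, t in enumerate(tokens) if t]
    while candidates and len(chosen) < max_sentences:
        # Most probable word still available in the remaining sentences.
        top = max((w for i in candidates for w in tokens[i]), key=prob.get)
        pool = [i for i in candidates if top in tokens[i]]
        best = max(pool, key=score)
        chosen.append(best)
        candidates.remove(best)
        # Squaring lowers the weight of words the summary already covers.
        for w in set(tokens[best]):
            prob[w] **= 2
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    docs = [
        "The cat sat on the mat.",
        "The cat ate the fish.",
        "Dogs bark loudly.",
    ]
    print(sumbasic(docs, max_sentences=2))
```

Because the word counts are merged by simple addition, the result of `count_words` does not depend on how many chunks the sentences are split into, which is what makes this stage a natural candidate for partitioning.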

References

Serdechnyi, V., Barkovska, O., Rosinskiy, D., Axak, N., Korablyov, M. Model of the Internet Traffic Filtering System to Ensure Safe Web Surfing, Advances in Intelligent Systems and Computing, 2020, 1020, pp. 133–147.

Olesia, B., Oleg, M., Daria, P., ...Vladyslav, D., Maxim, V. Local concurrency in text block search tasks, International Journal of Emerging Trends in Engineering Research, 2020, 8(3), pp. 690–694.

Vanderwende L., Suzuki H., Brockett C., Nenkova A. Beyond SumBasic: Task-focused summarization with sentence simplification and lexical expansion // Information Processing and Management Journal. - 2007. - Vol. 43, No. 6. - P. 1606-1618. URL: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.105.9491&rep=rep1&type=pdf

Vanderwende L., Suzuki H., Brockett C. Microsoft Research at DUC2006: Task-Focused Summarization with Sentence Simplification and Lexical Expansion // Proceedings of the Document Understanding Conference. - 2007. URL: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.114.2486&rep=rep1&type=pdf

Nenkova A., Vanderwende L. The impact of frequency on summarization // Microsoft Research Technical Report, MSR-TR-2005-101. - 2005. URL: http://www.cs.bgu.ac.il/~elhadad/nlp09/sumbasic.pdf

Rankel P., Conroy J., Dang H., Nenkova A. A Decade of Automatic Content Evaluation of News Summaries: Reassessing the State of the Art // Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics. - 2013. - P. 131-136. URL: http://aclweb.org/anthology/P/P13/P13-2024.pdf

Published

2021-05-30

Issue

Section

DESIGN, IMPLEMENTATION AND OPERATION OF INFORMATION SYSTEMS AND TECHNOLOGIES