Method of Annotating a Collection of Text Documents
DOI:
https://doi.org/10.30837/csitic52021232889
Keywords:
methods, summarization, text documents, parallelization, speedup, partition, efficiency
Abstract
This work analyzes methods for annotating text documents. The topic is relevant because a reader who encounters an information object in text form relies heavily on its annotation, which can reduce the time needed to select the necessary sources severalfold. The paper considers SumBasic, an automatic annotation method based on a probabilistic approach, and proposes an approach to data decomposition in each of the separate modules that implement the method.
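The abstract names SumBasic but does not spell out its steps. The following is a minimal sketch of a commonly described simplified form of the method, not the authors' implementation: estimate each word's probability from its frequency, score sentences by the average probability of their words, pick the best sentence, then square the probabilities of the words it contains so that later picks favor new content. The function names `tokenize` and `sumbasic`, the regular-expression tokenizer, the `max_sentences` parameter, and the choice to return sentences in document order are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", sentence.lower())


def sumbasic(sentences, max_sentences=2):
    """Pick up to max_sentences sentences with a simplified SumBasic loop."""
    tokenized = [tokenize(s) for s in sentences]

    # Step 1: unigram probabilities p(w) = count(w) / total number of tokens.
    counts = Counter(w for words in tokenized for w in words)
    total = sum(counts.values())
    if total == 0:
        return []
    prob = {w: c / total for w, c in counts.items()}

    def score(i):
        # Step 2: sentence weight is the average probability of its words.
        words = tokenized[i]
        return sum(prob[w] for w in words) / len(words)

    chosen = []
    remaining = [i for i, words in enumerate(tokenized) if words]
    while remaining and len(chosen) < max_sentences:
        # Step 3: select the highest-scoring sentence not yet chosen.
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
        # Step 4: square p(w) for every word in the chosen sentence,
        # which lowers the score of sentences that repeat its content.
        for w in set(tokenized[best]):
            prob[w] **= 2

    # Assumption: present the selected sentences in their original order.
    return [sentences[i] for i in sorted(chosen)]
```

With the three sentences "The cat sat on the mat.", "The cat chased the mouse.", and "Dogs bark loudly." and `max_sentences=2`, the second sentence is picked first; squaring the probabilities of "the" and "cat" then pushes the first sentence below the third, which illustrates the redundancy penalty. Because partial `Counter` objects built from separate document partitions can simply be added together, the counting stage is one place where data decomposition is possible in principle; the decomposition actually proposed in the paper is not described in the abstract.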