A Study of the Influence of Computing Systems on The Text Vectorization Speed

Authors

  • Olesia Barkovska, Kharkiv National University of Radio Electronics, Ukraine
  • Vladyslav Kholiev, Kharkiv National University of Radio Electronics, Ukraine

DOI:

https://doi.org/10.30837/csitic52021232890

Keywords:

computing system, texts, vectorization, pre-processing, graphics processor, word2vec, bag of words

Abstract

The paper addresses the problem of accelerating text processing. The development and modernization of electronic library systems is driven by the growing need for remote access to information objects. The purpose of the study is to reduce the execution time of vectorization, one of the text processing methods used at the information accumulation stage. The results show that for small texts (874 words), multiprocessor computing systems achieve a greater acceleration of vectorization. For large texts (8348 words), systems with massive parallelism perform well, with a speedup of up to 4.5 times.
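
For readers unfamiliar with the terms, the following is a minimal sketch of bag-of-words vectorization and of how a speedup figure is usually computed (baseline time divided by accelerated time). It is an illustration only, not the authors' implementation: the function names `tokenize`, `bag_of_words`, and `speedup`, the sample sentences, and the timing values are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bag_of_words(text, vocabulary):
    """Return a word-count vector for `text` over a fixed, ordered vocabulary."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


def speedup(t_baseline, t_accelerated):
    """Speedup is the baseline execution time divided by the accelerated time."""
    return t_baseline / t_accelerated


# Hypothetical sample documents, used only to illustrate the technique.
docs = ["The cat sat on the mat", "The dog sat"]

# Build a shared, sorted vocabulary so that every vector has the same layout.
vocabulary = sorted({word for doc in docs for word in tokenize(doc)})
vectors = [bag_of_words(doc, vocabulary) for doc in docs]

print(vocabulary)
print(vectors)

# Hypothetical timings in seconds, chosen only to illustrate the ratio.
print(speedup(9.0, 2.0))
```

Each document is processed independently of the others, which is the property that makes this stage a natural candidate for parallel execution on multiprocessor or massively parallel systems.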

Published

2021-05-30

Section

DESIGN, IMPLEMENTATION AND OPERATION OF INFORMATION SYSTEMS AND TECHNOLOGIES