THE USE OF COMPUTER LINGUISTICS METHODS IN ESTABLISHING THE AUTHORSHIP OF THE TEXT

Abstract

Currently, there are no services, stand-alone programs, algorithms or modules designed to compare a set of texts submitted by one person and establish whether that person is the author of all of them. This paper discusses a potential algorithm built on that idea. Checking an arbitrary number of texts to determine whether they have one or several authors does not require powerful hardware, nor the time and resources needed to train a dedicated neural network to search for similar objects in a huge database.
The paper describes a method for fast analysis of text to establish authorship: the text is divided into elements, which are analyzed separately, without spending a large amount of time on data processing. A program has been developed that calculates and visualizes the analysis of a text; texts can be loaded into it in the common *.doc or *.docx formats. The results of such analysis were examined on a number of works by different authors on different subjects. Works can be compared within a single author's output to check its integrity, or across several authors to detect possible borrowing of material. The algorithm does not give a guaranteed answer and can be used only as a basis for additional verification of works.
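
The abstract does not specify which text elements or measurements the authors use, so the following is only a minimal illustrative sketch of the general idea, not the paper's algorithm. It assumes that the "elements" are sentences and words, and that each text is summarized by four simple stylistic measurements chosen here for illustration. The names style_profile and profile_distance are hypothetical, and reading *.doc or *.docx files is left out; the functions take plain strings.

```python
import re


def style_profile(text: str) -> dict[str, float]:
    """Split a text into sentences and words and measure a few style features.

    The feature set is an assumption made for illustration only.
    """
    # Sentences: runs of text separated by '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words: runs of letters (Unicode-aware), lower-cased.
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    return {
        "words_per_sentence": len(words) / len(sentences),
        "chars_per_word": sum(len(w) for w in words) / len(words),
        "vocabulary_richness": len(set(words)) / len(words),
        "commas_per_word": text.count(",") / len(words),
    }


def profile_distance(a: dict[str, float], b: dict[str, float]) -> float:
    """Mean relative difference between two profiles, from 0.0 to 1.0.

    0.0 means the measured features are identical; larger values mean the
    two texts differ more in the measured features.
    """
    total = 0.0
    for key in a:
        largest = max(abs(a[key]), abs(b[key]))
        if largest:  # skip features that are zero in both texts
            total += abs(a[key] - b[key]) / largest
    return total / len(a)


if __name__ == "__main__":
    first = style_profile("One two three. Four five six.")
    second = style_profile("Short, short, short words, here. Yes, yes.")
    print(first)
    print(profile_distance(first, second))
```

A sketch like this needs no training and no external database, which is the kind of low-cost check the abstract argues for. A small distance would only suggest a similar style; consistent with the caveat above, it could serve as a trigger for further verification, not as proof of authorship.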

Checking large numbers of texts is currently a pressing task because of the need to establish their originality. A text is typically considered original, and therefore the author's own work, when its originality score is 90% or more. However, checking material for originality is very resource-intensive.

The proposed algorithm is relevant because it can extend the capabilities of existing services for establishing the originality of texts and reduce the load on their computing resources: most existing solutions rely on artificial intelligence algorithms whose speed depends on both the implementation and the capacity of the host system.

Keywords: text analysis, anti-plagiarism, authenticity of authorship, computer linguistics.

Published
2023-05-22
How to Cite
Zalevska, O., Vanin, V., Savchuk, B., Sytnyk, A., & Zhu, S. (2023). THE USE OF COMPUTER LINGUISTICS METHODS IN ESTABLISHING THE AUTHORSHIP OF THE TEXT. Modern Problems of Modeling, (24), 37-43. https://doi.org/10.33842/2313-125X-2022-24-37-43
