Zoltan Gal, Erzsébet Tóth

Deep Learning-Based Analysis of Ancient Greek Literary Texts in English Version: A Statistical Model Based on Word Frequency and Noise Probability for the Classification of Texts

In this paper we present a methodology that we developed for clustering texts based on word frequency in the English translations of selected ancient Greek texts. As a basis for categorizing literary masterpieces, we used the classification system of the ancient Library of Alexandria, devised by the prominent Greek scholar-poet Callimachus in the 3rd century BC. In our content analysis, we determined a triplet of values a, b, c describing a power function that appropriately fits the curve given by the word frequencies in the texts. In addition, we identified 16 special features of the different texts that correspond to the token categories investigated in each text, such as the part of speech of a word in context, numerals, subordinating conjunctions, and symbols. We developed a cognitive model in which several hundred different subtexts were used for supervised learning with the aim of subtext class recognition. For 200 subtexts, the triplet of a, b, c values, the classes of the subtexts, and their 16-dimensional feature vectors were used to train a Recurrent Neural Network (RNN). The Long Short-Term Memory (LSTM) RNN could efficiently predict the class to which a chosen subtext belongs without considering the interpretation of the content. We also investigated how the non-zero error rate of new communication services influences the meaning of the transferred texts. The classification accuracy was found to depend linearly on the character error rate.
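The two inputs named above, the ranked word frequencies and the character error rate, can be illustrated with a minimal sketch. It is not the authors' implementation: the noise model (independent substitution of each character by a random lowercase letter), the whitespace tokenization, and the names `add_character_noise`, `rank_frequencies`, and `sample` are all assumptions made for illustration.

```python
import random
import string
from collections import Counter


def add_character_noise(text, error_rate, rng):
    """Replace each character independently with probability error_rate.

    Assumption: errors are independent substitutions by a random lowercase
    letter. The replacement may coincide with the original character, so the
    effective error rate is slightly below error_rate.
    """
    noisy = []
    for ch in text:
        if rng.random() < error_rate:
            noisy.append(rng.choice(string.ascii_lowercase))
        else:
            noisy.append(ch)
    return "".join(noisy)


def rank_frequencies(text):
    """Return word counts sorted in descending order (rank 1 first)."""
    counts = Counter(text.lower().split())
    return sorted(counts.values(), reverse=True)


# Hypothetical sample text, used only to exercise the two helpers.
rng = random.Random(0)
sample = "the wrath of achilles sing goddess the wrath that brought the greeks pain"
clean_ranks = rank_frequencies(sample)
noisy_ranks = rank_frequencies(add_character_noise(sample, 0.1, rng))
```

Comparing `clean_ranks` with `noisy_ranks` at increasing error rates shows how character noise flattens the rank-frequency curve, which is the curve a power function would be fitted to.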

Reference:

DOI: 10.36244/ICJ.2024.5.1

Please cite this paper as follows:

Zoltan Gal, Erzsébet Tóth, "Deep Learning-Based Analysis of Ancient Greek Literary Texts in English Version: A Statistical Model Based on Word Frequency and Noise Probability for the Classification of Texts", Infocommunications Journal, Joint Special Issue on Cognitive Infocommunications and Cognitive Aspects of Virtual Reality, 2024, pp. 2-11, https://doi.org/10.36244/ICJ.2024.5.1