Authorship classification techniques: Bridging textual domains and languages
2024, vol. 16, no. 1, pp. 27-38
Article [2024-01-03]
Authorship classification analyzes an author's prior work to identify their writing style, a trait unique to each language and each individual author. This research conducts a thorough comparative analysis of various methods for classifying authorship. The study leverages two corpora: AAALitCorpus, a collection of Albanian literary texts, and CCAT10, a collection of English columns. We evaluate model-generated features across different configurations. The richness of the features and the breadth of the analysis provide a significant understanding of the problem, setting a new standard for comprehensive linguistic investigations across multiple languages. The study indicates that machine learning algorithms accurately discern authorial writing styles, while also highlighting the complexities of classifying authorship in a cross-linguistic context.
natural language processing, authorship classification, textual data, feature space, multiclass classification
https://doi.org/10.59035/UKBE1226
Arta Misini, Arbana Kadriu, Ercan Canhasi. Authorship classification techniques: Bridging textual domains and languages. International Journal on Information Technologies and Security, vol. 16, no. 1, 2024, pp. 27-38. https://doi.org/10.59035/UKBE1226