
Authorship classification techniques: Bridging textual domains and languages

2024, vol. 16, no. 1, pp. 27-38

Article [2024-01-03]

Arta Misini
Arbana Kadriu
Ercan Canhasi

Authorship classification analyzes an author's prior work to identify their writing style, a trait that is distinctive for each language and each individual author. This research presents a thorough comparative analysis of methods for classifying authorship. The study uses two corpora: AAALitCorpus, which contains Albanian literary texts, and CCAT10, which contains English columns. We evaluate model-generated features across different configurations. The richness of the features and the breadth of the analysis provide a substantial understanding of the problem and set a new standard for comprehensive linguistic investigations across multiple languages. The results indicate that machine learning algorithms can accurately discern authorial writing styles, while also highlighting the complexities of classifying authorship in a cross-linguistic context.


natural language processing, authorship classification, textual data, feature space, multiclass classification



Citation of this article:

Arta Misini, Arbana Kadriu, Ercan Canhasi. Authorship classification techniques: Bridging textual domains and languages. International Journal on Information Technologies and Security, vol. 16, no. 1, 2024, pp. 27-38.