THE IMPACT OF TEXT REPRESENTATION AND PREPROCESSING ON AUTHOR IDENTIFICATION

2017

Journal: Eskişehir Technical University Journal of Science and Technology A - Applied Sciences and Engineering

Abstract:

Author identification, a popular topic in text classification and natural language processing, aims to determine the author of a given text through various analyses. The literature considers different text representation approaches and preprocessing steps for the author identification problem. This paper comprehensively examines the impact of text representation and preprocessing steps on author identification, specifically for the Turkish language. To this end, it investigates how all possible combinations of text representation approaches (unigram and bigram) and preprocessing tasks (stemming and stop-word removal) contribute to author identification performance. A new dataset was constructed for the experimental evaluation, and two classification algorithms, Multinomial Naive Bayes and Sequential Minimal Optimization, were employed. The experimental results reveal that using bigram features alone should be avoided. They also show that stop-words should be kept in the text, while stemming may be preferable depending on the classification algorithm, in order to achieve higher author identification performance.
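The experimental grid described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the stop-word list, the `toy_stem` suffix stripper, and the function and constant names are hypothetical placeholders, and treating "unigram+bigram" as a third representation is an assumption drawn from the phrase "all possible combinations".

```python
from itertools import product

# Hypothetical placeholders: a tiny stop-word list and a crude suffix list.
# Neither is the resource used in the paper.
STOP_WORDS = {"ve", "bir", "bu"}
SUFFIXES = ("ler", "lar")


def toy_stem(token):
    """Strip one plural suffix; a stand-in for a real Turkish stemmer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def extract_features(text, representation, stem, remove_stop_words):
    """Return the feature list for one preprocessing/representation setting."""
    tokens = text.split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    if stem:
        tokens = [toy_stem(t) for t in tokens]
    unigrams = tokens
    # Bigrams are built from adjacent tokens after preprocessing.
    bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    if representation == "unigram":
        return unigrams
    if representation == "bigram":
        return bigrams
    return unigrams + bigrams  # "unigram+bigram" (assumed third option)


# Every combination of representation, stemming, and stop-word removal.
REPRESENTATIONS = ("unigram", "bigram", "unigram+bigram")
CONFIGURATIONS = list(product(REPRESENTATIONS, (False, True), (False, True)))

if __name__ == "__main__":
    sample = "bu kitaplar ve yazarlar"
    for representation, stem, remove_stop_words in CONFIGURATIONS:
        features = extract_features(sample, representation, stem, remove_stop_words)
        print(representation, stem, remove_stop_words, features)
```

In a full experiment, each feature list would be turned into term counts and passed to a classifier such as Multinomial Naive Bayes; that step is left out here to keep the sketch dependency-free.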

Cited By

Note: This publication has not been cited.

Similar Articles

1. STEMMING IMPACT ANALYSIS ON INDONESIAN QURAN TRANSLATION AND THEIR TAFSIR CLASSIFICATION FOR ONTOLOGY INSTANCES. IIUM Engineering Journal, 2020.

2. Sentiment Analysis on Twitter Based on Ensemble of Psychological and Linguistic Feature Sets. Balkan Journal of Electrical and Computer Engineering, 2018.

3. IN-IDRIS: MODIFICATION OF IDRIS STEMMING ALGORITHM FOR INDONESIAN TEXT. IIUM Engineering Journal, 2022.

4. Development of the linguometric method for automatic identification of the author of text content based on statistical analysis of language diversity coefficients. Eastern-European Journal of Enterprise Technologies, 2018.

5. Development of a method for determining the keywords in the slavic language texts based on the technology of web mining. Eastern-European Journal of Enterprise Technologies, 2017.

6. Hybrid feature selection for text classification. Turkish Journal of Electrical Engineering and Computer Science, 2012.

Eskişehir Technical University Journal of Science and Technology A - Applied Sciences and Engineering

Field: Science and Mathematics; Engineering; Health Sciences

Journal Type: International

Metrics

Articles: 648

Citations: 332

2023 Impact: 0.038

DOI: 10.18038/aubtda.270276
