Sobiad Atıf Dizini

Citation Number 1

Downloads 3

Article Detail

Text complexity and linguistic features: Their correlation in English and Russian

2022

Journal:

Vestnik Rossijskogo Universiteta Družby Narodov: Seriâ Lingvistika

Author:

Abstract:

Text complexity assessment is a challenging task that requires taking various linguistic aspects into consideration. The complexity level of a text should correspond to the reader's competence: an overly complicated text may be incomprehensible, whereas an overly simple one may be boring. For many years, readability was assessed with simple features, e.g. the average length of words and sentences or vocabulary variety. Thanks to the development of natural language processing methods, the set of text parameters used for evaluating readability has expanded significantly. In recent years, many studies have investigated the contribution of various lexical, morphological, and syntactic features to the readability level. Nevertheless, because the methods and corpora are quite diverse, it is hard to draw general conclusions about the effectiveness of linguistic information for evaluating text complexity. Moreover, the cross-lingual impact of different features on various datasets has not been investigated. The purpose of this study is to conduct a large-scale comparison of features of different nature. We experimentally assessed seven commonly used feature types (readability, traditional features, morphological features, punctuation, syntax, frequency, and topic modeling) on six corpora for text complexity assessment in English and Russian, employing four common machine learning models: logistic regression, random forest, convolutional neural network, and feedforward neural network. One of the corpora, a corpus of fiction read by Russian school students, was constructed for the experiment using a large-scale survey to ensure the objectivity of the labeling. We showed which feature types can significantly improve performance and analyzed their impact according to dataset characteristics, language, and data source.
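
As an illustration of the "simple features" the abstract mentions (average word length, average sentence length, vocabulary variety), a minimal Python sketch follows. It is not the authors' implementation: the function name `traditional_features`, the regex-based sentence splitting, and the use of a type-token ratio as the measure of vocabulary variety are all assumptions made here for the example.

```python
import re


def traditional_features(text: str) -> dict:
    """Hypothetical helper: three simple surface features of a text."""
    # Crude sentence split: break on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Words are runs of Unicode letters, so Latin and Cyrillic both work.
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not words or not sentences:
        return {"avg_word_len": 0.0, "avg_sent_len": 0.0, "type_token_ratio": 0.0}
    return {
        # Mean number of letters per word.
        "avg_word_len": sum(len(w) for w in words) / len(words),
        # Mean number of words per sentence.
        "avg_sent_len": len(words) / len(sentences),
        # Distinct words divided by total words (assumed variety measure).
        "type_token_ratio": len(set(words)) / len(words),
    }


if __name__ == "__main__":
    print(traditional_features("The cat sat. The dog ran fast!"))
```

A feature vector like this could be passed to any of the classifiers named in the abstract, such as logistic regression; the other feature types (morphological, syntactic, topic modeling) would need dedicated NLP tooling that this sketch does not cover.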

DOI:

10.22363/2687-0088-30132

Vestnik Rossijskogo Universiteta Družby Narodov: Seriâ Lingvistika

Field : Social Sciences, Humanities and Administrative Sciences

Journal Type : International

Metrics

Articles : 916

Citations : 2.148
