Publication:
To the question of data-driven identification of author's age for Russian texts with age deceptions using machine learning

Date
2019
Authors
Litvinova, T.
Sboev, A.
Rybka, R.
Moloshnikov, I.
Gudovskikh, D.
Organizational units
Institute of Nuclear Physics and Technologies (ИЯФиТ)
The goal of ИЯФиТ and its development strategy is to create and develop a world-class research and education center in nuclear physics and technologies, radiation materials science, elementary particle physics, astrophysics, and cosmophysics.
Abstract
© 2019 Published under licence by IOP Publishing Ltd. In this work we compare data-driven approaches to identifying an author's age from Russian texts that may contain age deception. The corpus was gathered specifically for this task through crowdsourcing. Two formulations of the problem are considered and compared: the first is the traditional task of identifying the age group of a text's author; the second is detecting whether age imitation occurs in the text and, if so, its type (imitating an older age or imitating a younger age). The best results were obtained by a LinearSVC model that takes a vector of TF-IDF features over character n-grams as input: an F1-score of about 80% for the second task and about 44% for the first.
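The abstract names the model family (LinearSVC over TF-IDF character n-gram features) but not its configuration, so the following is only a minimal sketch of such a pipeline in scikit-learn, not the authors' implementation. The toy texts, the label names, the n-gram range (1, 4), and macro-averaged F1 are all assumptions made for illustration. The crowdsourced corpus itself is not reproduced here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy examples standing in for the crowdsourced corpus.
texts = [
    "привет, сегодня в школе было очень весело!!",
    "мы с ребятами играли во дворе весь день",
    "уважаемые коллеги, направляю вам отчёт за квартал",
    "в связи с вышеизложенным прошу перенести заседание",
    "вчера ходил в магазин, купил хлеб и молоко",
    "погода хорошая, вечером поеду на дачу",
]
# Hypothetical label names for the second task (imitation occurrence and type).
labels = [
    "imitating_lower", "imitating_lower",
    "imitating_higher", "imitating_higher",
    "no_imitation", "no_imitation",
]


def build_model(ngram_range=(1, 4)):
    """TF-IDF over character n-grams feeding a linear SVM.

    The n-gram range is an assumed value, not one reported in the abstract.
    """
    return make_pipeline(
        TfidfVectorizer(analyzer="char", ngram_range=ngram_range),
        LinearSVC(),
    )


model = build_model()
model.fit(texts, labels)
predicted = model.predict(texts)

# F1 on the training texts only shows the metric call; it says nothing
# about generalization. Macro averaging is an assumption.
train_f1 = f1_score(labels, predicted, average="macro")
print(f"macro F1 on the toy training set: {train_f1:.2f}")
```

A meaningful estimate would require held-out data or cross-validation on the real corpus. The same pipeline could be applied to the first task by swapping the labels for age groups.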
Citation
To the question of data-driven identification of author's age for Russian texts with age deceptions using machine learning / Litvinova, T. [et al.] // Journal of Physics: Conference Series. - 2019. - 1205. - № 1. - 10.1088/1742-6596/1205/1/012049