The Two-Stage Algorithm for Extraction of the Significant Pharmaceutical Named Entities and Their Relations in the Russian-Language Reviews on Medications on Base of the XLM-RoBERTa Language Model

Moloshnikov, I.; Selivanov, A.; Rylkov, G.; Rybka, R.; Sboev, A.; Сбоев, Александр Георгиевич

dc.contributor.author	Moloshnikov, I.
dc.contributor.author	Selivanov, A.
dc.contributor.author	Rylkov, G.
dc.contributor.author	Rybka, R.
dc.contributor.author	Sboev, A.
dc.contributor.author	Сбоев, Александр Георгиевич
dc.date.accessioned	2024-12-26T10:53:03Z
dc.date.available	2024-12-26T10:53:03Z
dc.date.issued	2022
dc.description.abstract	© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG. The Internet contains a large amount of heterogeneous information, the extraction and structuring of which is currently a relevant task. This is especially relevant for tasks of social importance, in particular the analysis of the experience of using pharmaceutical products. In this paper, we propose a two-step sequential algorithm for extracting named entities and the relationships between them. Its creation was made possible by the availability of a marked-up corpus of Internet users’ reviews of medicines (Russian Drug Review Corpus). The basis of the algorithm is the language model XLM-RoBERTa-sag, which is pre-trained on a large corpus of unlabeled texts of reviews. The developed algorithm achieves the accuracy of identifying related entities: 71.6 and relations: 80.5, which is the first estimate of the accuracy of the solution of the considered problem on the Russian-language drug review texts.
dc.format.extent	P. 463-471
dc.identifier.citation	The Two-Stage Algorithm for Extraction of the Significant Pharmaceutical Named Entities and Their Relations in the Russian-Language Reviews on Medications on Base of the XLM-RoBERTa Language Model / Moloshnikov, I. [et al.] // Studies in Computational Intelligence. - 2022. - 1032 SCI. - P. 463-471. - 10.1007/978-3-030-96993-6_51
dc.identifier.doi	10.1007/978-3-030-96993-6_51
dc.identifier.uri	https://www.doi.org/10.1007/978-3-030-96993-6_51
dc.identifier.uri	https://www.scopus.com/record/display.uri?eid=2-s2.0-85127627366&origin=resultslist
dc.identifier.uri	http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=Alerting&SrcApp=Alerting&DestApp=WOS_CPL&DestLinkType=FullRecord&UT=WOS:000833484200051
dc.identifier.uri	https://openrepository.mephi.ru/handle/123456789/28967
dc.relation.ispartof	Studies in Computational Intelligence
dc.title	The Two-Stage Algorithm for Extraction of the Significant Pharmaceutical Named Entities and Their Relations in the Russian-Language Reviews on Medications on Base of the XLM-RoBERTa Language Model
dc.type	Conference Paper
dspace.entity.type	Publication
oaire.citation.volume	1032 SCI
relation.isAuthorOfPublication	fc2d63d7-5260-41ba-a952-0420c8848b13
relation.isAuthorOfPublication.latestForDiscovery	fc2d63d7-5260-41ba-a952-0420c8848b13
relation.isOrgUnitOfPublication	ba0b4738-e6bd-4285-bda5-16ab2240dbd1
relation.isOrgUnitOfPublication.latestForDiscovery	ba0b4738-e6bd-4285-bda5-16ab2240dbd1
