Publication: The Two-Stage Algorithm for Extraction of the Significant Pharmaceutical Named Entities and Their Relations in the Russian-Language Reviews on Medications on Base of the XLM-RoBERTa Language Model
| dc.contributor.author | Moloshnikov, I. | |
| dc.contributor.author | Selivanov, A. | |
| dc.contributor.author | Rylkov, G. | |
| dc.contributor.author | Rybka, R. | |
| dc.contributor.author | Sboev, A. | |
| dc.contributor.author | Сбоев, Александр Георгиевич | |
| dc.date.accessioned | 2024-12-26T10:53:03Z | |
| dc.date.available | 2024-12-26T10:53:03Z | |
| dc.date.issued | 2022 | |
| dc.description.abstract | © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG. The Internet contains a large amount of heterogeneous information, and extracting and structuring it is a relevant task. This is especially true for socially important applications, in particular the analysis of people’s experience of using pharmaceutical products. In this paper, we propose a two-stage sequential algorithm for extracting named entities and the relations between them. Its creation was made possible by the availability of an annotated corpus of Internet users’ reviews of medicines (Russian Drug Review Corpus). The algorithm is based on the XLM-RoBERTa-sag language model, which is pre-trained on a large corpus of unlabeled review texts. The developed algorithm achieves an accuracy of 71.6 for identifying related entities and 80.5 for relations, which is the first accuracy estimate for this problem on Russian-language drug review texts. | |
| dc.format.extent | P. 463-471 | |
| dc.identifier.citation | The Two-Stage Algorithm for Extraction of the Significant Pharmaceutical Named Entities and Their Relations in the Russian-Language Reviews on Medications on Base of the XLM-RoBERTa Language Model / Moloshnikov, I. [et al.] // Studies in Computational Intelligence. - 2022. - 1032 SCI. - P. 463-471. - 10.1007/978-3-030-96993-6_51 | |
| dc.identifier.doi | 10.1007/978-3-030-96993-6_51 | |
| dc.identifier.uri | https://www.doi.org/10.1007/978-3-030-96993-6_51 | |
| dc.identifier.uri | https://www.scopus.com/record/display.uri?eid=2-s2.0-85127627366&origin=resultslist | |
| dc.identifier.uri | http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=Alerting&SrcApp=Alerting&DestApp=WOS_CPL&DestLinkType=FullRecord&UT=WOS:000833484200051 | |
| dc.identifier.uri | https://openrepository.mephi.ru/handle/123456789/28967 | |
| dc.relation.ispartof | Studies in Computational Intelligence | |
| dc.title | The Two-Stage Algorithm for Extraction of the Significant Pharmaceutical Named Entities and Their Relations in the Russian-Language Reviews on Medications on Base of the XLM-RoBERTa Language Model | |
| dc.type | Conference Paper | |
| dspace.entity.type | Publication | |
| oaire.citation.volume | 1032 SCI | |
| relation.isAuthorOfPublication | fc2d63d7-5260-41ba-a952-0420c8848b13 | |
| relation.isAuthorOfPublication.latestForDiscovery | fc2d63d7-5260-41ba-a952-0420c8848b13 | |
| relation.isOrgUnitOfPublication | ba0b4738-e6bd-4285-bda5-16ab2240dbd1 | |
| relation.isOrgUnitOfPublication.latestForDiscovery | ba0b4738-e6bd-4285-bda5-16ab2240dbd1 |