Publication:
Relation Extraction from Texts Containing Pharmacologically Significant Information on base of Multilingual Language Models

dc.contributor.author: Selivanov, A.
dc.contributor.author: Gryaznov, A.
dc.contributor.author: Rybka, R.
dc.contributor.author: Sboev, A.
dc.contributor.author: Сбоев, Александр Георгиевич
dc.date.accessioned: 2024-12-25T11:37:39Z
dc.date.available: 2024-12-25T11:37:39Z
dc.date.issued: 2022
dc.description.abstract: In this paper we estimate the accuracy of relation extraction from texts containing pharmacologically significant information, using the expanded version of the RDRS corpus, which contains Russian-language internet reviews of medications. Relation extraction accuracy was estimated and compared for two multilingual language models: XLM-RoBERTa-large and XLM-RoBERTa-large-sag. Earlier research showed XLM-RoBERTa-large-sag to be the most efficient language model for relation extraction on the previous version of the RDRS dataset when ground-truth named entity annotation was used. In the current work we use a two-step relation extraction approach: automatic named entity recognition followed by relation extraction on the predicted entities. This approach made it possible to estimate the accuracy of the proposed solution to the relation extraction problem as a whole, as well as the accuracy at each step of the analysis. As a result, it is shown that the multilingual XLM-RoBERTa-large-sag model achieves a macro-averaged F1-score for relation extraction of 86.4% on the ground-truth named entities and 60.1% on the predicted named entities, on the new version of the RDRS corpus, which contains more than 3800 annotated texts. Consequently, the implemented approach based on the XLM-RoBERTa-large-sag language model sets the state of the art for the considered type of texts in Russian.
dc.identifier.citation: Relation Extraction from Texts Containing Pharmacologically Significant Information on base of Multilingual Language Models / Selivanov, A. [et al.] // Proceedings of Science. - 2022. - 429. - 10.22323/1.429.0014
dc.identifier.doi: 10.22323/1.429.0014
dc.identifier.uri: https://www.doi.org/10.22323/1.429.0014
dc.identifier.uri: https://www.scopus.com/record/display.uri?eid=2-s2.0-85144631312&origin=resultslist
dc.identifier.uri: https://openrepository.mephi.ru/handle/123456789/27762
dc.relation.ispartof: Proceedings of Science
dc.title: Relation Extraction from Texts Containing Pharmacologically Significant Information on base of Multilingual Language Models
dc.type: Conference Paper
dspace.entity.type: Publication
oaire.citation.volume: 429
relation.isAuthorOfPublication: fc2d63d7-5260-41ba-a952-0420c8848b13
relation.isAuthorOfPublication.latestForDiscovery: fc2d63d7-5260-41ba-a952-0420c8848b13
relation.isOrgUnitOfPublication: ba0b4738-e6bd-4285-bda5-16ab2240dbd1
relation.isOrgUnitOfPublication.latestForDiscovery: ba0b4738-e6bd-4285-bda5-16ab2240dbd1