Publication: Agent data merging
Date
2020
Authors
Antonov, E.
Lopatina, E.
Ionkina, K.
Tretyakov, E.
Abstract
© 2020 The Authors. Published by Elsevier B.V. This article addresses data collection in a given subject area using agent-based technologies that draw on various Internet information sources, with the aim of obtaining reliable and up-to-date data. The agent-based approach is illustrated by collecting data on nuclear power plants operating worldwide. Three open information sources were selected for data extraction. These sources were analyzed, and the features of their data presentation structure were identified. The article describes the following tools for developing the software agents: browser control for simulating human behavior, HTML markup analysis using the XPath query language, and data extraction from PDF documents using regular expressions. The article also considers the software architecture and the database schema. As a result of running the software, data on 789 nuclear power plants were obtained.
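The extraction and merging steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the markup, the text layout, the field names, and every function name below are hypothetical, the browser-control step is omitted, and Python's standard-library ElementTree (which supports only a subset of XPath) stands in for whatever XPath engine the agents actually use.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for one source page.
PAGE = """
<html><body>
  <table id="reactors">
    <tr><th>Name</th><th>Country</th></tr>
    <tr><td>Example-1</td><td>Exampleland</td></tr>
    <tr><td>Example-2</td><td>Sampleland</td></tr>
  </table>
</body></html>
"""

# Hypothetical text as it might look after a PDF has been converted to text.
PDF_TEXT = (
    "Unit: Example-1  Net capacity: 950 MW\n"
    "Unit: Example-2  Net capacity: 1200 MW\n"
)

# Assumed line layout; a real document would need its own pattern.
CAPACITY_RE = re.compile(r"Unit:\s*(?P<name>\S+)\s+Net capacity:\s*(?P<mw>\d+)\s*MW")


def rows_from_html(page):
    """Select table rows with an XPath-style query and read their cells."""
    root = ET.fromstring(page.strip())
    records = []
    for row in root.findall(".//table[@id='reactors']/tr"):
        cells = [(td.text or "").strip() for td in row.findall("td")]
        if cells:  # the header row has <th> cells only, so it is skipped
            records.append({"name": cells[0], "country": cells[1]})
    return records


def capacities_from_text(text):
    """Map each unit name to its capacity using a regular expression."""
    return {m.group("name"): int(m.group("mw")) for m in CAPACITY_RE.finditer(text)}


def merge_records(html_rows, capacities):
    """Join the two sources on the unit name (an assumed merge key)."""
    return {
        row["name"]: {**row, "capacity_mw": capacities.get(row["name"])}
        for row in html_rows
    }


merged = merge_records(rows_from_html(PAGE), capacities_from_text(PDF_TEXT))
```

Keying the merge on a shared name keeps the sketch short; records from real sources would likely need name normalization before they could be matched this way.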
Citation
Agent data merging / Antonov, E. [et al.] // Procedia Computer Science. - 2020. - 169. - P. 473-478. - 10.1016/j.procs.2020.02.222