Izvestiya vuzov. Yadernaya Energetika

The peer-reviewed scientific and technology journal. ISSN: 0204-3327

Application of Machine Learning Methods for Filling and Updating Nuclear Knowledge Databases

12/14/2022 2022 - #04 Personnel training

Telnov V.P., Korovin Yu.A.

DOI: https://doi.org/10.26583/npe.2022.4.11

UDC: 004.8

The paper considers the design and creation of knowledge databases in the field of nuclear science and technology. It presents the results of a search for and study of algorithms suitable for classifying and semantically annotating textual web content, with the aim of computer-aided filling and updating of scalable semantic repositories (knowledge bases) on nuclear physics and nuclear power in Russian and English. The proposed algorithms will provide a methodological and technological basis for creating problem-oriented knowledge databases as artificial intelligence systems, as well as prerequisites for developing semantic technologies that acquire new knowledge via the Internet without direct human participation. The machine learning algorithms under investigation are tested by cross-validation on field-specific text corpora. The novelty of the study lies in applying the Pareto optimality principle to multi-criteria evaluation and ranking of these algorithms when no a priori information about the relative significance of the criteria is available. The project is implemented in accordance with Semantic Web standards (RDF, OWL, SPARQL, etc.). There are no technological limitations on integrating the resulting knowledge bases with third-party data repositories or with meta-search, library, reference and question-answering systems. The proposed software solutions are based on cloud computing and use the DBaaS and PaaS service models to keep data repositories and network services scalable. The software is publicly available and free to copy.
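The Pareto-based ranking mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dominates` and `pareto_layers`, the choice of two criteria (precision and recall), the classifier names, and all numeric scores are hypothetical and chosen only for illustration. The sketch assumes every criterion is to be maximized and ranks candidates by peeling off successive non-dominated fronts, which requires no weights on the criteria.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as good as b on every criterion
    and strictly better on at least one (all criteria are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_layers(scores: Dict[str, Sequence[float]]) -> List[List[str]]:
    """Rank candidates into successive Pareto fronts.

    Layer 0 holds the candidates that no other candidate dominates;
    layer 1 holds those that become non-dominated once layer 0 is removed,
    and so on. Names inside a layer are sorted alphabetically.
    """
    remaining = dict(scores)
    layers: List[List[str]] = []
    while remaining:
        front = sorted(
            name
            for name, s in remaining.items()
            if not any(
                dominates(t, s) for other, t in remaining.items() if other != name
            )
        )
        layers.append(front)
        for name in front:
            del remaining[name]
    return layers


# Hypothetical (precision, recall) pairs; these are NOT results from the paper.
example_scores = {
    "naive_bayes": (0.90, 0.70),
    "logistic_regression": (0.85, 0.80),
    "decision_tree": (0.80, 0.65),
    "knn": (0.70, 0.60),
}
layers = pareto_layers(example_scores)
for rank, front in enumerate(layers):
    print(rank, front)
```

In this toy data the first two candidates trade precision against recall, so neither dominates the other and both share the top layer. That is the situation in which a Pareto ranking is useful when the relative importance of the criteria is unknown.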

Keywords: semantic web, knowledge database, machine learning, classification, semantic annotation, cloud computing

Link for citing the article: Telnov V.P., Korovin Yu.A. Application of Machine Learning Methods for Filling and Updating Nuclear Knowledge Databases. Izvestiya vuzov. Yadernaya Energetika. 2022, no. 4, pp. 122-133; DOI: https://doi.org/10.26583/npe.2022.4.11 (in Russian).