An international research group has shown that a new method makes the use of artificial intelligence safer. The results of the researchers' work were presented at EMNLP, the international conference on natural language processing, in Singapore by Danis Alukaev, a graduate student at Innopolis University and a winner of the UMNIK program.
Danis Alukaev became the winner of the UMNIK program in 2020 with the project "Development of a cloud platform for detecting pathologies in biomedical images using neural networks".
Modern deep learning models translate between the world's languages, give recommendations to users of streaming services and marketplaces, build knowledge graphs, generate images from text, and diagnose diseases from medical images. However, it is still difficult for a person to interpret the decisions these models make, which can slow down the adoption of AI technologies in critical areas such as medicine.
"Usually, the machine learning model is perceived as a black box: we provide some information at the "input", and we get the result at the "output". But in practice, it is important for decision-makers using AI services to understand the reasons why the result turned out this way. Most modern machine learning models are black boxes, they do not have mechanisms to explain the behavior of the model. Our scientific work and the results obtained have allowed us to get closer to understanding what is happening inside tools based on artificial intelligence," said Danis Alukaev, a student at Innopolis University.
According to the group of researchers from Russia, Denmark and the UK, explaining the decisions made by machine learning models will increase the trust of doctors and other specialists who work with artificial intelligence. For example, a radiologist will see not only the diagnosis made by the AI service, such as pneumonia, but also that the decision was based on the "ground glass" signs found in the analyzed X-ray image: hazy areas of increased density in the lung tissue.
One approach to making the decisions of deep learning models more interpretable is what scientists call the Concept Bottleneck Model. In this approach, the model first predicts a set of concepts describing abstractions understandable to humans: size, position, texture, color, shape. Then, based on this set of concepts, the model makes a prediction, for example, whether a pathology is visible in the X-ray or not; the authors call this the target prediction. The key idea is that, to explain the target prediction of a deep learning model, it is enough to look at the predicted concepts and, based on them, draw a conclusion about how reliable the prediction is.
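To make the idea concrete, here is a minimal sketch of a classic Concept Bottleneck Model in PyTorch. The names (ConceptBottleneckModel, concept_head, target_head) and the loss weighting are illustrative assumptions, not the architecture presented at EMNLP.

```python
import torch
import torch.nn as nn

class ConceptBottleneckModel(nn.Module):
    """Two-stage model: image -> concepts -> target prediction."""
    def __init__(self, backbone: nn.Module, feat_dim: int,
                 num_concepts: int = 8, num_classes: int = 2):
        super().__init__()
        self.backbone = backbone                                 # any image encoder returning (batch, feat_dim)
        self.concept_head = nn.Linear(feat_dim, num_concepts)    # predicts human-readable concepts
        self.target_head = nn.Linear(num_concepts, num_classes)  # diagnosis depends only on the concepts

    def forward(self, x):
        features = self.backbone(x)
        concepts = torch.sigmoid(self.concept_head(features))    # per-concept "is it present?" score
        target_logits = self.target_head(concepts)               # target prediction through the bottleneck
        return concepts, target_logits

# Training typically combines two losses: one on the concepts (when concept
# labels are available) and one on the final diagnosis.
def cbm_loss(concepts, target_logits, concept_labels, target_labels, alpha=1.0):
    concept_loss = nn.functional.binary_cross_entropy(concepts, concept_labels)
    target_loss = nn.functional.cross_entropy(target_logits, target_labels)
    return target_loss + alpha * concept_loss
```

Because the final classifier sees only the concept scores, a reader can inspect those scores to understand which concepts drove the target prediction.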
"With this approach, it is necessary to choose a set of concepts in advance. But questions arise: what data should be used to train models and how to mark up training examples, because markup takes a lot of time and is done manually by a person? For our experiments, our team used 18,620 X-ray images of various organs, medical annotations, markings of pathologies and symptoms, which we used in other AI studies in this area. The uniqueness of medical data is that they store a lot of both images and text descriptions — the conclusions of radiologists. We have developed an approach where text descriptions are used instead of a set of concepts — this fundamentally distinguishes our method from existing ones," said Ilya Pershin, Head of the Laboratory of Artificial Intelligence in Medicine at Innopolis University.
Ilya Pershin, as a student at Kazan (Volga Region) Federal University, also won the UMNIK program in 2021 with the project "Development of a software package for high-performance modeling of antennas of complex geometry in the field of telecommunications systems."
The researchers found that when images and text are used together, the model learns and generalizes patterns better, so it remains more robust to attacks by malicious users. In addition, the proposed method does not require manual annotation of concepts: they are extracted automatically by the model during training, which makes it possible to obtain an optimal set of concepts without spending human effort on routine annotation.
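The sketch below illustrates the general idea under stated assumptions: concepts predicted from the image are aligned during training with concepts derived from the radiologist's report, so no manual concept labels are needed, and at inference only the image is required. The encoders, the alignment loss, and all names here are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextGuidedConceptModel(nn.Module):
    """Concept bottleneck whose concept supervision comes from text, not manual labels."""
    def __init__(self, image_encoder: nn.Module, text_encoder: nn.Module,
                 img_dim: int, txt_dim: int, num_concepts: int, num_classes: int):
        super().__init__()
        self.image_encoder = image_encoder                       # e.g. a CNN over X-ray images
        self.text_encoder = text_encoder                         # e.g. a transformer over radiology reports
        self.img_to_concepts = nn.Linear(img_dim, num_concepts)
        self.txt_to_concepts = nn.Linear(txt_dim, num_concepts)
        self.classifier = nn.Linear(num_concepts, num_classes)   # diagnosis still passes through concepts

    def forward(self, image, report=None):
        img_concepts = torch.sigmoid(self.img_to_concepts(self.image_encoder(image)))
        target_logits = self.classifier(img_concepts)
        if report is None:                                       # at inference only the image is needed
            return img_concepts, target_logits
        txt_concepts = torch.sigmoid(self.txt_to_concepts(self.text_encoder(report)))
        return img_concepts, txt_concepts, target_logits

def training_loss(img_concepts, txt_concepts, target_logits, labels, alpha=1.0):
    # Align image-derived concepts with text-derived ones (no manual concept labels),
    # and supervise the diagnosis with an ordinary classification loss.
    alignment = F.mse_loss(img_concepts, txt_concepts)
    task = F.cross_entropy(target_logits, labels)
    return task + alpha * alignment
```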
As a reminder, in 2022 a regional office of the Innovation Assistance Fund was opened at the My Business Center of the Republic of Tatarstan. The My Business Center coordinates the implementation of each winning project of the Student Startup competition with enterprises, universities, ministries and agencies of the Republic of Tatarstan, with a view to the further commercialization of technological projects within the national project "Small and Medium-sized Entrepreneurship".