UDC 512.5:004.827
DOI: 10.36871/26189976.2026.02-2.008

Authors

Yuri Vladislavovich Trofimov,
Joint Institute for Nuclear Research, Dubna, Russia; Dubna State University, Dubna, Russia
Alexey Nikolaevich Averkin,
Dubna State University, Dubna, Russia
Alexander Dmitrievich Lebedev,
Dubna State University, Dubna, Russia
Mikhail Dmitrievich Lebedev,
National University of Science and Technology MISIS, Moscow, Russia
Egor Mikhailovich Kuznetsov,
Dubna State University, Dubna, Russia
Ilya Matveevich Golubev,
Dubna State University, Dubna, Russia

Abstract

This paper presents a scoping review of approaches aimed at improving the reliability of machine-learning explanations through ensembling, aggregation, and fusion, and systematizes explanation quality metrics and the sources of explanation variability. The study follows a scoping-review format and is reported in accordance with PRISMA-ScR. The search covered publications from 2015 to 2026 in Russian and English in Scopus, Web of Science, IEEE Xplore, ACM Digital Library, OpenAlex, arXiv, OpenReview, and CyberLeninka, supplemented by citation chasing from key works. The final corpus comprised 106 sources. We show that explanation variability arises from differences between explainers, from method stochasticity and settings, from model uncertainty, and from data effects. Three levels of ensembling and several classes of aggregators are examined, including rank-based, confidence-weighted, and probabilistic schemes; for heterogeneous representations, fusion and the assessment of cross-modal agreement are considered. We discuss the risks of false consensus, manipulation, and adversarial fragility, as well as computational constraints. Engineering recommendations for building an explanation pipeline with uncertainty quantification are formulated, and directions toward formal guarantees and standardized protocols are outlined.
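The core ideas summarized above, ensembling several explainers, rank-based aggregation, and measuring inter-explainer agreement, can be illustrated with a minimal sketch. The toy attribution values, the Borda-style averaging, and all function names below are illustrative assumptions for this review's themes (cf. Kendall's tau and top-k overlap in the surveyed metrics), not a procedure prescribed by any of the reviewed works:

```python
import statistics

def rank(scores):
    # Rank features by descending |attribution|; rank 1 = most important.
    # Ties are broken by original feature order (stable sort).
    order = sorted(range(len(scores)), key=lambda i: -abs(scores[i]))
    ranks = [0] * len(scores)
    for position, feature in enumerate(order):
        ranks[feature] = position + 1
    return ranks

def borda_aggregate(attributions):
    # Rank-based (Borda-style) aggregation: average each feature's rank
    # across explainers; a lower aggregated rank means more important.
    per_explainer = [rank(a) for a in attributions]
    n_features = len(per_explainer[0])
    return [statistics.mean(r[f] for r in per_explainer)
            for f in range(n_features)]

def kendall_tau(r1, r2):
    # Pairwise rank agreement in [-1, 1]; 1 means identical orderings.
    n = len(r1)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (r1[i] - r1[j]) * (r2[i] - r2[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

def topk_jaccard(r1, r2, k):
    # Overlap of the top-k feature sets of two rankings, in [0, 1].
    top1 = {f for f, r in enumerate(r1) if r <= k}
    top2 = {f for f, r in enumerate(r2) if r <= k}
    return len(top1 & top2) / len(top1 | top2)

# Hypothetical attributions for 5 features from three explainers.
attributions = [
    [0.9, 0.1, 0.4, -0.3, 0.0],
    [0.8, 0.2, 0.5, -0.2, 0.1],
    [0.7, 0.0, 0.6, -0.4, 0.1],
]
ranks = [rank(a) for a in attributions]
aggregated = borda_aggregate(attributions)   # feature 0 stays on top
agreement = kendall_tau(ranks[0], ranks[1])  # high but imperfect agreement
```

Low pairwise agreement (a small tau or top-k overlap) is precisely the disagreement signal that, in the spirit of the review, should be surfaced to the user alongside the aggregate rather than hidden by it.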

Keywords

explainable artificial intelligence (XAI)
explainability
explanation ensembling
aggregation
uncertainty quantification
robustness
data distribution shift
adversarial attacks

Acknowledgments: this work was carried out under the state assignment of the Ministry of Science and Higher Education of the Russian Federation (topic No. 124112200072-2).

References

[1] Trofimov Yu. V., Averkin A. N. The Relationship between Trusted Artificial Intelligence and XAI 2.0: Theory and Frameworks // Soft Measurements and Computing. — 2025. — № 5. — P. 68–84. — DOI: 10.36871/2618-9976.2025.05.006. (In Russian)

[2] Aas K., Jullum M., Løland A. Explaining Individual Predictions when Features are Dependent: More Accurate Approximations to Shapley Values // Artificial Intelligence. — 2021. — Vol. 298. — Art. 103502. — DOI: 10.1016/j.artint.2021.103502.

[3] Achanta R., Shaji A., Smith K., Lucchi A., Fua P., Süsstrunk S. SLIC Superpixels Compared to State-of-the-Art Superpixel Methods // IEEE Transactions on Pattern Analysis and Machine Intelligence. — 2012. — Vol. 34, № 11. — P. 2274–2282. — DOI: 10.1109/TPAMI.2012.120.

[4] Adadi A., Berrada M. Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI) // IEEE Access. — 2018. — Vol. 6. — P. 52138–52160. — DOI: 10.1109/ACCESS.2018.2870052.

[5] Adebayo J., Gilmer J., Muelly M., Goodfellow I., Hardt M., Kim B. Sanity Checks for Saliency Maps // Advances in Neural Information Processing Systems (NeurIPS). — 2018. — Vol. 31. — P. 9505–9515.

[6] Alvarez-Melis D., Jaakkola T. S. Towards Robust Interpretability with Self-Explaining Neural Networks // Advances in Neural Information Processing Systems (NeurIPS). — 2018. — Vol. 31. — P. 7786–7795.

[7] Ancona M., Ceolini E., Öztireli C., Gross M. Towards Better Understanding of Gradient-Based Attribution Methods for Deep Neural Networks [Electronic resource] // Proceedings of the 6th International Conference on Learning Representations (ICLR). — 2018. — URL: https://openreview.net/forum?id=Sy21R9JAW (accessed: 16.02.2026).

[8] Bach S., Binder A., Montavon G., Klauschen F., Müller K.-R., Samek W. On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation // PLOS ONE. — 2015. — Vol. 10, № 7. — e0130140. — DOI: 10.1371/journal.pone.0130140.

[9] Barredo Arrieta A., Díaz-Rodríguez N., Del Ser J., Bennetot A., Tabik S., Barbado A., García S., Gil-López S., Molina D., Benjamins R., Chatila R., Herrera F. Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges Toward Responsible AI // Information Fusion. — 2020. — Vol. 58. — P. 82–115. — DOI: 10.1016/j.inffus.2019.12.012.

[10] Behzadian M., Khanmohammadi Otaghsara S., Yazdani M., Ignatius J. A State-of-the-Art Survey of TOPSIS Applications // Expert Systems with Applications. — 2012. — Vol. 39, № 17. — P. 13051–13069. — DOI: 10.1016/j.eswa.2012.05.056.

[11] Bernasconi M., Choirat C., Seri R. The Analytic Hierarchy Process and the Theory of Measurement // Management Science. — 2010. — Vol. 56, № 4. — P. 699–711. — DOI: 10.1287/mnsc.1090.1118.

[12] Bhatt U., Antorán J., Zhang Y., Liao Q. V., Sattigeri P., Fogliato R., Melançon G., Krishnan R., Stanley J., Tickoo O., Nachman L., Chunara R., Srikumar M., Weller A., Xiang A. Uncertainty as a Form of Transparency: Measuring, Communicating, and Using Uncertainty // Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society (AIES). — 2021. — P. 401–413. — DOI: 10.1145/3461702.3462571.

[13] Bhatt U., Xiang A., Sharma S., Weller A., Taly A., Jia Y., Ghosh J., Puri R., Moura J. M. F., Eckersley P. Explainable Machine Learning in Deployment [Electronic resource] // Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (FAccT ’20). — 2020. — P. 648–657. — DOI: 10.1145/3351095.3375624. — URL: https://arxiv.org/abs/1909.06342 (accessed: 18.02.2026).

[14] Biecek P. DALEX: Explainers for Complex Predictive Models in R // Journal of Machine Learning Research. — 2018. — Vol. 19, № 84. — P. 1–5. — URL: https://jmlr.org/papers/v19/18-416.html (accessed: 18.02.2026).

[15] Black E., Zhu H., Hajishirzi H., Choi Y. Model Multiplicity: Opportunities, Concerns, and Solutions // Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT). — 2022. — P. 850–863. — DOI: 10.1145/3531146.3533149.

[16] Boehmer N., Bredereck R., Peters D. Rank Aggregation Using Scoring Rules [Electronic resource] // Proceedings of the AAAI Conference on Artificial Intelligence. — 2022. — Vol. 36, № 5. — P. 5389–5396. — URL: https://ojs.aaai.org/index.php/AAAI/article/view/25685 (accessed: 18.02.2026).

[17] Brans J.-P., Vincke Ph. How to Select and How to Rank Projects: The PROMETHEE Method // European Journal of Operational Research. — 1985. — Vol. 24, № 2. — P. 228–238.

[18] Chen C., Li O., Tao C., Barnett A. J., Su J., Rudin C. This Looks Like That: Deep Learning for Interpretable Image Recognition // Advances in Neural Information Processing Systems (NeurIPS). — 2019. — Vol. 32. — P. 8930–8941.

[19] Chen H., Janizek J. D., Lundberg S., Lee S.-I. True to the Model or True to the Data? [Electronic resource]. — arXiv:2006.16234, 2020. — URL: https://arxiv.org/abs/2006.16234 (accessed: 16.02.2026).

[20] Conitzer V., Sandholm T. Common Voting Rules as Maximum Likelihood Estimators // Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI-05). — 2005. — P. 145–152.

[21] Covert I., Lee S.-I. Improving KernelSHAP: Practical Shapley Value Estimation Using Linear Regression // Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS). — 2021. — Vol. 130. — P. 3457–3465.

[22] Covert I., Lundberg S. M., Lee S.-I. Understanding Global Feature Contributions With Additive Importance Measures // Advances in Neural Information Processing Systems (NeurIPS). — 2020. — Vol. 33. — P. 17212–17223.

[23] Dabkowski P., Gal Y. Real Time Image Saliency for Black Box Classifiers // Advances in Neural Information Processing Systems (NeurIPS). — 2017. — Vol. 30. — P. 6967–6976.

[24] Das A., Rad P. Opportunities and Challenges in Explainable Artificial Intelligence (XAI): A Survey [Electronic resource]. — arXiv:2006.11371, 2020. — URL: https://arxiv.org/abs/2006.11371 (accessed: 16.02.2026).

[25] DiCiccio T. J., Efron B. Bootstrap Confidence Intervals // Statistical Science. — 1996. — Vol. 11, № 3. — P. 189–228.

[26] Dietterich T. G. Ensemble Methods in Machine Learning // Multiple Classifier Systems. — Berlin: Springer, 2000. — P. 1–15.

[27] Dietz K., Fischer D., Scheid M., Müller M. Agree to Disagree: Exploring Consensus of XAI Methods for ML-based NIDS [Electronic resource] // Proceedings of the International Conference on Network and Service Management (CNSM). — 2024. — URL: https://dl.ifip.org/db/conf/cnsm/cnsm2024/1571071674.pdf (accessed: 18.02.2026).

[28] Dombrowski A.-K., Alber M., Anders C. J., Ackermann M., Müller K.-R., Kessel P. Explanations Can Be Manipulated and Geometry Is to Blame // Advances in Neural Information Processing Systems (NeurIPS). — 2019. — Vol. 32. — P. 13589–13600.

[29] Dong X., Yu Z., Cao W., Shi Y., Ma Q. A Survey on Ensemble Learning // Frontiers of Computer Science. — 2020. — Vol. 14, № 2. — P. 241–258.

[30] Doshi-Velez F., Kim B. Towards A Rigorous Science of Interpretable Machine Learning [Electronic resource]. — arXiv:1702.08608, 2017. — URL: https://arxiv.org/abs/1702.08608 (accessed: 16.02.2026).

[31] Dwork C., Kumar R., Naor M., Sivakumar D. Rank Aggregation Methods for the Web // Proceedings of the 10th International Conference on World Wide Web (WWW). — 2001. — P. 613–622. — DOI: 10.1145/371920.372165.

[32] Efron B. Bootstrap Methods: Another Look at the Jackknife // Annals of Statistics. — 1979. — Vol. 7, № 1. — P. 1–26. — DOI: 10.1214/aos/1176344552.

[33] Fagin R., Kumar R., Sivakumar D. Comparing Top-k Lists // SIAM Journal on Discrete Mathematics. — 2003. — Vol. 17, № 1. — P. 134–160. — DOI: 10.1137/S0895480102412856.

[34] Finlayson S. G., Bowers J. D., Ito J., Zittrain J. L., Beam A. L., Kohane I. S. Adversarial Attacks on Medical Machine Learning // Science. — 2019. — Vol. 363, № 6433. — P. 1287–1289. — DOI: 10.1126/science.aaw4399.

[35] Fisher A., Rudin C., Dominici F. All Models are Wrong, but Many are Useful: Learning a Variable’s Importance by Studying an Entire Class of Prediction Models Simultaneously // Journal of Machine Learning Research. — 2019. — Vol. 20, № 177. — P. 1–81.

[36] Fong R. C., Vedaldi A. Interpretable Explanations of Black Boxes by Meaningful Perturbation // Proceedings of the IEEE International Conference on Computer Vision (ICCV). — 2017. — P. 3429–3437. — DOI: 10.1109/ICCV.2017.371.

[37] Gal Y., Ghahramani Z. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning // Proceedings of the 33rd International Conference on Machine Learning (ICML). — 2016. — Vol. 48. — P. 1050–1059.

[38] Gawlikowski J., Tassi C. R. N., Ali M., Lee J., Humt M., Feng J., Kruspe A., Triebel R., Jung P., Roscher R., Shahzad M., Yang W., Bamler R., Zhu X. X. A Survey of Uncertainty in Deep Neural Networks // Artificial Intelligence Review. — 2023. — Vol. 56. — P. 1513–1589. — DOI: 10.1007/s10462-023-10562-9.

[39] Ghorbani A., Abid A., Zou J. Interpretation of Neural Networks is Fragile // Proceedings of the AAAI Conference on Artificial Intelligence. — 2019. — Vol. 33, № 1. — P. 3681–3688. — DOI: 10.1609/aaai.v33i01.33013681.

[40] Goodfellow I. J., Shlens J., Szegedy C. Explaining and Harnessing Adversarial Examples [Electronic resource]. — arXiv:1412.6572, 2014. — URL: https://arxiv.org/abs/1412.6572 (accessed: 18.02.2026).

[41] Guidotti R., Monreale A., Ruggieri S., Turini F., Giannotti F., Pedreschi D. A Survey of Methods for Explaining Black Box Models // ACM Computing Surveys. — 2019. — Vol. 51, № 5. — Art. 93. — DOI: 10.1145/3236009.

[42] Hoeting J. A., Madigan D., Raftery A. E., Volinsky C. T. Bayesian Model Averaging: A Tutorial // Statistical Science. — 1999. — Vol. 14, № 4. — P. 382–417. — DOI: 10.1214/ss/1009212519.

[43] Holzinger A., Langs G., Denk H., Zatloukal K., Müller H. Causability and Explainability of Artificial Intelligence in Medicine // WIREs Data Mining and Knowledge Discovery. — 2019. — Vol. 9, № 4. — e1312.

[44] Hooker S., Erhan D., Kindermans P.-J., Kim B. A Benchmark for Interpretability Methods in Deep Neural Networks // Advances in Neural Information Processing Systems (NeurIPS). — 2019. — Vol. 32. — P. 9737–9748.

[45] Hryniewska-Guzik W., Sawicki B., Biecek P. NormEnsembleXAI: Unveiling the Strengths and Weaknesses of XAI Ensemble Techniques [Electronic resource]. — arXiv:2401.17200, 2024. — URL: https://arxiv.org/abs/2401.17200 (accessed: 16.02.2026).

[46] Hüllermeier E., Waegeman W. Aleatoric and Epistemic Uncertainty in Machine Learning: An Introduction to Concepts and Methods // Machine Learning. — 2021. — Vol. 110, № 3. — P. 457–506. — DOI: 10.1007/s10994-021-05946-3.

[47] Hwang C.-L., Yoon K. Multiple Attribute Decision Making: Methods and Applications — A State-of-the-Art Survey. — Berlin: Springer, 1981. — 269 p. — (Lecture Notes in Economics and Mathematical Systems; Vol. 186). — ISBN 978-3-540-10558-9.

[48] ISO/IEC 23894:2023. Information Technology — Artificial Intelligence — Guidance on Risk Management [Electronic resource]. — 1st ed. — 2023. — URL: https://www.iso.org/standard/77304.html (accessed: 18.02.2026).

[49] ISO/IEC TR 24029-1:2021. Artificial Intelligence (AI) — Assessment of the Robustness of Neural Networks — Part 1: Overview [Electronic resource]. — 2021. — URL: https://www.iso.org/standard/77609.html (accessed: 18.02.2026).

[50] Janizek J. D., Sturmfels P., Lee S.-I. Explaining Explanations: Axiomatic Feature Interactions for Deep Networks // Journal of Machine Learning Research. — 2021. — Vol. 22, № 104. — P. 1–54. — URL: https://jmlr.org/papers/v22/20-1223.html (accessed: 18.02.2026).

[51] Kendall A., Gal Y. What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? // Advances in Neural Information Processing Systems (NeurIPS). — 2017. — Vol. 30. — P. 5574–5584.

[52] Kendall M. G. A New Measure of Rank Correlation // Biometrika. — 1938. — Vol. 30, № 1–2. — P. 81–93. — DOI: 10.1093/biomet/30.1-2.81.

[53] Koh P. W., Liang P. Understanding Black-box Predictions via Influence Functions // Proceedings of the 34th International Conference on Machine Learning (ICML). — 2017. — Vol. 70. — P. 1885–1894.

[54] Krishna S., Han T., Gu A., Pombra J., Jabbari S., Wu S., Lakkaraju H. The Disagreement Problem in Explainable Machine Learning: A Practitioner’s Perspective [Electronic resource]. — arXiv:2202.01602, 2022. — URL: https://arxiv.org/abs/2202.01602 (accessed: 16.02.2026).

[55] Lakshminarayanan B., Pritzel A., Blundell C. Simple and Scalable Predictive Uncertainty Estimation Using Deep Ensembles // Advances in Neural Information Processing Systems (NeurIPS). — 2017. — Vol. 30. — P. 6402–6413.

[56] Lapuschkin S., Wäldchen S., Binder A., Montavon G., Samek W., Müller K.-R. Unmasking Clever Hans Predictors and Assessing What Machines Really Learn // Nature Communications. — 2019. — Vol. 10. — Art. 1096. — DOI: 10.1038/s41467-019-08987-4.

[57] Li O., Liu H., Chen C., Rudin C. Deep Learning for Case-Based Reasoning Through Prototypes: A Neural Network That Explains Its Predictions // Proceedings of the AAAI Conference on Artificial Intelligence. — 2018. — Vol. 32, № 1. — P. 3530–3537.

[58] Lipton Z. C. The Mythos of Model Interpretability // Communications of the ACM. — 2018. — Vol. 61, № 10. — P. 36–43. — DOI: 10.1145/3233231.

[59] Longo L., Brcic M., Cabitza F. [et al.]. Explainable Artificial Intelligence (XAI) 2.0: A Manifesto of Open Challenges and Interdisciplinary Research Directions // Information Fusion. — 2024. — Vol. 106. — Art. 102301. — DOI: 10.1016/j.inffus.2024.102301.

[60] Lundberg S. M., Erion G., Chen H., DeGrave A., Prutkin J. M., Nair B., Katz R., Himmelfarb J., Bansal N., Lee S.-I. From Local Explanations to Global Understanding with Explainable AI for Trees // Nature Machine Intelligence. — 2020. — Vol. 2, № 1. — P. 56–67. — DOI: 10.1038/s42256-019-0138-9.

[61] Lundberg S. M., Lee S.-I. A Unified Approach to Interpreting Model Predictions // Advances in Neural Information Processing Systems (NeurIPS). — 2017. — Vol. 30. — P. 4765–4774.

[62] Madry A., Makelov A., Schmidt L., Tsipras D., Vladu A. Towards Deep Learning Models Resistant to Adversarial Attacks [Electronic resource]. — arXiv:1706.06083, 2017. — URL: https://arxiv.org/abs/1706.06083 (accessed: 18.02.2026).

[63] Marx C., Calmon F., Ustun B. Predictive Multiplicity in Classification // Proceedings of the 37th International Conference on Machine Learning (ICML). — 2020. — Vol. 119. — P. 6765–6774.

[64] Mehrabi N., Morstatter F., Saxena N., Lerman K., Galstyan A. A Survey on Bias and Fairness in Machine Learning [Electronic resource] // ACM Computing Surveys. — 2021. — Vol. 54, № 6. — Art. 115. — DOI: 10.1145/3457607. — URL: https://arxiv.org/abs/1908.09635 (accessed: 18.02.2026).

[65] Miller T. Explanation in Artificial Intelligence: Insights from the Social Sciences // Artificial Intelligence. — 2019. — Vol. 267. — P. 1–38. — DOI: 10.1016/j.artint.2018.07.007.

[66] Mohseni S., Zarei N., Ragan E. D. A Multidisciplinary Survey and Framework for Design and Evaluation of Explainable AI Systems // ACM Transactions on Interactive Intelligent Systems. — 2021. — Vol. 11, № 3–4. — Art. 24. — DOI: 10.1145/3387166.

[67] Molnar C. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable [Electronic resource]. — 2nd ed. — 2022. — URL: https://christophm.github.io/interpretable-ml-book/ (accessed: 15.02.2026).

[68] Montavon G., Samek W., Müller K.-R. Methods for Interpreting and Understanding Deep Neural Networks // Digital Signal Processing. — 2018. — Vol. 73. — P. 1–15. — DOI: 10.1016/j.dsp.2017.10.011.

[69] Nauta M., Trienes J., Pathak S., Nguyen E., Peters M., Schmitt Y., Schlötterer J., van Keulen M., Seifert C. From Anecdotal Evidence to Quantitative Evaluation Methods: A Systematic Review on Evaluating Explainable AI // ACM Computing Surveys. — 2023. — Vol. 55, № 13s. — Art. 295. — DOI: 10.1145/3570630.

[70] NIST. Artificial Intelligence Risk Management Framework (AI RMF 1.0) [Electronic resource]. — NIST AI 100-1. — January 2023. — URL: https://nvlpubs.nist.gov/nistpubs/ai/nist.ai.100-1.pdf (accessed: 18.02.2026).

[71] NISTIR 8269. A Taxonomy and Terminology of Adversarial Machine Learning [Electronic resource]. — National Institute of Standards and Technology, 2020. — DOI: 10.6028/NIST.IR.8269. — URL: https://csrc.nist.gov/Pubs/ir/8269/IPD (accessed: 18.02.2026).

[72] Nüsken N., Richter L. VarGrad: A Low-Variance Gradient Estimator for Variational Inference // Advances in Neural Information Processing Systems (NeurIPS). — 2020. — Vol. 33. — P. 13481–13492.

[73] Oliveira R. I., Resende L. Trimmed Sample Means for Robust Uniform Mean Estimation and Regression [Electronic resource]. — arXiv:2302.06710, 2023. — URL: https://arxiv.org/abs/2302.06710 (accessed: 18.02.2026).

[74] Ovadia Y., Fertig E., Ren J., Nado Z., Sculley D., Nowozin S., Dillon J. V., Lakshminarayanan B., Snoek J. Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty under Dataset Shift [Electronic resource] // Advances in Neural Information Processing Systems (NeurIPS). — 2019. — Vol. 32. — P. 13991–14002. — URL: https://arxiv.org/abs/1906.02530 (accessed: 18.02.2026).

[75] Petković M., Kocev D., Džeroski S. Fuzzy Jaccard Index: A Robust Comparison of Ordered Lists // Knowledge-Based Systems. — 2020. — Vol. 204. — Art. 106171. — DOI: 10.1016/j.knosys.2020.106171.

[76] Petsiuk V., Das A., Saenko K. RISE: Randomized Input Sampling for Explanation of Black-box Models // Proceedings of the British Machine Vision Conference (BMVC). — 2018. — P. 151.

[77] Pirie C., Wiratunga N., Wijekoon A., Moreno-García C. F. AGREE: A Feature Attribution Aggregation Framework to Address Explainer Disagreements with Alignment Metrics // Expert Systems with Applications. — 2023. — Vol. 215. — Art. 119344. — DOI: 10.1016/j.eswa.2022.119344.

[78] Reingold O., Shen J. H., Talati A. Dissenting Explanations: Leveraging Disagreement to Reduce Model Overreliance [Electronic resource]. — arXiv:2307.07636, 2023. — URL: https://arxiv.org/abs/2307.07636 (accessed: 18.02.2026).

[79] Ribeiro M. T., Singh S., Guestrin C. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier // Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16). — San Francisco, CA, USA, 2016. — P. 1135–1144. — DOI: 10.1145/2939672.2939778.

[80] Ribeiro M. T., Singh S., Guestrin C. Anchors: High-Precision Model-Agnostic Explanations // Proceedings of the AAAI Conference on Artificial Intelligence. — 2018. — Vol. 32, № 1. — P. 1527–1535.

[81] Rousseeuw P. J., Leroy A. M. Robust Regression and Outlier Detection. — New York: John Wiley & Sons, 1987. — 329 p. — ISBN 978-0-471-85233-9.

[82] Roy B. The Outranking Approach and the Foundations of ELECTRE Methods // Theory and Decision. — 1991. — Vol. 31, № 1. — P. 49–73.

[83] Rudin C. Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead // Nature Machine Intelligence. — 2019. — Vol. 1, № 5. — P. 206–215. — DOI: 10.1038/s42256-019-0048-x.

[84] Saaty T. L. The Analytic Hierarchy Process. — New York: McGraw-Hill, 1980. — 287 p. — ISBN 978-0-07-054371-3.

[85] Samek W., Binder A., Montavon G., Lapuschkin S., Müller K.-R. Evaluating the Visualization of What a Deep Neural Network Has Learned // IEEE Transactions on Neural Networks and Learning Systems. — 2017. — Vol. 28, № 11. — P. 2660–2673. — DOI: 10.1109/TNNLS.2016.2599820.

[86] Samek W., Montavon G., Lapuschkin S., Anders C. J., Müller K.-R. Explaining Deep Neural Networks and Beyond: A Review of Methods and Applications // Proceedings of the IEEE. — 2021. — Vol. 109, № 3. — P. 247–278. — DOI: 10.1109/JPROC.2021.3060483.

[87] Samek W., Montavon G., Vedaldi A., Hansen L. K., Müller K.-R. Explainable AI: Interpreting, Explaining and Visualizing Deep Learning. — Cham: Springer, 2019. — 439 p. — (Lecture Notes in Computer Science; Vol. 11700). — DOI: 10.1007/978-3-030-28954-6.

[88] Selvaraju R. R., Cogswell M., Das A., Vedantam R., Parikh D., Batra D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization // Proceedings of the IEEE International Conference on Computer Vision (ICCV). — 2017. — P. 618–626. — DOI: 10.1109/ICCV.2017.74.

[89] Semenova L., Rudin C., Parr R. A Study in Rashomon Curves and Volumes: A New Perspective on Generalization and Model Simplicity in Machine Learning [Electronic resource]. — arXiv:1908.01755, 2019. — URL: https://arxiv.org/abs/1908.01755 (accessed: 16.02.2026).

[90] Shrikumar A., Greenside P., Kundaje A. Learning Important Features Through Propagating Activation Differences // Proceedings of the 34th International Conference on Machine Learning (ICML). — 2017. — Vol. 70. — P. 3145–3153.

[91] Simonyan K., Vedaldi A., Zisserman A. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps [Electronic resource]. — arXiv:1312.6034, 2013. — URL: https://arxiv.org/abs/1312.6034 (accessed: 16.02.2026).

[92] Slack D., Hilgard S., Jia E., Singh S., Lakkaraju H. Fooling LIME and SHAP: Adversarial Attacks on Post-hoc Explanation Methods // Proceedings of the 2020 AAAI/ACM Conference on AI, Ethics, and Society (AIES). — New York, NY, USA, 2020. — P. 180–186. — DOI: 10.1145/3375627.3375830.

[93] Smilkov D., Thorat N., Kim B., Viégas F., Wattenberg M. SmoothGrad: Removing Noise by Adding Noise [Electronic resource]. — arXiv:1706.03825, 2017. — URL: https://arxiv.org/abs/1706.03825 (accessed: 16.02.2026).

[94] Song E., Benton J. VISA: Variational Inference with Sequential Sample-Average Approximations // Journal of Machine Learning Research. — 2019. — Vol. 20, № 140. — P. 1–51.

[95] Spearman C. The Proof and Measurement of Association between Two Things // The American Journal of Psychology. — 1904. — Vol. 15, № 1. — P. 72–101. — DOI: 10.2307/1412159.

[96] Sundararajan M., Taly A., Yan Q. Axiomatic Attribution for Deep Networks // Proceedings of the 34th International Conference on Machine Learning (ICML). — 2017. — Vol. 70. — P. 3319–3328.

[97] Ueno T., Kim Y., Oura H., Seaborn K. Trust and Reliance in Consensus-Based Explanations from an Anti-Misinformation Agent [Electronic resource]. — arXiv:2304.11279, 2023. — URL: https://arxiv.org/abs/2304.11279 (accessed: 18.02.2026).

[98] Verma S., Rubin J. Fairness Definitions Explained [Electronic resource] // Proceedings of the 2018 IEEE/ACM International Workshop on Software Fairness (FairWare). — 2018. — P. 1–7. — DOI: 10.1145/3194770.3194776. — URL: https://fairware.cs.umass.edu/papers/Verma.pdf (accessed: 18.02.2026).

[99] Wachter S., Mittelstadt B., Russell C. Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR // Harvard Journal of Law & Technology. — 2017. — Vol. 31, № 2. — P. 841–887.

[100] Wilcox R. R. Modern Robust Data Analysis Methods: Measures of Central Tendency // Psychological Methods. — 2003. — Vol. 8, № 3. — P. 254–274. — DOI: 10.1037/1082-989X.8.3.254.

[101] Xiao C., Wang W., Lin X., Yu J. X. Top-k Set Similarity Joins // Proceedings of the 2009 IEEE International Conference on Data Engineering (ICDE). — 2009. — P. 916–927. — DOI: 10.1109/ICDE.2009.111.

[102] Xu F., Uszkoreit H., Du Y., Fan W., Zhao D., Zhu J. Explainable AI: A Review of Machine Learning Interpretability Methods // Entropy. — 2019. — Vol. 21, № 1. — Art. 18. — URL: https://www.mdpi.com/1099-4300/21/1/18 (accessed: 18.02.2026).

[103] Yeh C.-K., Hsieh C.-Y., Suggala A., Inouye D. I., Ravikumar P. On the (In)fidelity and Sensitivity of Explanations // Advances in Neural Information Processing Systems (NeurIPS). — 2019. — Vol. 32. — P. 10965–10976.

[104] Zhang J., Bargal S., Lin Z., Brandt J., Shen X., Sclaroff S. Top-k Feature Selection via Shapley Values for Deep Neural Networks // Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). — 2021. — P. 11810–11819. — DOI: 10.1109/CVPR46437.2021.01163.

[105] Zhou B., Khosla A., Lapedriza A., Oliva A., Torralba A. Learning Deep Features for Discriminative Localization // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). — 2016. — P. 2921–2929. — DOI: 10.1109/CVPR.2016.319.

[106] Zhou J., Gandomi A. H., Chen F., Holzinger A. Evaluating the Quality of Machine Learning Explanations: A Survey on Methods and Metrics // Electronics. — 2021. — Vol. 10, № 5. — Art. 593. — DOI: 10.3390/electronics10050593.