Оцінка алгоритмів виявлення аномалій за допомогою методів машинного навчання

Title (English):

Evaluation of anomaly detection algorithms using machine learning methods

Author(s):

Цюцюра М. І.

Коваленко А. Ю.

Author(s) (English):

Tsiutsiura M.

Kovalenko A.

Keywords (Ukrainian):

машинне навчання; види аномалій; алгоритми виявлення аномалій; дослідження та оцінка алгоритмів

Keywords (English):

machine learning; types of anomalies; anomaly detection algorithms; research and evaluation of algorithms

Abstract (Ukrainian):

Розглянуто значення виявлення аномалій як важливої проблеми у різних сферах діяльності програмних продуктів сучасного світу: виявлення аномалій може бути у нагоді в кібербезпеці, роботі інтернету речей, аналізі фінансових операцій. Насамперед аномалії можуть сигналізувати про необхідність вчинення певних дій задля уникнення негативних наслідків. Крім того, досліджено значення виявлення аномалій для бізнес-аналітики та ризик-менеджменту. Приділено багато уваги дослідженню різних типів аномалій, зокрема точкових, контекстуальних та колективних, з наведенням прикладів у різних контекстах. Вказано на важливість використання інтелектуальних алгоритмів машинного навчання для виявлення аномалій у великих обсягах даних та швидкого опрацювання інформації з попередженням персоналу. Виявлення аномалій за допомогою машинного навчання є актуальною проблемою в сучасному світі при роботі з великими обсягами даних і постійно зростаючими загрозами у сфері кібербезпеки, фінансових шахрайств, медичної діагностики, виробничої безпеки та інших галузях. Через поширення інтернету речей (IoT) та великий обсяг даних, який він генерує, виявлення незвичайних, аномальних або підозрілих подій стає все більш складною задачею для традиційних методів обробки даних. Машинне навчання дає змогу автоматизувати процес виявлення аномалій, використовуючи алгоритми для аналізу і класифікації даних. Це допомагає покращити ефективність і швидкість виявлення аномалій, зменшити витрати на ручний аналіз та сприяти більш точному і швидкому реагуванню на потенційні загрози або проблеми. З поглибленим розвитком технологій машинного навчання, таких як нейронні мережі та алгоритми глибокого навчання, і постійним зростанням моделей машинного навчання можливості виявлення аномалій стають все більш точними та різноманітними. Це дає змогу виявляти аномалії у реальному часі та забезпечувати надійний рівень безпеки в різних сферах діяльності, що є надзвичайно важливим у сучасному цифровому світі.
Виокремлюють три ситуації, в яких може застосовуватися алгоритм: контрольоване навчання, напівконтрольоване навчання та навчання без нагляду. Класифікація базується на алгоритмічному підході та охоплює імовірнісні методи, методи вимірювання відстані та щільності, методи кластеризації, методи, що базуються на класах, методи реконструкції та спектральні методи. Для вибору оптимального підходу до виявлення аномалій важливо враховувати різні фактори. У статті наведено ілюстративні приклади роботи алгоритмів виявлення аномалій на основі реальних даних.

Abstract (English):

This article discusses anomaly detection as an important problem across the many areas in which modern software products operate; it can be useful in cybersecurity, the Internet of Things, and financial transaction analysis. Above all, anomalies can signal the need to take action to avoid negative consequences. The importance of anomaly detection for business intelligence and risk management is also explored. Considerable attention is paid to the different types of anomalies, namely point, contextual, and collective anomalies, with examples in different contexts. The article emphasizes the importance of using intelligent machine learning algorithms to detect anomalies in large amounts of data and to process information quickly while alerting staff. Anomaly detection using machine learning is a pressing problem when dealing with large amounts of data and ever-growing threats in cybersecurity, financial fraud, medical diagnostics, industrial safety, and other fields. With the proliferation of the Internet of Things (IoT) and the large amount of data it generates, detecting unusual, anomalous, or suspicious events is becoming increasingly challenging for traditional data processing methods. Machine learning makes it possible to automate the anomaly detection process by using algorithms to analyze and classify data. This improves the efficiency and speed of anomaly detection, reduces the cost of manual analysis, and facilitates a more accurate and rapid response to potential threats or issues. With the continued development of machine learning technologies such as neural networks and deep learning algorithms, and the constant growth of machine learning models, anomaly detection capabilities are becoming more accurate and diverse. This makes it possible to detect anomalies in real time and ensure a reliable level of security in various fields of activity, which is extremely important in today's digital world.
There are three settings in which an algorithm can be applied: supervised learning, semi-supervised learning, and unsupervised learning. The classification is based on the algorithmic approach and includes probabilistic methods, distance- and density-based methods, clustering methods, class-based methods, reconstruction methods, and spectral methods. To choose the best approach to anomaly detection, it is important to consider various factors. The article provides illustrative examples of anomaly detection algorithms applied to real data.

Publisher:

Kyiv National University of Construction and Architecture

Journal title, issue, year (Ukrainian):

Управління розвитком складних систем, номер 58, 2024

Journal title, issue, year (English):

Management of Development of Complex Systems, number 58, 2024

Article language:

Ukrainian

Document format:

application/pdf

Document:

80-85.pdf

Publication date:

02 August 2024

Section:

INFORMATION TECHNOLOGIES OF MANAGEMENT

Author's university:

State University of Trade and Economics, Kyiv

Literature:

Чандола, В., Банерджі, А., Кумар, В. Виявлення аномалій: опитування. У: ACM Computing Surveys 41, 2009.
Дуа, Д., Карра Таніскіду, Е. Репозиторій машинного навчання UCI. Каліфорнійський університет, Школа інформації та комп’ютерних наук, 2017.
Гупта, М. Виявлення викидів для часових даних: опитування. IEEE Transactions on Knowledge and Data Engineering 26, 2014.
Джолліфф, І. Т. Аналіз основних компонентів. Нью-Йорк, Берлін, Гейдельберг: Springer, 2002.
ML.NET Documentation URL: https://learn.microsoft.com/en-us/dotnet/machine-learning/
Guide to Intrusion Detection and Prevention Systems (IDPS). Technical report, National Institute of Standards and Technology, U.S. Department of Commerce, 2014. С. 215–249.
Christina Warrender, Stephanie Forrest, and Barak Pearlmutter. Detecting intrusion using system calls: alternative data models. In Proceedings of the IEEE Symposium on Security and Privacy, 1999.
Zengyou He, Xiaofei Xu, and Shengchun Deng. Discovering cluster-based local outliers. Pattern Recogn. Lett., 24 (9-10):1641–1650, June 2003.
Hawkins, Douglas M. Identification of Outliers. Chapman and Hall London; New York, 1980.
Bram Steenwinckel. Adaptive Anomaly Detection and Root Cause Analysis by Fusing Semantics and Machine Learning. European Semantic Web Conference. 2018.
Tsiutsiura M. I., Tsiutsiura S. V., and Kryvoruchko O. V. Information technologies for the development of the content of education: monograph. CP "Comprint", 2019. Kyiv: 118 p. ISBN 978-966-929-967-9.
Цюцюра М. І., Єрукаєв А. В., Гоц В. В., Костишина Н. В. Реалізація генетичного алгоритму шляхом застосування продукційних правил. Управління розвитком складних систем. Київ, 2019. № 39. С. 64–68. DOI:10.6084/m9.figshare.11340653.

References:

Chandola, V., Banerjee, A., Kumar, V. (2009). Anomaly detection: a survey. In: ACM Computing Surveys, 41.
Dua, D., Carra Taniskidou, E. (2017). UCI Machine Learning Repository. University of California, School of Information and Computer Sciences.
Gupta, M. (2014). Outlier detection for temporal data: a survey. IEEE Transactions on Knowledge and Data Engineering, 26.
Jolliffe, I. T. (2002). Principal component analysis. New York, Berlin, Heidelberg: Springer.
ML.NET Documentation URL: https://learn.microsoft.com/en-us/dotnet/machine-learning/
Guide to Intrusion Detection and Prevention Systems (IDPS). (2014). Technical report, National Institute of Standards and Technology, U.S. Department of Commerce, 215–249.
Warrender, Christina, Forrest, Stephanie and Pearlmutter, Barak. (1999). Detecting intrusion using system calls: alternative data models. In Proceedings of the IEEE Symposium on Security and Privacy.
He, Zengyou, Xu, Xiaofei and Deng, Shengchun. (2003). Discovering cluster-based local outliers. Pattern Recogn. Lett., 24 (9-10):1641–1650.
Hawkins, Douglas M. (1980). Identification of Outliers. Chapman and Hall London; New York.
Steenwinckel, Bram. (2018). Adaptive Anomaly Detection and Root Cause Analysis by Fusing Semantics and Machine Learning. European Semantic Web Conference.
Tsiutsiura, M. I., Tsiutsiura, S. V. and Kryvoruchko, O. V. (2019). Information technologies for the development of the content of education. Monograph. CP "Comprint". Kyiv: 118. ISBN 978-966-929-967-9.
Tsiutsiura, Mykola, Yerukaiev, Andrii, Hots, Vladyslav & Kostyshyna, Nataliia. (2019). Implementation of a genetic algorithm using product rules. Management of Development of Complex Systems, 39, 64–68. [in Ukrainian]; dx.doi.org/10.6084/m9.figshare.11340653.