Introduction to Data Poison Attacks on Machine Learning Models

Dmitry Namiot


This article discusses one class of attacks on machine learning systems: poisoning attacks. Classically, a poisoning attack is a deliberate modification of the training data, designed to steer the trained model in a direction the attacker wants. Such attacks may aim to lower the overall accuracy or fairness of the model or, for example, to force specific classification results under certain conditions. The techniques for implementing them include algorithms that identify the training examples most responsible for what the model learns (the generalizations it forms), that minimize the amount of poisoned data, and that make the changes as inconspicuous as possible. Among poisoning attacks, the most dangerous are the so-called trojans (backdoors), in which specially prepared training data changes the model's logic for inputs that are marked in a specific way. In addition to modifications of training data, poisoning attacks also include direct attacks on ready-made machine learning models or their executable code.
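To make the backdoor idea concrete, the following is a minimal sketch and not the method of any paper listed here. It assumes a toy two-class dataset, a hypothetical "trigger" stored in an otherwise unused third feature, and a plain k-nearest-neighbors classifier written in pure Python. All names (`add_trigger`, `poison`, `knn_predict`) and all numeric values are illustrative assumptions.

```python
import random

def make_clean_data(rng, n_per_class=50):
    """Two well-separated 2-D classes; the third feature is an unused trigger slot set to 0."""
    data = []
    for label, center in ((0, 0.0), (1, 4.0)):
        for _ in range(n_per_class):
            x = [rng.gauss(center, 0.5), rng.gauss(center, 0.5), 0.0]
            data.append((x, label))
    return data

def add_trigger(x, trigger_value=10.0):
    """Stamp the (hypothetical) trigger onto a copy of the input."""
    return [x[0], x[1], trigger_value]

def poison(data, rng, n_poison=5, target_label=1):
    """Append a few class-0 samples that carry the trigger but are labeled as the target class."""
    class0 = [x for x, y in data if y == 0]
    poisoned = [(add_trigger(x), target_label) for x in rng.sample(class0, n_poison)]
    return data + poisoned

def knn_predict(train, x, k=3):
    """k-nearest-neighbors majority vote using squared Euclidean distance."""
    nearest = sorted(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))[:k]
    votes = [y for _, y in nearest]
    return max(set(votes), key=votes.count)

rng = random.Random(0)          # fixed seed so the sketch is reproducible
clean = make_clean_data(rng)
train = poison(clean, rng)      # 5 poisoned samples added to 100 clean ones

probe = [0.0, 0.0, 0.0]         # a typical class-0 input without the trigger
clean_prediction = knn_predict(train, probe)
triggered_prediction = knn_predict(train, add_trigger(probe))
poison_fraction = (len(train) - len(clean)) / len(train)
```

Under these assumptions the unmarked probe is still assigned to class 0, while the same probe with the trigger is assigned to the attacker's target class 1, even though fewer than 5% of the training samples were poisoned. Real attacks target far more complex models and hide the trigger much more carefully, so this only illustrates the mechanism.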







ISSN: 2307-8162