E-MART: An Improved Misclassification Aware Adversarial Training with Entropy-Based Uncertainty Measure

DOI: 10.23977/jaip.2025.080118

Author(s)

Songcao Hou 1, Tianying Cui 1

Affiliation(s)

1 School of Modern Information Industry, Guangzhou College of Commerce, Guangzhou, China

Corresponding Author

Songcao Hou

ABSTRACT

Recently, adversarial training (AT) has been demonstrated to be effective in improving the robustness of deep neural networks (DNNs) against adversarial examples. Among AT methods, Misclassification Aware adveRsarial Training (MART), which incorporates an explicit differentiation of misclassified examples as a regularizer, is one of the most promising. However, MART identifies misclassified examples by the prediction error alone, and consequently falls short of the best achievable performance. The crux lies in the fact that the prediction error considers only the output probability assigned to the ground-truth label, neglecting the impact of the complement classes. In this paper, we offer a unique insight into the condition that emphasizes learning on misclassified examples, and propose an improved MART method with an entropy-based uncertainty measure (termed E-MART). Specifically, we take the outputs of all classes into account and develop an entropy-based uncertainty measure (EUM) that provides reliable evidence of the impact of misclassified and correctly classified examples. Moreover, based on EUM, we design a soft decision scheme to optimize the AT loss function, which enables more efficient training and improves the model's final robustness. Experiments on the CIFAR-10 dataset demonstrate the effectiveness of our method.
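The paper's exact E-MART loss is not reproduced on this page, but the abstract's two ingredients (an entropy-based uncertainty measure computed over all class outputs, and a soft weighting of the AT regularizer) can be illustrated with a short sketch. The PyTorch code below is a hypothetical rendering, assuming a MART-style objective in which the per-example KL regularizer is weighted by the normalized Shannon entropy of the natural prediction instead of MART's hard (1 - p_y) term; the function names, the plain cross-entropy term, and the normalization choice are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only -- NOT the authors' implementation.
# Assumes a MART-style objective where the per-example KL regularizer
# is softly weighted by the normalized entropy of the natural prediction.
import math

import torch
import torch.nn.functional as F


def entropy_uncertainty(logits: torch.Tensor) -> torch.Tensor:
    """Normalized Shannon entropy of the softmax output, in [0, 1].

    Unlike the ground-truth probability alone, this depends on the
    outputs of all classes: a flat distribution (uncertain, likely
    misclassified) scores near 1, a confident one near 0.
    """
    log_probs = F.log_softmax(logits, dim=1)
    probs = log_probs.exp()
    entropy = -(probs * log_probs).sum(dim=1)      # per-example entropy
    return entropy / math.log(logits.size(1))      # divide by log(num_classes)


def e_mart_style_loss(model, x_natural, x_adv, y, lam: float = 6.0):
    """MART-style AT loss with an entropy-based soft decision weight."""
    logits_nat = model(x_natural)
    logits_adv = model(x_adv)

    # Classification loss on adversarial examples (plain CE here;
    # MART itself uses a boosted CE variant).
    loss_adv = F.cross_entropy(logits_adv, y)

    # Per-example KL divergence between natural and adversarial outputs.
    kl = F.kl_div(F.log_softmax(logits_adv, dim=1),
                  F.softmax(logits_nat, dim=1),
                  reduction="none").sum(dim=1)

    # Soft decision: uncertain (likely misclassified) examples receive a
    # larger regularization weight, replacing MART's hard (1 - p_y) term.
    weight = entropy_uncertainty(logits_nat).detach()
    return loss_adv + lam * (weight * kl).mean()
```

Because the entropy weight is continuous, such a scheme interpolates smoothly between correctly classified and misclassified examples rather than applying a hard threshold, which is consistent with the abstract's description of a "soft decision scheme".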

KEYWORDS

Adversarial training, Misclassified example, Entropy-based uncertainty measure

CITE THIS PAPER

Songcao Hou, Tianying Cui. E-MART: An Improved Misclassification Aware Adversarial Training with Entropy-Based Uncertainty Measure. Journal of Artificial Intelligence Practice (2025) Vol. 8: 140-149. DOI: http://dx.doi.org/10.23977/jaip.2025.080118.
