A Robust Combinatorial Defensive Method Based on GCN

Abstract: Graph Convolutional Neural Networks (GCNs) often demonstrate poor robustness when faced with adversarial attacks, which can be generated with malicious intent. Several heuristic defensive methods have been proposed to mitigate this issue, but they are often vulnerable to stronger adaptive attacks. Recently, researchers have shown that the non-robust aggregation functions used in GCNs are responsible for their vulnerability, and that adversarial training in the manifold space can enhance the model's accuracy and robustness. Building on this prior research, this paper analyzes the robustness of the winsorised mean function and the mean aggregation function from the perspective of model interpretability, based on the theory of breakdown points and influence function robustness. We propose an improved robust combinatorial defensive method, WLGCN, which replaces the mean aggregation function in the GCN operator with the more robust winsorised mean aggregation function, and incorporates a robust adversarial regularizer on the manifold space hidden layer $H^{(1)}$ of the GCN. Finally, we evaluate the robustness of the proposed model under different levels of adversarial perturbation cost, using accuracy and classification margin as evaluation metrics. The experimental results demonstrate that, compared to other baselines, the proposed defensive approach can effectively enhance the model's robustness against adversarial attacks while maintaining model accuracy.


Introduction
With the advent of deep learning, convolutional neural networks have made remarkable progress in various fields, including computer vision and natural language processing. However, traditional convolutional neural networks are only suitable for processing Euclidean space data, such as images and text, which have the characteristic of translation invariance [1]. On the other hand, graph data, a type of non-Euclidean data, has gained widespread attention due to its prevalence in modeling complex relationships in the real world, such as social network relationships, transportation relationships, and protein structure relationships. The local structure of each node in the graph may be vastly different, so translation invariance no longer applies. To solve this issue, researchers have extended convolutional neural networks to graph data, resulting in graph convolutional neural networks. GCNs have shown great promise in extracting features from graph data, which enables various downstream tasks, including node classification, link prediction, and graph classification.
However, recent research has revealed that deep learning models are susceptible to adversarial perturbations, which can severely affect their output. For instance, minor variations in a few pixels in an image, although imperceptible to the human eye, can cause significant changes in the model's output. Similarly, graph neural network models are vulnerable to adversarial perturbations such as adding or deleting edges and modifying node features. Adversarial attacks on graph convolutional neural networks can have severe consequences, particularly in practical applications such as social networks, where malicious users can easily create fake followers to add false information, manipulate online comments and product websites, or deceive target users to mislead analysis systems [2].
Therefore, the security of graph convolutional neural networks is currently one of the research hotspots. In-depth research on graph adversarial attacks and countermeasures can promote their successful application in a wider range of fields. Compared to other areas of deep learning, graph adversarial attacks are more challenging because perturbations can affect not only the node attributes but also the discrete graph structure. Thus, developing robust countermeasures against graph adversarial attacks is crucial to ensure the reliability and trustworthiness of GCN models in practical applications.
The remainder of this paper is organized as follows. Section 2 introduces current research related to our work. Section 3 gives the preliminary definitions of GCNs and the unified modeling of attack and defense. Section 4 analyzes the robustness of the aggregation function. Section 5 presents the combinatorial defensive method, which includes the winsorised convolution and latent adversarial training. Section 6 provides detailed results and experimental analysis. Section 7 draws the conclusions.

Related Works
In recent years, research on adversarial attacks and defenses in graph convolutional neural networks (GCNs) has received increasing attention. Zügner et al. [3] were among the first to propose the Nettack attack algorithm for graph adversarial learning, which modifies node features and their connections to generate small adversarial perturbations guided by a scoring function. This sparked a wave of research on adversarial attacks on GCNs, with subsequent studies conducted by Dai [4], Wang [5], Zhou [6], Sun [7], and others.
Concurrently, research on defense methods against adversarial attacks on GCNs has also gained momentum. Feng et al. [8] introduced Graph Adversarial Training (GAT) as a robust defense method based on dynamic regularization using graph structure. Zhu et al. [9] proposed a sample-based Batch Virtual Adversarial Training to enhance the model's robustness. According to the study by Günnemann et al. [10], GCN defense methods against adversarial attacks can be broadly classified into three categories:
1) Data pre-processing [11,12]: For instance, graph purification, which purifies the perturbed graph to obtain a clean graph and trains the GCN model on it.
2) Model training [13,14,15]: For example, adversarial training, which trains the model on adversarial samples labeled with the correct label, giving the model defense capability against the corresponding attack methods. However, this method is limited by the attack methods used in training and cannot defend against unknown attacks.
3) Model architecture modification [16]: For example, introducing attention mechanisms to learn how to differentiate between adversarial perturbations and clean samples, and training a robust GCN model by penalizing the weights of adversarial nodes or edges.
In summary, a significant body of literature has emerged that focuses on the development of adversarial attacks and defenses in GCNs. The proposed defense methods offer a promising avenue for mitigating the impact of adversarial attacks, and further research is required to enhance their effectiveness and robustness.

Definition of the GCNs
A graph convolutional neural network is a deep learning model designed for processing graph-structured data. It utilizes a neighbor aggregation strategy and a message passing mechanism to learn representations and perform classification tasks on nodes.
Formally, consider an attributed graph $G = (A, X)$, where $A$ is the adjacency matrix and $X$ is the node feature matrix. The goal of the GCN is to map the nodes in the graph to their corresponding class labels by iteratively aggregating the features of neighboring nodes to update the representation of the target node.
Following [17], the $l$-th layer of a graph convolutional neural network is defined as follows. The representation vector of the target node is obtained by taking into account the information of its neighboring nodes. This information is combined with the representation vector of the target node from the previous layer, using an aggregation function $\mathrm{AGG}^{(l)}$, to obtain the message vector. The message vector is then passed to the target node, using a normalized adjacency matrix $A$, a weight matrix $W^{(l)}$, and an activation function $\sigma^{(l)}$, to compute the representation vector $h_v^{(l)}$ of the target node.
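The layer described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation of [17]: it assumes mean aggregation over each node's neighbourhood including the node itself (a row-normalised adjacency matrix with self-loops), a ReLU activation, and a hypothetical function name `gcn_layer`.

```python
import numpy as np

def gcn_layer(adj, h_prev, weight):
    """One message-passing layer with mean aggregation and ReLU (illustrative)."""
    n = adj.shape[0]
    adj_self = adj + np.eye(n)                  # assumption: the target node joins its own neighbourhood
    deg = adj_self.sum(axis=1, keepdims=True)   # neighbourhood sizes
    adj_norm = adj_self / deg                   # row-normalised adjacency
    messages = adj_norm @ h_prev                # mean of the previous-layer representations
    return np.maximum(messages @ weight, 0.0)   # linear map followed by ReLU

# Path graph 0 - 1 - 2 with two-dimensional node features.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
h0 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [1.0, 1.0]])
w = np.eye(2)                                   # identity weights keep the example readable
h1 = gcn_layer(adj, h0, w)
print(h1)
```

With identity weights, each row of `h1` is simply the average of the features in that node's neighbourhood, which makes the role of the aggregation function easy to inspect.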
The GCN employs a neighbor aggregation strategy and a message passing mechanism to iteratively update the representation vector of a target node by aggregating and transferring the information from its neighboring nodes. As shown in Figure 1, to compute the representation vector of the target node at layer $l$, the information of its neighboring nodes is first obtained. Then, the information is combined with the target node's own representation vector from layer $l-1$, using an aggregation function, to obtain the representation vector of the target node at layer $l$.

Let $G^{(0)} = (A^{(0)}, X^{(0)})$ be the original graph, where $A^{(0)}$ is the original adjacency matrix and $X^{(0)}$ is the original node feature matrix. Let $\tilde{G} = (\tilde{A}, \tilde{X})$ be the graph obtained by adding adversarial perturbations, where structural attacks are applied to the adjacency matrix $A$ and feature attacks are applied to the node feature matrix $X$. Let $\Delta$ be the cost of adversarial perturbations, $\theta$ be the model parameters obtained by training on a set of instances, and $f_\theta(A, X)$ be the graph convolutional neural network model. The goal of the attacker is to maximize the loss function of the target node $v_t$ on $f_\theta(A, X)$ in order to achieve the desired attack effect, which can be defined as follows [17]:

$$\max_{\tilde{G}} \; \mathcal{L}\big(f_\theta(\tilde{A}, \tilde{X}), v_t\big) \quad \text{s.t.} \quad \|\tilde{A} - A^{(0)}\|_0 + \|\tilde{X} - X^{(0)}\|_0 \le \Delta, \tag{2}$$

where $\|\cdot\|_0$ represents the number of non-zero elements of its argument. The constraint controls the size of the perturbations and limits the total number of modifications on the node feature matrix and the adjacency matrix to $\Delta$.
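To make the budget constraint concrete, the following sketch counts the modified entries of the adjacency and feature matrices and compares the total with $\Delta$. The helper names `l0_distance` and `within_budget` are hypothetical. Note one assumption: with a symmetric adjacency matrix stored in full, inserting a single undirected edge changes two entries, so an implementation may prefer to count only the upper triangle.

```python
def l0_distance(original, perturbed):
    """Number of entries in which two equally sized matrices differ."""
    return sum(
        a != b
        for row_o, row_p in zip(original, perturbed)
        for a, b in zip(row_o, row_p)
    )

def within_budget(adj, adj_pert, feat, feat_pert, delta):
    """Check that the total number of modifications does not exceed delta."""
    return l0_distance(adj, adj_pert) + l0_distance(feat, feat_pert) <= delta

# Clean path graph 0 - 1 - 2 and a perturbed copy with the extra edge (0, 2).
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
adj_pert = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
# One binary feature of node 1 is flipped.
feat = [[1, 0], [0, 1], [1, 1]]
feat_pert = [[1, 0], [0, 0], [1, 1]]

cost = l0_distance(adj, adj_pert) + l0_distance(feat, feat_pert)
print(cost, within_budget(adj, adj_pert, feat, feat_pert, delta=3))
```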

Definition of the Defense Unified Modeling
As research on attacks against graph neural networks has progressed, the study of defensive methods against such attacks has also advanced rapidly, with corresponding defensive strategies proposed for different attacker models. In this section, we provide a general definition of defensive models for adversarial attacks on graph data and their related concepts. Following [18], the defense is formulated as the minimization of the loss of the model on a network $G$, where $G$ may be an original network or a perturbed network. The goal of defense is to minimize the loss function of the attacked model, making it as close as possible to the loss of the model that has not been attacked.

Definition of the Winsorised Mean
Winsorised mean: a method for handling outliers that differs both from the trimmed mean, which removes outliers, and from the sample mean, which treats all observations equally. It limits the influence of outliers to within a certain threshold. Specifically, given the order statistics $x_{(1)} \le \dots \le x_{(n)}$ of the data and $g = \lfloor \alpha n \rfloor$ with $0 \le \alpha \le 0.5$, it replaces the largest $100\alpha\%$ of the values with the largest remaining value $x_{(n-g)}$ and the smallest $100\alpha\%$ of the values with the smallest remaining value $x_{(g+1)}$. Finally, the adjusted sample is averaged using the following expression [19]:

$$W_\alpha(x_1, \dots, x_n) = \frac{1}{n}\Big( g\, x_{(g+1)} + \sum_{i=g+1}^{n-g} x_{(i)} + g\, x_{(n-g)} \Big). \tag{4}$$
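As a small worked example of this definition, the sketch below computes the winsorised mean by clipping every value to the range spanned by the $(g+1)$-th smallest and $(g+1)$-th largest observations, with $g = \lfloor \alpha n \rfloor$. The function name and the safeguard that keeps at least one value unclipped when $\alpha = 0.5$ are assumptions made for illustration.

```python
import math

def winsorised_mean(values, alpha):
    """Winsorised mean with winsorising fraction alpha in [0, 0.5]."""
    if not 0.0 <= alpha <= 0.5:
        raise ValueError("alpha must lie in [0, 0.5]")
    xs = sorted(values)
    n = len(xs)
    # Number of values replaced at each end; capped so the bounds stay ordered.
    g = min(math.floor(alpha * n), (n - 1) // 2)
    lo, hi = xs[g], xs[n - 1 - g]
    # Clipping sorted data to [lo, hi] is the same as replacing the g smallest
    # values by lo and the g largest values by hi.
    clipped = [min(max(x, lo), hi) for x in xs]
    return sum(clipped) / n

sample = [1.0, 2.0, 3.0, 4.0, 100.0]
print(winsorised_mean(sample, 0.0))   # alpha = 0 reduces to the ordinary mean
print(winsorised_mean(sample, 0.2))   # the outlier 100 is replaced by 4, and 1 by 2
```

With $\alpha = 0.2$ the adjusted sample is $(2, 2, 3, 4, 4)$, so the outlier no longer dominates the average.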

Aggregation Function Robustness Analysis
The message passing mechanism is the core of graph convolutional neural networks (GCNs), and the aggregation function commonly used in existing message-passing GCN models is the mean. The sample mean is widely used but has no resistance to outliers. If one or more outliers exist in a sample, they may lead to a complete breakdown of the model's output, so the mean needs to be applied with care. When the sample data is more scattered or has a large range, the sample median is more robust, but it relies on a single data point and does not make full use of the sample. An intuitive idea is to use the winsorised mean, which is robust to outliers.

Breakdown point theory analysis
The theory of breakdown point is used to measure the robustness of a function $f$ under data perturbations. The breakdown point $m$ can be intuitively understood as the minimum number of data points that need to be added to a data sample set in order to make the output of the function $f$ diverge to infinity.
Definition: The breakdown point is defined as the minimum perturbation value that causes the function $f$ to break down, where $N$ is the set of all possible perturbations; the formal expression is given in [20]. The concept of the "breakdown point" has been widely used in robust statistics. Chen et al. [19] found, based on breakdown point theory, that the mean function is non-robust. The breakdown point of the mean function is $1/(|N_v| + 1)$, which means that in the worst case a single perturbed data point is enough to make the output of the function go to infinity. In contrast, to make the upper bound of the winsorised mean tend to infinity, at least $\lfloor \alpha n \rfloor + 1$ perturbed data points with infinite values need to be injected into the function. The breakdown point of the winsorised mean is therefore higher than that of the mean function, indicating its higher robustness to outliers.
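The contrast between the two breakdown points can be checked numerically. The sketch below is an illustration only: it injects $m$ copies of a very large value (standing in for an infinite perturbation) into a clean sample and reports the smallest $m$ for which an estimator leaves a fixed bound. The helper names, the stand-in outlier value, and the bound are assumptions, and $g$ is recomputed on the contaminated sample.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def winsorised_mean(xs, alpha):
    xs = sorted(xs)
    n = len(xs)
    g = min(math.floor(alpha * n), (n - 1) // 2)
    lo, hi = xs[g], xs[n - 1 - g]
    return sum(min(max(x, lo), hi) for x in xs) / n

def smallest_breaking_count(estimator, clean, outlier=1e12, bound=1e6):
    """Smallest number of injected outliers that pushes the estimate beyond bound."""
    for m in range(1, len(clean) + 1):
        if abs(estimator(clean + [outlier] * m)) > bound:
            return m
    return None

clean = [0.1 * i for i in range(10)]          # n = 10 well-behaved values
alpha = 0.2
mean_m = smallest_breaking_count(mean, clean)
wins_m = smallest_breaking_count(lambda xs: winsorised_mean(xs, alpha), clean)
print(mean_m, wins_m, math.floor(alpha * len(clean)) + 1)
```

For this sample a single injected point breaks the mean, whereas the winsorised mean with $\alpha = 0.2$ withstands two injected points and breaks at the third, which agrees with the $\lfloor \alpha n \rfloor + 1$ count stated above.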

Influence Function Robust Estimation Analysis
Robust Estimation. Robust estimation refers to selecting appropriate methods to minimize the influence of outliers and gross errors in data samples in order to obtain the best estimate. A method is considered "robust" when the results obtained from it closely match the true values despite the presence of outliers. If the estimated values are significantly different from the true values, the method performs poorly and the outliers are affecting the model [21].
Influence Function. The influence function measures the robustness of an estimator, and the corresponding robustness metrics can be obtained from it. This concept was initially proposed by Hampel [22] based on the concept of infinitesimals, and is defined as:

$$\mathrm{IF}(x; T, F) = \lim_{\varepsilon \to 0} \frac{T\big((1-\varepsilon)F + \varepsilon \delta_x\big) - T(F)}{\varepsilon}, \tag{6}$$

where $F$ is the distribution function, $T$ is the estimator, and $\delta_x$ is the point mass at $x$. For a finite sample, the corresponding empirical influence function can be obtained [19]:

$$\mathrm{IF}_{n+1}(x) = \frac{T_{n+1}(x_1, \dots, x_n, x) - T_n(x_1, \dots, x_n)}{\varepsilon}, \tag{7}$$

where $\mathrm{IF}_{n+1}(x)$ is the empirical influence function, and $\varepsilon = 1/(n+1)$ is a coefficient related to $n$.

Analysis of the Influence Function of the Winsorised Mean.
Taking the difference between the winsorised mean of $n+1$ observations and that of $n$ observations, substituting it into the sample influence function formula (7), and considering symmetry, the sample influence function of the winsorised mean can be obtained [19]. The influence function of the winsorised mean is shown in Figure 2, and it can be seen that it is a bounded jump function. The influence function provides a measure of robustness, and it indicates that the winsorised mean is more robust than the arithmetic mean, as it can resist the influence of outliers. In contrast, the influence function of the mean is unbounded, and the mean is very sensitive to outliers, lacking any robustness.
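The boundedness claim can be illustrated with an empirical sensitivity curve, that is, $(n+1)$ times the change in the estimate when one extra observation $x$ is appended to a fixed sample. The code below is a hedged sketch with hypothetical helper names; it is not the derivation of [19], only a numerical check that the curve of the mean grows with $x$ while that of the winsorised mean stops changing once $x$ is extreme.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def winsorised_mean(xs, alpha):
    xs = sorted(xs)
    n = len(xs)
    g = min(math.floor(alpha * n), (n - 1) // 2)
    lo, hi = xs[g], xs[n - 1 - g]
    return sum(min(max(x, lo), hi) for x in xs) / n

def sensitivity_curve(estimator, sample, x):
    """(n + 1) * (T(sample + [x]) - T(sample)) for a sample of size n."""
    n = len(sample)
    return (n + 1) * (estimator(sample + [x]) - estimator(sample))

sample = [float(i) for i in range(10)]
extremes = (1e3, 1e6)
sc_mean = [sensitivity_curve(mean, sample, x) for x in extremes]
sc_wins = [sensitivity_curve(lambda xs: winsorised_mean(xs, 0.2), sample, x)
           for x in extremes]
print(sc_mean)   # grows linearly with x
print(sc_wins)   # identical for both extreme values
```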

Graph hidden adversarial regularization
Miyato et al. [23] pointed out that a perturbed word embedding does not map to any actual word, and proposed perturbing embeddings as a way to obtain a robust classifier for normal text. Meanwhile, Stutz et al. [19] showed that adversarial instances can simultaneously improve robustness and accuracy if they lie on the low-dimensional manifold of the samples. In Graph Convolutional Networks, a direct analogue of perturbing word embeddings is perturbing the output of the first hidden layer, denoted $H^{(1)}$, which combines node features and graph information. In this paper, we use a proxy of the latent manifold space and inject robust adversarial regularization terms to indirectly perturb graph and node information, implicitly enhancing the model's robustness against structural attacks. Experimental results in Section 6 show that this helps to reduce the success rate of adversarial attacks on GCN (robustness) while maintaining or improving the model's accuracy. The model framework is illustrated in Figure 3.
The forward propagation formula for the GCN model is as follows:

$$H^{(l+1)} = \sigma\big(\hat{A} H^{(l)} W^{(l+1)}\big), \quad l \ge 0, \tag{10}$$

where $H^{(0)} = X$ represents the initial node representation and $\hat{A}$ is the normalized adjacency matrix. For notational convenience, assume that all nodes are represented by $d$-dimensional vectors at all layers, denoted as $H^{(l)} \in \mathbb{R}^{N \times d}$. Consider a standard two-layer GCN model that seeks the optimal weight parameters $\theta := (W^{(1)}, W^{(2)})$ in order to minimize the model output loss function $f_\theta$, that is, $\min_\theta f_\theta(G, X)$. The hidden layer combines the graph structure and node information, and adversarial training can be performed directly on it as follows [24]:

$$\min_\theta \max_{\delta \in D} f_\theta\big(H^{(1)} + \delta\big), \tag{11}$$

where $f_\theta(H^{(1)} + \delta)$ is the loss function under the perturbation $\delta$ at $H^{(1)}$. The set of imperceptible perturbations is defined as

$$D := \{\delta : \|\delta_i\| \le \varepsilon, \ i \in \{1, \dots, N\}\}. \tag{12}$$

The perturbation in (11) is chosen jointly over all nodes in the graph, which is different from the common adversarial setting where each individual adversarial sample seeks its own perturbation. The result is a high computational cost, which is further exacerbated by the nested min-max optimization. To alleviate this problem, we adopt a standard regularization variant of adversarial training, aimed at improving the smoothness of the model's predictions under perturbation, as shown in Equation (13):

$$\min_\theta f_\theta(G, X) + \gamma R_\theta\big(H^{(1)}\big), \qquad R_\theta\big(H^{(1)}\big) := \max_{\delta \in D} \big\| \hat{A}\big(H^{(1)} + \delta\big) W^{(2)} - \hat{A} H^{(1)} W^{(2)} \big\|_F^2. \tag{13}$$

Here, $\gamma$ is a balancing parameter, and the regularizer $R_\theta$ is defined as the Frobenius distance between the original model output (the second layer) and the output under perturbation.
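After simplification, (13) becomes as follows [24]:

$$R_\theta\big(H^{(1)}\big) := \max_{\delta \in D} \big\| \hat{A}\, \delta\, W^{(2)} \big\|_F^2. \tag{14}$$

The perturbation parameter $\delta$ is then found as follows:

$$\delta^{*} = \arg\max_{\delta \in D} \operatorname{tr}\big(\delta^{\top} \hat{A}^{\top} \hat{A}\, \delta\, W\big), \tag{15}$$

where $W = W^{(2)} W^{(2)\top}$. The overall procedure is summarized in Algorithm 1.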
Algorithm 1 Hidden Adversarial Regularization for GCN
Input: $A$, $X$
while not converged for (13) do
    while not converged for (14) do
        Apply ADAM to find $\delta^{*}$ (gradient in $\delta$ from Eq. (15))
    end while
    Take one step of ADAM in $\theta$ with the gradient computed by
end while
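The inner loop of Algorithm 1 can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: plain projected gradient ascent replaces ADAM, the constraint is taken to be a Euclidean norm bound `eps` on each row of $\delta$, the regularizer is the squared Frobenius norm of $\hat{A}\,\delta\,W^{(2)}$ as referred to in (14), and the names `project_rows` and `latent_regulariser` are hypothetical. The outer update of $\theta$ is omitted.

```python
import numpy as np

def project_rows(delta, eps):
    """Scale each row of delta back onto the ball of radius eps if it lies outside."""
    norms = np.linalg.norm(delta, axis=1, keepdims=True)
    scale = np.minimum(1.0, eps / np.maximum(norms, 1e-12))
    return delta * scale

def latent_regulariser(adj_norm, w2, eps, steps=50, lr=0.1, seed=0):
    """Approximate max over delta of ||adj_norm @ delta @ w2||_F^2 (inner loop only)."""
    rng = np.random.default_rng(seed)
    n, d = adj_norm.shape[0], w2.shape[0]
    delta = project_rows(rng.normal(size=(n, d)), eps)    # random feasible start
    for _ in range(steps):
        out = adj_norm @ delta @ w2
        grad = 2.0 * adj_norm.T @ out @ w2.T              # gradient of the squared norm
        delta = project_rows(delta + lr * grad, eps)      # ascent step, then projection
    value = np.linalg.norm(adj_norm @ delta @ w2, "fro") ** 2
    return value, delta

# Row-normalised adjacency of the path graph 0 - 1 - 2 with self-loops.
adj = np.array([[1.0, 1.0, 0.0],
                [1.0, 1.0, 1.0],
                [0.0, 1.0, 1.0]])
adj_norm = adj / adj.sum(axis=1, keepdims=True)
w2 = np.array([[1.0, -0.5],
               [0.3, 0.8]])                               # toy second-layer weights
eps = 0.1
value, delta = latent_regulariser(adj_norm, w2, eps)
print(value)
```

In a full training loop, `value` would be multiplied by the balancing parameter $\gamma$ and added to the classification loss before the optimizer step on $\theta$.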

The Robust Combined Defense Method
Building on the robustness analysis of the winsorised mean and the mean function in Section 4, this paper proposes an improved robust defensive method, WLGCN, based on the mainstream message-passing graph convolutional neural network framework. Specifically, latent adversarial perturbation training is incorporated into the hidden layer, and the more robust winsorised mean aggregation function replaces the mean aggregation function in the design of the graph convolutional operator. The overall architecture of the model is shown in Figure 4.
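A minimal sketch of the aggregation step of such an operator is given below. It is written under assumptions that the text does not fix: winsorisation is applied independently to each feature dimension, every node is included in its own neighbourhood, and the function name `winsorised_aggregate` is hypothetical. The learnable weights and the activation are left out so that only the change of aggregation function is visible.

```python
import numpy as np

def winsorised_aggregate(adj, h, alpha):
    """Per-feature winsorised mean over each node's neighbourhood (self included)."""
    n = adj.shape[0]
    out = np.empty_like(h, dtype=float)
    for v in range(n):
        nbrs = np.union1d(np.flatnonzero(adj[v]), [v])   # neighbours plus the node itself
        feats = np.sort(h[nbrs], axis=0)                 # sort every feature column
        k = len(nbrs)
        g = min(int(np.floor(alpha * k)), (k - 1) // 2)  # values replaced at each end
        clipped = np.clip(feats, feats[g], feats[k - 1 - g])
        out[v] = clipped.mean(axis=0)
    return out

# Star graph: node 0 is connected to nodes 1..4, and node 4 carries an outlier feature.
adj = np.zeros((5, 5))
adj[0, 1:] = 1.0
adj[1:, 0] = 1.0
h = np.array([[1.0], [1.0], [1.0], [1.0], [100.0]])

robust = winsorised_aggregate(adj, h, alpha=0.2)
plain = h.mean(axis=0)        # mean aggregation over the same five nodes
print(robust[0], plain)
```

For the centre node the outlier feature is clipped away by the winsorised aggregation, while plain mean aggregation over the same neighbourhood is pulled far from the typical value.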

Experiments and Analysis
To evaluate the efficacy of the WLGCN defensive method, we conducted experiments on three real datasets and two graph neural network attack models. The method was compared with the latest defensive methods to validate its performance.

Datasets and Evaluation Metrics
This study used three real datasets: Cora [25], Cora-ML [26], and Citeseer. Table 1 provides a statistical description of the datasets. The largest connected component (LCC) of each dataset was extracted. NLCC and ELCC represent the node set and the edge set of the largest connected component, respectively.
1) Cora dataset: The Cora dataset is a citation network dataset that contains a large number of academic papers, classified into 7 categories. It consists of 2485 articles and 5096 citation records, with each node containing 1433 features.
2) Cora-ML dataset: The Cora-ML dataset is a citation network dataset that contains a large number of academic papers related to machine learning, classified into 7 categories. It consists of 2810 articles and 7981 citation records, with each node containing 2879 features.
3) Citeseer dataset: The Citeseer dataset is also a citation network dataset that contains 2110 academic papers and 3668 citation relationships, classified into 6 categories, with 3703 features.
The degree distributions of the Cora, Cora-ML, and Citeseer datasets are shown in Figure 5. The majority of nodes are of low degree.
Classification Margin: To evaluate the effectiveness of the attacks, we use Classification Margin as a measure, which represents the maximum distance from the misclassified target node to the correct class boundary. The formula is given in [27]; in it, $v_t$ is the target node, $H$ is the model output before the target node $v_t$ is attacked, and $H'$ is the model output after the target node $v_t$ is attacked.

Attack Algorithms
In these experiments, two classic adversarial attack algorithms with strong attack performance are used, namely the NETTACK targeted attack algorithm and the Metattack non-targeted attack algorithm. The following is a brief introduction to these two algorithms.
1) NETTACK [28] is an algorithm that first selects candidate edges and features based on important data characteristics. It then designs two evaluation functions to assess the change in the target confidence after modifying the candidate edges and features. Finally, it updates the adversarial network iteratively by modifying the highest scoring edge or feature.
2) Metattack [29] is a global attack algorithm that treats the input network G as a hyperparameter and constructs a bi-level optimization problem. It utilizes the meta-gradient based on network edges to iteratively update the adversarial network.

Baselines
To verify the effectiveness of the proposed robust defense method WLGCN, this paper compares it with four baselines, namely GWNN, AGNN, DGAT, and GCN. These four methods, together with the two ablation variants GCN-W and GCN-L, are briefly introduced below.
GWNN: A novel graph convolutional neural network (GCN) that uses graph wavelet transform to solve the drawbacks of previous spectral GCN methods that relied on graph Fourier transform [30].Unlike graph Fourier transform, graph wavelet transform can be obtained through fast algorithms without matrix decomposition, which reduces computation costs and provides good interpretability for GCN.
AGNN: A variant of GCN that performs semi-supervised classification on graph-structured data, where the model uses an efficient layer-wise propagation rule based on spectral graph convolution that approximates the first-order proximity [31].
DGAT: Adversarial training (AT) is a regularization technique that has been shown to improve the robustness of models against perturbations in image classification. Directed graph adversarial training (DGAT) incorporates graph structure into the adversarial process and automatically identifies the impact of perturbations from neighboring nodes, introducing additional adversarial regularization to defend against worst-case perturbations. DGAT can resist the impact of adversarial perturbations in worst-case scenarios and reduce the impact of perturbations from neighboring nodes [32].
GCN: A scalable semi-supervised learning method for graph-structured data that uses an efficient layer-wise propagation rule, where the specific spectral-domain graph convolution adopts a weighted averaging method to aggregate messages from neighboring nodes [33].
GCN-W: A GCN variant model based on Winsorised Convolution, which we designed, is employed in the ablation study of the experimental section.
GCN-L: A variant of the GCN model based on Latent Adversarial Training.The model is utilized in the ablation study of the experimental section.

Adversarial Attack and Defense Experiments
In the defense process, two main issues need to be addressed: 1) maintaining the performance of graph neural network models on clean samples, and 2) subject to the first issue, minimizing the impact of adversarial attacks on that performance.
Accuracy of the Model before Attack. Because WLGCN uses winsorised mean aggregation, some extreme value information may be discarded during the aggregation process, which may decrease the accuracy of the method. To verify the accuracy of this approach, we ran the node classification task 10 times on each of the three original clean graph datasets, before any adversarial attack, and took the average value. The results are shown in Table 2. The proposed WLGCN method achieves the best performance on both the Cora and Citeseer datasets. Its performance on the Cora-ML dataset is only slightly lower than that of the best-performing model, LATGCN. These results indicate that although the aggregation function discards some extreme values during winsorised mean aggregation, the accuracy of the model has not decreased, and the overall accuracy of the model has been improved by introducing latent adversarial perturbation training in the manifold space $H^{(1)}$.

Robustness of Models after Attack. In adversarial training, to further explore the robustness of different models, we used NETTACK, a potent and inconspicuous graph adversarial attack algorithm, to conduct our experiment. The nodes in all three datasets are predominantly of low degree, as illustrated in Figure 5. Consequently, we devised an attack that imposed 0-9 perturbed edges on each target node, with the addition of 9 perturbed edges regarded as a substantial degree of perturbation. Our aim was to assess the efficacy of the defense algorithm against attacks of varying degrees of perturbation.
The overall robustness of the different models is obtained from the results of adversarial training averaged over 10 runs. The robustness indicator aggregates the classification margins $cm_q$ over the perturbation sizes $q = 0, \dots, 9$, where $q$ is the size of the perturbation and $cm_q$ is the classification margin under the attack perturbation size $q$. The smaller the value of this measure, the stronger the robustness of the corresponding model.
From Table 3, it can be observed that the proposed method is superior to the baseline methods under both direct and indirect attacks, indicating that the proposed improved model has high robustness. The robustness indicator under direct attacks is much larger than that under indirect attacks, which indicates that all models are more susceptible to direct attacks. This is consistent with the previous research results of Zhu et al. [32], showing that attacks that directly manipulate and modify the target node features are more effective than attacks that manipulate other nodes to affect the target node.
In the ablation study, WLGCN demonstrates superior robustness in most cases, except for Cora (Direct) and Cora-ML, where its robustness slightly lags behind GCN-L and GCN-W. The CM metric represents the distance between a target node and the correct classification boundary. Therefore, when a node is correctly classified, the corresponding model output confidence should be higher. On the Citeseer dataset, the proposed WLGCN method outperforms other methods significantly. On the Cora dataset, when the perturbation amount is 1, all models perform well because the minimum degree of the Cora dataset is 2 (including self-loops), making it difficult for a single perturbation to change the model output. When the perturbation amount is greater than 2, the performance of GWNN and other baseline methods drops rapidly, while WLGCN can still maintain high performance. With the increase in perturbation intensity, the CM of different models increases, which can be attributed to the fact that most nodes in the Cora dataset have few neighboring nodes. After a direct attack on the target node, the information aggregated by the aggregation function is more heavily perturbed, leading to lower model robustness. In summary, the proposed method achieves better robustness under adversarial attacks.
2) Global Attack: Among adversarial attack methods targeting graph neural networks, there is a class of attackers who focus on modifying a small number of edges to significantly degrade the overall performance of the graph neural network model, rather than attacking specific nodes. Metattack is an example of such a powerful attack algorithm. In the global attack defense experiments, we adopted Metattack to attack the graph neural network and conducted high-intensity attacks by modifying the proportion of perturbed edges in the network.

In this section, the sensitivity of the WLGCN model to its hyperparameters α, γ, and ε is explored. In each experiment, one hyperparameter was varied while the other hyperparameters were fixed at their optimal values, and the effect of the different settings on the performance of WLGCN was studied. Specifically, the value range of α was adjusted from 0.1 to 0.5, γ from 0 to 0.9, and ε from 0 to 0.6. This experiment uses the three datasets with clean graphs and takes accuracy as the evaluation metric. The performance changes of the model are shown in Figure 9.

Conclusion
Despite achieving impressive performance, graph convolutional neural networks suffer from robustness issues. In this paper, we address the robustness issue of graph convolutional neural networks by investigating the non-robustness of aggregation functions. Inspired by the theory of breakdown point and influence function, we propose to use the more robust winsorised mean aggregation function and incorporate latent adversarial regularization into the $H^{(1)}$ layer of the message passing-based GCN. The robust combinatorial defensive method, named WLGCN, achieves improved robustness against graph attacks without sacrificing classification accuracy. We evaluate the performance of our proposed model under different perturbation costs using the Nettack targeted attack and the Metattack global attack. Extensive experiments on real datasets are conducted to evaluate the model's performance using accuracy and classification margin as evaluation metrics. We also perform parameter sensitivity analysis on the model. The experimental results demonstrate that our proposed method achieves high robustness while maintaining model accuracy.
$X \in [0,1]^{N \times D}$ represents the $D$-dimensional feature vector of each node. The set of nodes is denoted as $V = \{1, 2, \dots, N\}$ and the feature set as $F = \{1, 2, \dots, D\}$. The labels of a subset of nodes $V_L \subseteq V$ are drawn from a set of classes $C = \{1, 2, \dots, c_k\}$.

Figure 5: The degree distributions of the three datasets

In this study, we use Accuracy, Classification Margin (CM) and the variant of the CM to assess the performance of the models.

Figure 6: Training curves of the GCN and WLGCN on the training and validation sets of the three datasets

Model Convergence before Attack. Next, we investigated the convergence behavior of the WLGCN and GCN models during the training process. Specifically, we observed the training and validation performance of the GCN and WLGCN models on the three datasets after each training epoch, as shown in Figure 6. The performance of both GCN and WLGCN becomes stable after 100 epochs on all datasets, indicating that the designed improvements in WLGCN did not affect the convergence speed of the model.

Analysis of Adversarial Training Perturbations.
1) Targeted Attack: In adversarial training, to investigate the robustness of the model under different levels of attack perturbation, this paper records the model classification robustness on different datasets and under different perturbation amounts for the Nettack direct attack, as shown in Figure 7.

Figure 7: The classification margin curves of the models under direct targeted attack

The experiments evaluated the performance of various defense methods based on the node classification accuracy of the model. The results are shown in Figure 8. WLGCN and GCN-W exhibit better overall robustness compared to other defense methods, which shows that the improved Winsorised Conv plays a major role in defending against global attacks.

Figure 8: The accuracy curves of the models under global attack

Table 1: The statistical description of the datasets

Table 2: The classification accuracy

Table 3: The model robustness under direct and indirect attacks (Nettack)