Improvement of Classification Based on Noise and Spectral Dimensionality Reduction for Hyperspectral Image

Hyperspectral image (HSI) classification requires spectral dimensionality reduction and spatial filtering. While common dimensionality reduction and denoising methods use linear algebra, we propose a tensorial method to jointly achieve denoising and dimensionality reduction. First, we propose a new method for pre-whitening (PW) the noise in HSI. Then we propose a method based on quadtree decomposition adapted to tensor data, in order to take the local image characteristics into account in the multi-way Wiener filter (LMWF). The resulting filter performs both noise and spectral dimensionality reduction and is referred to as PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$. The SVM classification algorithm is applied to the output of the dimensionality and noise reduction methods to compare their efficiency: the proposed PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$, PW-MWF$_{dr}$-$(K_1, K_2, P_3)$, and PCA$_{dr}$ and MNF$_{dr}$ associated with Wiener filtering.


Introduction
A hyperspectral image (HSI) is a multidimensional array, also called a tensor, that normally consists of hundreds of spectral bands. HSI data therefore has two spatial dimensions and one spectral dimension. Hyperspectral imaging sensors provide a huge number of spectral bands, typically up to several hundred. This very large dimension not only increases computational complexity but also degrades classification accuracy [1]. Reduction of the spectral dimensionality has proven necessary before applying classification algorithms. Due to their simplicity and ease of use, the most popular dimensionality reduction (DR) approaches are principal component analysis (PCA), independent component analysis (ICA), maximum noise fraction (MNF) and discrete wavelet transform (DWT) [2]. However, those DR methods require a preliminary data arrangement. Indeed, when dealing with tensor data, a first step consists in vectorizing all images, which yields matrix data; this permits the use of signal processing methods but neglects the spatial arrangement. To overcome this, [3] proposes a multichannel mathematical morphology operator-based DR method which incorporates the image representation. In this paper, a tensor-based DR method is proposed to extract spectral principal components by taking into account spatial information.
Moreover, acquired images are unavoidably distorted by additive noise [4,5,6,7], which impairs the useful information and can degrade classification results [8]. As HSIs are normally produced by a series of sensors, the noise mainly comes from two sources: circuitry noise and photonic noise [9]. Although photonic noise has become as dominant as circuitry noise in HSI data collected by new-generation hyperspectral sensors, due to the improved sensitivity of the electronic components [9], the additive circuitry noise is still an important part of the noise. Since the denoising methods for those two types of noise are not the same, we mainly focus on the reduction of the additive circuitry noise in this paper, and the term noise in the following refers only to the additive circuitry noise. To reduce the noise, a HSI is commonly split into vectors or matrices so that any 2D filtering method can be applied, but this splitting does not consider the related information between different bands [10,11]. Therefore, some approaches, such as tensor decomposition methods [12], have been used to remove the noise and have shown some promise in this field [13].
A multi-way Wiener filter (MWF) [14] has been proposed to process a HSI as a whole entity based on the Tucker3 decomposition. In MWF, the filter in each mode is computed as a function of the filters in the other modes, which reflects its capability to jointly exploit the information in each mode of the multidimensional data. This model has been successfully applied to the reduction of white noise. In practice, HSIs are always distorted by non-white noise [15], but the MWF method cannot handle colored noise.
In this paper, a 3-dimensional pre-whitening method (PW) that turns the colored noise of a HSI into white noise is proposed. MWF can then be used to filter the whitened HSI (PW-MWF). Although MWF and PW-MWF preserve the data structure of the HSI, they also have some negative side effects: in practice, the MWF generally provides a blurry restored tensor, because it does not consider local details. In order to preserve edges, it is necessary to apply the filtering (PW-LMWF) on the homogeneous parts of the HSI. Whereas a fixed-size window may not cover homogeneous parts of the image, a quadtree decomposition adapted to HSI makes it possible to process homogeneous blocks successively.
Since reducing spectral dimensionality is an important issue in the HSI processing field [16,17] for improving classification, we propose in this paper a multilinear algebra-based DR method obtained by integrating spectral DR into PW-LMWF-$(K_1, K_2, K_3)$. This new tool is referred to as PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$, where $P_3$ represents the number of spectral principal components. PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ is proposed to reduce non-white noise and spectral dimensionality simultaneously while preserving the local image characteristics, and hence to improve the classification performance. Experiments on simulated and real-world images present the classification performance after denoising by PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$, compared to the most popular DR approaches, i.e., principal component analysis, referred to as PCA$_{dr}$, and maximum noise fraction, referred to as MNF$_{dr}$, both associated with Wiener filtering, and to the PW-MWF$_{dr}$ method.
The remainder of the paper is organized as follows: Section 2 introduces some basic knowledge about multilinear algebra. Section 3 introduces the signal model. Section 4 presents the proposed method and its formulation of the classical noise reduction problem. Section 5 introduces the proposed spectral dimensionality reduction method, PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$. Section 6 contains some comparative results concerning the performance of the support vector machine (SVM) classifier [18] when it is applied after denoising and/or dimensionality reduction of the HSI. Section 7 concludes the paper. Within the scope of this paper, a scalar is denoted by $x$, a vector by $\mathbf{x}$, a matrix by $\mathbf{X}$ and a tensor by $\mathcal{X}$.

Multilinear algebra tools
In the following, some basic multilinear algebra tools used in tensor decompositions are introduced.

n-mode unfolding
$\mathbf{X}_n \in \mathbb{R}^{I_n \times M_n}$ ($n = 1, 2, 3$) denotes the n-mode unfolding matrix of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$. It has size $I_n \times M_n$, where $M_n = I_p I_q$ with $p, q \neq n$ ($p, q \in \{1, 2, 3\}$). The columns of $\mathbf{X}_n$ are the $I_n$-dimensional vectors obtained from $\mathcal{X}$ by varying the index $i_n$ while keeping the other indices fixed [14].
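As an illustration of the unfolding just defined, the following minimal NumPy sketch builds the n-mode unfolding of a small tensor and folds it back. The helper names `unfold` and `fold` are hypothetical, modes are 0-based, and the column ordering is one possible convention (other orderings appear in the literature and only permute the columns).

```python
import numpy as np

def unfold(tensor, mode):
    """Return the n-mode unfolding (0-based mode) as an (I_n, M_n) matrix."""
    # Bring the chosen mode to the front, then flatten the remaining modes.
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def fold(matrix, mode, shape):
    """Inverse of unfold: rebuild a tensor of the given shape."""
    rest = tuple(s for i, s in enumerate(shape) if i != mode)
    return np.moveaxis(matrix.reshape((shape[mode],) + rest), 0, mode)

X = np.arange(24, dtype=float).reshape(2, 3, 4)   # toy tensor, I1=2, I2=3, I3=4
X3 = unfold(X, 2)                                 # 3-mode unfolding, shape (4, 6)
```

Each column of `X3` is a spectral vector of one pixel, which is the arrangement used by the spectral processing steps later in the paper.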

n-mode product
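An n-mode product is defined as the product between a data tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ and a matrix $\mathbf{B} \in \mathbb{R}^{J \times I_n}$ in mode $n$, and is used to extend the matrix singular value decomposition. The result $\mathcal{X} \times_n \mathbf{B}$ is of size $I_1 \times \cdots \times I_{n-1} \times J \times I_{n+1} \times \cdots \times I_N$ and its entries are
$$(\mathcal{X} \times_n \mathbf{B})_{i_1 \cdots i_{n-1}\, j\, i_{n+1} \cdots i_N} = \sum_{i_n = 1}^{I_n} x_{i_1 \cdots i_n \cdots i_N}\, b_{j, i_n},$$
where $b_{j, i_n}$ denotes the $(j, i_n)$ element of matrix $\mathbf{B}$ and $j = 1, \ldots, J$.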
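The n-mode product can be sketched with a single tensor contraction. The function name `mode_product` is hypothetical and modes are 0-based; the check against `np.einsum` is only there to illustrate the definition on a toy example.

```python
import numpy as np

def mode_product(tensor, matrix, mode):
    """n-mode product: contract matrix columns with the tensor's `mode` axis."""
    # tensordot puts the new axis (of size J) first; move it back to `mode`.
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(out, 0, mode)

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 3, 4))      # toy tensor
B = rng.normal(size=(5, 3))         # acts on the second mode (size 3 -> 5)
Y = mode_product(X, B, 1)           # shape (2, 5, 4)
Y_ref = np.einsum("ijk,lj->ilk", X, B)
```

Equivalently, the n-mode unfolding of the result equals the matrix multiplied by the n-mode unfolding of the input tensor.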

Data model
A noisy HSI is modeled as a tensor $\mathcal{R} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$, with $I_1 \times I_2$ pixels and $I_3$ spectral bands, resulting from a pure HSI $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ impaired by an additive noise tensor $\mathcal{N} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$. The tensor $\mathcal{R}$ can be expressed as
$$\mathcal{R} = \mathcal{X} + \mathcal{N}. \qquad (1)$$
The n-mode flattened matrix $\mathbf{R}_n$ of tensor $\mathcal{R}$ is defined as a matrix of $\mathbb{R}^{I_n \times M_n}$, with $M_n = I_p I_q$ and $p, q \neq n$. The columns of $\mathbf{R}_n$ are $I_n$-dimensional vectors obtained from $\mathcal{R}$ by varying the index $i_n$ and keeping the other indices fixed. These vectors are called n-mode vectors. According to (1), the n-mode flattened matrix $\mathbf{R}_n$ can be expressed as
$$\mathbf{R}_n = \mathbf{X}_n + \mathbf{N}_n.$$
There are several approaches to filtering multidimensional data. A common one is to consider the modes of the tensor data as separable, so that classical 1D or 2D methods can be applied. However, this can lead to a loss of inter-dimension relationships. An interesting approach uses hybrid filtering relying on the decorrelation of channels.
In this paper, we propose a tensor method based on the Tucker3 model, which makes it possible to process the tensor data as a whole entity.

Proposed denoising method
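Pre-whitening procedure
If the noise in the HSI is colored, the noise covariance matrix of the n-mode unfolding matrix $\mathbf{N}_n$ satisfies
$$E[\mathbf{N}_n \mathbf{N}_n^T] \neq \sigma^2 \mathbf{I},$$
where $\sigma^2$ is the variance of the corresponding white noise, $\mathbf{I}$ is an identity matrix and the superscript $T$ denotes the transpose. Consequently, MWF cannot effectively remove this type of noise and estimate the expected signal $\mathcal{X}$. In this paper, we propose a method to turn the colored (non-white) noise in $\mathcal{R}$ into white noise, so that MWF can be used effectively to denoise the whitened data tensor $\mathcal{R}_w$. A whitening matrix $\mathbf{P}_n$ can be applied to $\mathcal{R}$. The matrix $\mathbf{P}_n$ is given by
$$\mathbf{P}_n = \boldsymbol{\Lambda}_n^{-1/2} \mathbf{V}_n^T,$$
where $\mathbf{V}_n$ is the orthonormal n-mode matrix holding the eigenvectors and $\boldsymbol{\Lambda}_n$ is the matrix of the corresponding eigenvalues of the noise covariance matrix $E[\mathbf{N}_n \mathbf{N}_n^T]$. In the non-white noise case, we consider the unfolding matrix $\mathbf{R}_n = \mathbf{X}_n + \mathbf{N}_n$ and substitute $\mathbf{P}_n \mathbf{R}_n$ for $\mathbf{R}_n$; then
$$\mathbf{R}_w = \mathbf{P}_n \mathbf{X}_n + \mathbf{P}_n \mathbf{N}_n \qquad (8)$$
with the assumption that the signal is independent of the non-white noise. The covariance matrix of the transformed noise is therefore
$$E[\mathbf{P}_n \mathbf{N}_n \mathbf{N}_n^T \mathbf{P}_n^T] = \mathbf{P}_n E[\mathbf{N}_n \mathbf{N}_n^T] \mathbf{P}_n^T = \mathbf{I},$$
that is to say, the non-white (colored) noise has been whitened. Thus the MWF algorithm can be applied to the whitened unfolding data matrix $\mathbf{P}_n \mathbf{R}_n$. To get the estimated signal $\hat{\mathcal{X}}$, an inverse whitening step is necessary after the denoised image has been obtained.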
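The whitening step can be illustrated numerically. The sketch below assumes that the noise covariance of one unfolding is available (here it is estimated from a synthetic noise sample, which is an assumption of this example and not a procedure described in the paper), builds a whitening matrix of the standard form $\boldsymbol{\Lambda}^{-1/2}\mathbf{V}^T$ from its eigendecomposition, and checks that the transformed noise has an identity covariance. The function name `whitening_matrices` is hypothetical.

```python
import numpy as np

def whitening_matrices(noise_cov, eps=1e-12):
    """Return (P, P_inv) with P = Lambda^{-1/2} V^T from a symmetric covariance."""
    eigvals, eigvecs = np.linalg.eigh(noise_cov)
    eigvals = np.maximum(eigvals, eps)              # guard against tiny eigenvalues
    P = np.diag(eigvals ** -0.5) @ eigvecs.T
    P_inv = eigvecs @ np.diag(eigvals ** 0.5)       # used for the inverse whitening step
    return P, P_inv

rng = np.random.default_rng(0)
I3, M = 6, 20000
A = rng.normal(size=(I3, I3))                       # mixing matrix that colors the noise
N3 = A @ rng.normal(size=(I3, M))                   # synthetic colored noise, 3-mode unfolded
C = N3 @ N3.T / M                                   # sample noise covariance (not sigma^2 * I)

P, P_inv = whitening_matrices(C)
N3_w = P @ N3                                       # whitened noise
C_w = N3_w @ N3_w.T / M                             # close to the identity matrix
```

After filtering the whitened data, multiplying by `P_inv` would undo the whitening, which corresponds to the inverse process mentioned above.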

Multi-way Wiener filtering with preserving local image characteristics
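The noisy HSI tensor can be expressed as
$$\mathcal{R} = \mathcal{X} + \mathcal{N}.$$
The MWF filter aims at estimating the desired signal $\mathcal{X}$ from the data tensor $\mathcal{R}$ using multilinear algebra tools [19]:
$$\hat{\mathcal{X}} = \mathcal{R} \times_1 \mathbf{H}^{(1)} \times_2 \mathbf{H}^{(2)} \times_3 \mathbf{H}^{(3)}, \qquad (11)$$
where $\times_n$ is the n-mode product, that is, the matrix product between the n-mode flattening matrix $\mathbf{R}_n$ and the matrix $\mathbf{H}^{(n)}$, $n = 1$ to $3$. Equation (11) represents the n-mode filtering of the data tensor $\mathcal{R}$ by the n-mode filters $\mathbf{H}^{(n)}$, $n = 1$ to $3$. The optimal n-mode filter $\mathbf{H}^{(n)}$ is computed by minimizing the mean squared error
$$e(\mathbf{H}^{(1)}, \mathbf{H}^{(2)}, \mathbf{H}^{(3)}) = E\left[\left\| \mathcal{X} - \mathcal{R} \times_1 \mathbf{H}^{(1)} \times_2 \mathbf{H}^{(2)} \times_3 \mathbf{H}^{(3)} \right\|^2\right].$$
The n-mode filters $\mathbf{H}^{(n)}$ are obtained using an Alternating Least Squares (ALS) algorithm.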
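Thanks to this procedure, any filter along a given mode depends on the filters along all other modes. In this iterative algorithm, the n-mode filters are initialized to the corresponding identity matrices. With every m-mode filter $\mathbf{H}^{(m)}$ fixed, $m \neq n$, the expression of the optimal n-mode filter $\mathbf{H}^{(n)}$ is [19]:
$$\mathbf{H}^{(n)} = \boldsymbol{\gamma}_{XR}^{(n)} \left(\boldsymbol{\Gamma}_{RR}^{(n)}\right)^{-1}, \qquad (13)$$
where
$$\boldsymbol{\gamma}_{XR}^{(n)} = E\left[\mathbf{X}_n \mathbf{T}^{(n)} \mathbf{R}_n^T\right] \quad \text{with} \quad \mathbf{T}^{(n)} = \bigotimes_{m \neq n} \mathbf{H}^{(m)},$$
$$\boldsymbol{\Gamma}_{RR}^{(n)} = E\left[\mathbf{R}_n \mathbf{Q}^{(n)} \mathbf{R}_n^T\right] \quad \text{with} \quad \mathbf{Q}^{(n)} = \bigotimes_{p \neq n} \mathbf{H}^{(p)T} \mathbf{H}^{(p)},$$
where $m \neq n$, $p \neq n$ and $\otimes$ denotes the Kronecker product. By assuming that $\mathbf{X}_n$ can be expressed as a linear weighted combination of the $K_n$ vectors associated with the largest eigenvalues of $E[\mathbf{R}_n \mathbf{R}_n^T]$, the optimal n-mode filter $\mathbf{H}^{(n)}$ is expressed as [19]:
$$\mathbf{H}^{(n)} = \mathbf{V}_s^{(n)} \boldsymbol{\Lambda}^{(n)} \mathbf{V}_s^{(n)T},$$
where $\mathbf{V}_s^{(n)}$ holds the $K_n$ selected eigenvectors and
$$\boldsymbol{\Lambda}^{(n)} = \mathrm{diag}\left\{ \frac{\lambda_1^{\gamma} - \sigma_{\gamma}^{(n)2}}{\lambda_1^{\Gamma}}, \ldots, \frac{\lambda_{K_n}^{\gamma} - \sigma_{\gamma}^{(n)2}}{\lambda_{K_n}^{\Gamma}} \right\},$$
in which $\{\lambda_i^{\gamma}\}$ and $\{\lambda_i^{\Gamma}\}$ are the $K_n$ largest eigenvalues of $\boldsymbol{\gamma}_{RR}^{(n)} = E[\mathbf{R}_n \mathbf{T}^{(n)} \mathbf{R}_n^T]$ and $\boldsymbol{\Gamma}_{RR}^{(n)}$ respectively, and $\sigma_{\gamma}^{(n)2}$ is estimated by computing the average of the $I_n - K_n$ smallest eigenvalues of $\boldsymbol{\gamma}_{RR}^{(n)}$. The estimate of $\mathbf{H}^{(n)}$ can thus be computed from the tensor data $\mathcal{R}$.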
The computation of the n-mode filters $\mathbf{H}^{(n)}$, $n = 1$ to $3$, involves the n-mode rank values $K_1$, $K_2$ and $K_3$. These values are estimated using the criterion [20].
In practice, the MWF generally provides a blurry restored tensor, because it does not consider local details. In order to preserve edges, it is necessary to apply the filtering on the homogeneous parts of the HSI. Whereas a fixed-size window may not cover homogeneous parts of the image, a quadtree decomposition adapted to HSI makes it possible to process homogeneous blocks successively. Their sizes are linked with the local characteristics of the image. Quadtree decomposition has often been used to represent the underlying structure of digital data [21]. A quadtree decomposition is based on the recursive regular decomposition of space into blocks whose side lengths are powers of two. The decomposition starts from a $T \times T$ block, where $T$ is a power of two, and divides the array into quadrants if the image is not homogeneous. Each sub-block is then recursively processed in the same way, providing a decomposition in which every block is homogeneous. In this paper, we adapt the quadtree decomposition to improve the restoration of details after noise removal. The approach consists in filtering homogeneous regions separately to preserve local characteristics. The decision function associated with the split homogeneity test relies on the variance of each image block $W$ to measure its homogeneity [21]:
$$\sigma_W^2 = \frac{1}{N_{pix}} \sum_{k=1}^{N_{pix}} (p_k - m_W)^2,$$
where $\sigma_W^2$ denotes the variance of the $N_{pix}$ pixel intensities $p_k$ in the block $W$ with mean value $m_W$. Comparing this variance to an experimentally fixed a priori threshold decides whether or not to split a block into four sub-blocks. The Local Multidimensional Wiener Filtering (LMWF) method can be summarized as follows:
1. For each mode: decompose the tensor into homogeneous sub-blocks using a variance-based quadtree method.
2. Flatten each sub-block along the mode.
3. Compute LMWF using Equation (11).
4. Compute the average filtered HSI.
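The variance-based split test can be sketched on a single 2D image as follows. This is a minimal illustration, not the authors' implementation: the function name `quadtree_blocks`, the `min_size` stopping rule and the toy image are assumptions, and how the decomposition is carried over to each mode of the tensor is not reproduced here.

```python
import numpy as np

def quadtree_blocks(image, threshold, min_size=2):
    """Split a T x T image (T a power of two) into homogeneous square blocks.

    Returns a list of (row, col, size) tuples. A block is split into four
    quadrants while its pixel variance exceeds `threshold`.
    """
    T = image.shape[0]
    assert image.shape == (T, T) and (T & (T - 1)) == 0, "T must be a power of two"
    blocks, stack = [], [(0, 0, T)]
    while stack:
        r, c, s = stack.pop()
        block = image[r:r + s, c:c + s]
        # np.var uses the 1/N_pix normalisation, as in the formula above.
        if s > min_size and block.var() > threshold:
            h = s // 2
            stack.extend([(r, c, h), (r, c + h, h), (r + h, c, h), (r + h, c + h, h)])
        else:
            blocks.append((r, c, s))
    return blocks

image = np.zeros((8, 8))
image[:4, :4] = 1.0                      # one bright quadrant, otherwise flat
blocks = quadtree_blocks(image, threshold=0.01)
```

On this toy image the root block is inhomogeneous and is split once; each of the four quadrants is constant, so no further split occurs. Each returned block could then be filtered separately.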
In the next section we show how PW-LMWF can be used to improve the classification. It simultaneously reduces the non-white noise and the spectral dimensionality, which improves the results of classification algorithms.

Spectral dimensionality reduction
In the HSI context, we are interested in reducing the number of spectral bands by selecting the most significant spectral features in order to improve classification. The principles of PCA are the following: $I_3$ images of full size $I_1 \times I_2$ are considered. Each image is transformed into a vector by row concatenation. The data tensor $\mathcal{R} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$, composed of all $I_3$ images as slice matrices, becomes a matrix $\mathbf{R}_3 \in \mathbb{R}^{I_3 \times p}$, where $p = I_1 I_2$. The aim of DR is to extract a small number $P_3 < I_3$ of features, called principal components (PCs). The $P_3$ PCs therefore generate a reduced matrix $\mathbf{Z} \in \mathbb{R}^{P_3 \times p}$,
$$\mathbf{Z} = \boldsymbol{\Lambda}^{-1/2} \mathbf{V}_s^{(3)T} \mathbf{R}_3,$$
where $\mathbf{V}_s^{(3)}$ is a matrix holding the $P_3$ selected eigenvectors and $\boldsymbol{\Lambda}$ is the diagonal eigenvalue matrix holding the $P_3$ largest eigenvalues. The data can be reshaped as a tensor $\mathcal{Z} \in \mathbb{R}^{I_1 \times I_2 \times P_3}$. In tensor formulation [19], the matrix $\mathbf{R}_3$ is the 3-mode flattened matrix of $\mathcal{R}$, and the previously obtained matrix $\mathbf{Z}$ is the 3-mode flattened matrix of $\mathcal{Z}$. Then $\mathcal{Z}$ can be written
$$\mathcal{Z} = \mathcal{R} \times_3 \boldsymbol{\Lambda}^{-1/2} \mathbf{V}_s^{(3)T}.$$
In the same way, we can turn PW-LMWF-$(K_1, K_2, K_3)$ into a spectral dimensionality reduction tool. This tool is referred to as PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ in this paper, where $P_3$ represents the number of spectral principal components. PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ extracts $P_3$ spectral PCs in order to obtain the three-way array $\mathcal{Z} \in \mathbb{R}^{I_1 \times I_2 \times P_3}$. The aim of LMWF$_{dr}$-$(K_1, K_2, P_3)$ is to jointly reduce the dimensionality of the spectral mode and project the information along the spatial modes onto lower $(K_1, K_2)$-dimensional subspaces. The latter processing compresses and spatially denoises the data.
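The PCA-based spectral reduction can be sketched through the 3-mode unfolding. In this minimal example the function name `pca_dr`, the mean-centring of each band and the use of the sample covariance of the unfolded data are assumptions made for illustration; the scaling by the inverse square root of the eigenvalues follows the whitened form of the principal components.

```python
import numpy as np

def pca_dr(R, P3):
    """Reduce the spectral mode of R (I1 x I2 x I3) to P3 principal components."""
    I1, I2, I3 = R.shape
    R3 = R.reshape(-1, I3).T                        # 3-mode unfolding, shape (I3, I1*I2)
    R3 = R3 - R3.mean(axis=1, keepdims=True)        # centre each band (assumption)
    cov = R3 @ R3.T / R3.shape[1]                   # spectral covariance, (I3, I3)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:P3]          # indices of the P3 largest eigenvalues
    V_s, lam = eigvecs[:, order], eigvals[order]
    Z3 = np.diag(lam ** -0.5) @ V_s.T @ R3          # reduced matrix, shape (P3, I1*I2)
    return Z3.T.reshape(I1, I2, P3), Z3

rng = np.random.default_rng(0)
R = rng.normal(size=(8, 8, 10))                     # toy HSI: 8 x 8 pixels, 10 bands
Z, Z3 = pca_dr(R, P3=3)
```

The components in `Z3` are decorrelated with unit variance, and `Z` is the reduced tensor that a classifier would receive in place of the full spectral cube.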
The PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ model reads
$$\mathcal{Z} = \mathcal{R} \times_1 \mathbf{H}^{(1)} \times_2 \mathbf{H}^{(2)} \times_3 \boldsymbol{\Lambda}^{-1/2} \mathbf{V}_s^{(3)T},$$
where $\mathbf{H}^{(n)}$ is the filter for the n-mode, defined in Equation (13). First, LMWF$_{dr}$-$(K_1, K_2, P_3)$ jointly uses spatial and spectral information to extract the spectral principal components. Secondly, PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ denoises the extracted spectral principal components thanks to the estimated spatial projectors $\mathbf{H}^{(n)}$, $n = 1, 2$, and the pre-whitening procedure.
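The structure of this joint model, two spatial n-mode products followed by a spectral reduction, can be sketched as below. This is a deliberately simplified stand-in and not the proposed filter: the spatial filters are replaced by orthogonal projectors onto the $K_n$ leading eigenvectors of each unfolding (no Wiener weights, no ALS iterations, no pre-whitening, no quadtree), and all function names are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    return np.moveaxis(np.tensordot(matrix, tensor, axes=(1, mode)), 0, mode)

def leading_eig(sym, k):
    """k largest eigenvalues and eigenvectors of a symmetric matrix."""
    w, V = np.linalg.eigh(sym)
    idx = np.argsort(w)[::-1][:k]
    return w[idx], V[:, idx]

def joint_dr_sketch(R, K1, K2, P3):
    Z = R
    for mode, K in ((0, K1), (1, K2)):
        Rn = unfold(R, mode)
        _, V = leading_eig(Rn @ Rn.T, K)
        Z = mode_product(Z, V @ V.T, mode)          # projector standing in for H^(n)
    R3 = unfold(R, 2)
    lam, V3 = leading_eig(R3 @ R3.T / R3.shape[1], P3)
    return mode_product(Z, np.diag(lam ** -0.5) @ V3.T, 2)   # spectral reduction to P3

rng = np.random.default_rng(0)
R = rng.normal(size=(8, 8, 12))                     # toy data tensor
Z = joint_dr_sketch(R, K1=4, K2=4, P3=3)            # shape (8, 8, 3)
```

The spatial size is preserved while the spectral mode shrinks from 12 to 3, which is the shape behaviour described for the three-way array above.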

Experiments
In this section, we focus on the classification results obtained after denoising by PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ and other methods. Two real-world images are considered for this investigation. The first one, referred to as HYDICE HSI, was acquired by the HYperspectral Digital Imagery Collection Experiment (HYDICE) and has 148 spectral bands (from 435 to 2326 nm), 310 rows and 220 columns. The scene is shown in figure 1 (a). This HSI is modeled as a tensor $\mathcal{R} \in \mathbb{R}^{310 \times 220 \times 148}$ and its ground truth is shown in figure 1 (b). According to the ground truth, there are 7 land cover classes in the HYDICE HSI: field, trees, road, shadow and three different targets. The second one, referred to as AVIRIS HSI, was collected by the airborne visible/infrared imaging spectrometer (AVIRIS) from a mixed forest/agricultural site at the Indian Pine test site in north-west Indiana. This image was taken by the National Aeronautics and Space Administration (NASA)/Jet Propulsion Laboratory. The raw image size is $145 \times 145 \times 220$ ($I_1 = 145$, $I_2 = 145$, $I_3 = 220$) (figure 1 (c)). This HSI can be represented as a tensor $\mathcal{R} \in \mathbb{R}^{145 \times 145 \times 220}$ and its ground truth is shown in figure 1 (d). According to the ground truth, there are 16 land cover classes in the AVIRIS HSI: corn-min, hay-windrowed, stone-steel towers, woods, wheat, soybean-clean, oats, soybean-notill, corn, bldg-grass-tree-drives, alfalfa, corn-notill, grass/trees, grass/pasture, grass/pasture-mowed, soybeans-min.
To compare the classification results quantitatively, the overall accuracy (OA), expressed as a percentage, is defined as follows: for $P$ classes $C_i$ ($i = 1, \ldots, P$), if $a_{i,j}$ is the number of test samples that actually belong to class $C_i$ and are classified into $C_j$ ($j = 1, \ldots, P$), then
$$\mathrm{OA} = \frac{1}{M} \sum_{i=1}^{P} a_{i,i},$$
where $M$ is the total number of samples, $P$ is the number of classes $C_i$ and $a_{i,i} = a_{i,j}$ for $i = j$. To evaluate the denoising results quantitatively, the signal-to-noise ratio of the image after denoising, also called the output SNR, is defined as
$$\mathrm{SNR}_{out} = 10 \log_{10} \frac{\|\mathcal{X}\|^2}{\|\hat{\mathcal{X}} - \mathcal{X}\|^2} \ \text{(dB)},$$
where $\hat{\mathcal{X}}$ is the estimated signal after denoising. Correspondingly, the input SNR is defined as
$$\mathrm{SNR}_{in} = 10 \log_{10} \frac{\|\mathcal{X}\|^2}{\|\mathcal{N}\|^2} \ \text{(dB)}.$$
The experiment shows the ability of PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ as a DR method and its robustness in the presence of non-white noise.
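Both criteria are straightforward to compute. The following sketch assumes a confusion matrix whose entry $(i, j)$ counts the test samples of class $C_i$ assigned to class $C_j$; the function names and the toy numbers are illustrative only.

```python
import numpy as np

def overall_accuracy(confusion):
    """OA in percent: correctly classified samples over all test samples."""
    confusion = np.asarray(confusion, dtype=float)
    return 100.0 * np.trace(confusion) / confusion.sum()

def snr_db(reference, estimate):
    """10 log10(||X||^2 / ||X_hat - X||^2) in dB."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum((estimate - reference) ** 2))

confusion = [[8, 2],
             [1, 9]]                      # toy 2-class confusion matrix, 20 samples
oa = overall_accuracy(confusion)          # 17 correct out of 20

X = np.ones(10)
snr = snr_db(X, X + 0.1)                  # error energy is 1/100 of the signal energy
```

The same `snr_db` function gives the input SNR when the noisy tensor is passed as `estimate`, since the difference is then the noise itself.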

Classification of real-world HSI
PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ is tested on real-world HSIs, which are impaired by additive noise because of some properties of the imaging system. This experiment evaluates the necessity of denoising real-world data [22] to improve the classification results. For this experiment, $(K_1, K_2, K_3) = (39, 39, 85)$ are used to apply PW-LMWF-$(K_1, K_2, K_3)$.
Table 1 shows the SVM classification results and the SNR$_{out}$ of the images denoised by the PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$, PW-MWF$_{dr}$, PCA$_{dr}$, PCA$_{dr}$-Wiener, MNF$_{dr}$ and MNF$_{dr}$-Wiener methods. The OA of the SVM classification obtained from the raw HYDICE HSI is 93.49%, and 81.69% from the raw AVIRIS HSI. Both the OA values and the SNR$_{out}$ results therefore show that denoising and dimensionality reduction are necessary preprocessing steps. The table also shows the usefulness of DR: DR increases the OA for each filtering method, and for this HSI PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ outperforms PW-LMWF and the other methods in denoising and in preserving the local characteristics of objects in the HSI. The OA values indicate that denoising is a necessary preprocessing step. However, as the noise in this HYDICE HSI is not obvious, in the next experiment we test the noise robustness of the proposed method on HSIs containing stronger non-white noise than the previous images.

Classification in noisy environment
In this experiment, we test the noise robustness of the PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ method, and the results highlight the advantage of applying this method to HSIs before classification. For this purpose, non-white noise [15] is added to the real-world HSI data, with SNR$_{in}$ varying from 15 to 40 dB. We compare the classification results after applying our proposed method and other denoising methods.
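A noisy test tensor with a prescribed input SNR can be generated as sketched below. The noise model of [15] is not described in this section, so the example assumes, purely for illustration, Gaussian noise whose standard deviation varies from band to band (which already makes the spectral noise covariance differ from a scaled identity). The function name `add_noise_at_snr` and the `band_profile` argument are hypothetical.

```python
import numpy as np

def add_noise_at_snr(X, snr_in_db, band_profile, rng):
    """Add band-dependent Gaussian noise scaled to reach the requested SNR_in (dB)."""
    noise = rng.normal(size=X.shape) * band_profile          # profile broadcasts over bands
    target_energy = np.sum(X ** 2) / 10 ** (snr_in_db / 10)  # required ||N||^2
    noise *= np.sqrt(target_energy / np.sum(noise ** 2))
    return X + noise, noise

rng = np.random.default_rng(0)
X = rng.uniform(size=(4, 4, 5))                  # toy clean tensor, 5 bands
profile = np.linspace(0.5, 2.0, 5)               # hypothetical per-band noise level
R, N = add_noise_at_snr(X, snr_in_db=20.0, band_profile=profile, rng=rng)
snr_in = 10 * np.log10(np.sum(X ** 2) / np.sum(N ** 2))
```

Sweeping `snr_in_db` from 15 to 40 would reproduce the range of noise levels considered in this experiment, under the stated noise assumption.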
Figure 2 shows the OA values obtained from the denoised HSI and shows that the PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ method jointly reduces the spectral dimension and the noise, which is of great interest for the SVM classifier. The comparison of the OA values calculated for each DR and denoising preprocessing shows that the multilinear algebra-based DR method PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ leads to better classification results than the PCA$_{dr}$-Wiener, PW-MWF$_{dr}$ and MNF$_{dr}$-Wiener methods considered in this experiment. Figure 3 shows the results obtained after denoising by different methods, where 16 classes are used for the classification of the AVIRIS HSI. It can be seen that 2D Wiener filtering applied to the spectral components obtained by PCA improves the OA value. This experiment shows that tensor methods can do better than 2D methods as a preprocessing procedure for classification, and that the PW-LMWF$_{dr}$ method has a significant advantage over PW-MWF and 2D methods.
Figure 4 presents the OA values obtained from the denoised HYDICE HSI and AVIRIS HSI. It also shows that PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ performs much better than PW-MWF and the other methods, particularly when the SNR$_{in}$ value is low. All these results demonstrate the advantage of using the proposed method. This advantage is much more significant with the AVIRIS HSI, where the relevant features are localized in some regions of the image. In the presence of such small local features, the proposed method can be expected to provide better results than the other methods, because PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ simultaneously reduces the spectral dimension and the dimensions of the spatial subspaces while preserving local image characteristics, which is of great interest for the SVM classifier.

Conclusion
With the advances in electronic components, the reduction of photonic noise has become an important task in denoising HSI data collected by new-generation hyperspectral sensors. For this case, a novel tensor-based algorithm, called PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$, is proposed for joint noise and dimensionality reduction while preserving local image characteristics. This joint spatial-spectral processing is cross-dependent thanks to the ALS algorithm. To reduce the colored (non-white) noise in HSI, a pre-whitening method is proposed based on a two-stage process (PW-LMWF) composed of a noise-whitening procedure and an LMWF filter. We focused on the ability of PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ as a preprocessing algorithm that improves the SVM classification results on real-world AVIRIS and HYDICE data. Quantitative results based on the OA criterion evaluate the impact of the spatial ranks $(K_1, K_2)$ and compare the performance with selected dimensionality reduction methods. Indeed, in comparison with PCA$_{dr}$, PW-MWF$_{dr}$ and MNF$_{dr}$, PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ extracts spectral components by taking spatial information into account, simultaneously estimating spatial filters to denoise them. The comparison with selected hybrid filters, which perform 2D spatial filtering of the retained spectral components, shows the denoising efficiency of our method when classifying noisy data. From the analysis and the comparative study against other similar methods in the experiments, it can be concluded that the PW-LMWF$_{dr}$-$(K_1, K_2, P_3)$ method can effectively reduce white or colored noise in HSIs. It is also necessary to take the colored noise into account when dealing with HYDICE and AVIRIS data.
These promising results encourage us to extend our experiments to HSIs distorted by both photonic noise and spectrally correlated noise, and to other hyperspectral data, for instance HSIs obtained from new-generation high-resolution hyperspectral sensors.

Figure 2: Spectral dimensionality reduction outcome for classification: (a) raw HYDICE data, OA = 93.49; (b) PCA$_{dr}$-Wiener, OA = 95.60; (c) MNF$_{dr}$-Wiener, OA = 95.66; (d) PW-MWF$_{dr}$, OA = 96.82; (e) PW-LMWF$_{dr}$, OA = 99.88.