Rnrn LI, Shunming LI b,*, Kun XU, Mengjie ZENG, Xinglin LI, Jinfeng GU, Yong CHEN
a School of Energy and Power Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
b School of Jincheng, Nanjing University of Aeronautics and Astronautics, Nanjing 211156, China
c Jiangsu Donghua Testing Technology Co., Jingjiang 214500, China
Abstract Data-driven intelligent fault diagnosis of mechanical equipment is usually premised on big data and class balance. However, owing to limitations of the working environment, operating conditions and equipment status, the fault data collected from mechanical equipment are often scarce and imbalanced relative to the normal samples. To resolve this dilemma faced by the fault diagnosis of practical mechanical equipment, an auxiliary generative mutual adversarial network (AGMAN) is proposed. Firstly, the generator, combined with an auto-encoder (AE), constructs a decoder reconstruction feature loss that assists it in accurately mapping the noise distribution to the real data distribution, generating high-quality fake samples that supplement the imbalanced dataset and improve the accuracy of small-sample class-imbalanced fault diagnosis. Secondly, the discriminator adopts an unshared dual-discriminator structure. Mutual adversarial training between the two discriminators is realized by setting completely opposite scoring criteria for real and fake samples, which improves the quality and diversity of the generated samples and avoids mode collapse. Finally, the auxiliary generator and the dual discriminator are updated alternately. The auxiliary generator can generate fake samples that deceive both discriminators simultaneously, while the dual discriminator cannot assign correct scores to the real and fake samples according to their respective scoring criteria, thereby reaching Nash equilibrium. Verification on datasets from three different test beds shows that the proposed method can generate high-quality fake samples and greatly improves the accuracy of class-imbalanced fault diagnosis under small samples; especially under extreme imbalance, supplementing fake samples with this method yields considerable improvements in the fault diagnosis accuracy of DCNN and SAE. The proposed method therefore provides an effective solution for small-sample class-imbalanced fault diagnosis.
Most mechanical equipment is driven by bearings or gears. Once a failure occurs during operation, the equipment poses great potential safety hazards: in minor cases the equipment is damaged, and in severe cases it is scrapped or casualties occur.1–2 As far as aero-engines are concerned, being the "heart" of the aircraft and high-end rotating machinery, their safety and reliability affect the overall operating state of the aircraft; once a failure occurs, it leads to huge economic losses and even casualties.3–4 Therefore, timely, effective and reliable fault diagnosis of aero-engine bearings and gears is a necessary means of ensuring the safe and reliable operation of aircraft.5
During the operation of an aero-engine, multiple measuring points are generally required for real-time monitoring of key parts, and each measuring point generally monitors the equipment from the beginning of service to the end of its service life. The data collection period is long, and massive data are obtained during monitoring.6–7 However, within this huge amount of data, the number of normal samples collected often far exceeds the number of fault samples, resulting in a class-imbalanced small-sample problem, that is, the number of normal samples (majority class) far exceeds the number of fault samples (minority class) of each class.8–9 Meanwhile, the diagnostic effect of most existing fault diagnosis algorithms relies on a large number of class-balanced samples, that is, the number of normal samples equals the number of fault samples of each class.10–11 Hence, if an existing fault diagnosis algorithm is applied directly to a class-imbalanced small-sample dataset, its diagnostic performance and generalization ability are greatly reduced. Therefore, how to effectively identify minority-class samples in a class-imbalanced small-sample dataset and improve the diagnostic performance and generalization ability of intelligent fault diagnosis algorithms is the top priority for the practical application of fault diagnosis algorithms.
The current mainstream methods for solving class imbalance under small samples can be broadly divided into two categories: methods based on data augmentation and methods based on algorithm optimization. Methods based on data augmentation mainly include the synthetic minority over-sampling technique (SMOTE) and the generative adversarial network (GAN). SMOTE randomly selects a point between each minority-class sample and its nearest neighbor as a newly synthesized minority-class sample.12–15 For example, Zhang et al.16 constructed a novel class-imbalance processing technique for large-scale datasets (SGM) by flexibly combining SMOTE with clustering-based under-sampling using a Gaussian Mixture Model (GMM); the results showed that SGM significantly improved the detection rate of the minority class. Yi et al.17 proposed a minority clustering SMOTE (MC-SMOTE) method, which clusters minority-class samples to improve imbalanced classification performance; verified on various benchmark datasets and real industrial datasets, MC-SMOTE outperforms classic SMOTE. However, owing to the limitation of its principle, SMOTE cannot overcome the marginal distribution problem, and may even confuse the classifier and increase its classification difficulty. By contrast, GAN relies on the adversarial game between a generator and a discriminator, enabling the generator to use random noise to generate fake samples consistent with the distribution of the real samples and thus make up for the shortage of fault samples.18–19 For instance, Guo et al.20 built a multi-label 1-D generative adversarial network (ML1-D-GAN) fault diagnosis framework, which can generate data with good applicability and greatly improves the fault diagnosis accuracy for real bearings. Liu et al.21 proposed an imbalanced fault diagnosis method based on an improved multi-scale residual GAN and a feature-enhancement-driven capsule network; verification experiments showed that the method handles imbalanced fault data well and achieves good diagnostic accuracy. Wang et al.22 proposed an enhanced generative adversarial network (E-GAN), which combines DCGAN with the k-means clustering algorithm to establish an improved CNN diagnostic model for fault classification; the experimental results showed that this method significantly improves classification accuracy. Although GAN overcomes SMOTE's marginal distribution problem, when the training data are insufficient, its structural limitations make the dynamic adversarial process unstable and may even cause mode collapse; the quality and diversity of the generated samples are then difficult to guarantee, and the diagnostic performance of the model may not improve.
Methods based on algorithm optimization start directly from the class-imbalanced dataset, increasing the classifier's attention to the minority class by optimizing the algorithm structure and the loss function, thereby reducing the misjudgment probability of the minority class and improving the diagnostic accuracy of the model. For example, Qian et al.23 proposed balanced sparse filtering (BSF) for feature extraction, which addresses class imbalance in both feature extraction and classification. Xu et al.24 presented a discrete-peak joint attention enhancement (DPJAE) convolutional model that successfully solves transfer fault diagnosis under imbalanced samples. Peng et al.25 developed a new form of the bidirectional gated recurrent unit (BGRU) to support effective and efficient fault diagnosis using cost-sensitive active learning. He et al.26 put forward a novel tensor classifier called the support tensor machine with dynamic penalty factors (DC-STM) and applied it to the fault diagnosis of imbalanced rotating machinery, achieving better classification results when the training set is imbalanced. Although the above fault diagnosis algorithms achieve good results for class imbalance, they share a major defect: they are designed for specific imbalance-ratio scenarios. Once the preset imbalance ratio changes greatly, the diagnostic ability of the algorithm is greatly reduced or even fails entirely. Hence, the generalization ability of algorithm-optimization methods is insufficient, and they cannot flexibly adapt to actual conditions with variable imbalance ratios.
In summary, to fundamentally solve the problem of class imbalance under small samples and meet the practical application requirements of fault diagnosis algorithms, data augmentation is undoubtedly the most effective route. To make up for the deficiencies of the above data augmentation methods for class-imbalanced fault diagnosis, this paper proposes an AGMAN model. The proposed method uses the powerful feature reconstruction ability of the AE and the mutually opposite scoring criteria of the dual discriminator to design the generator and the discriminator, which improves the quality and diversity of the generated samples, avoids mode collapse and ensures training stability. The dual discriminator is designed to solve mode collapse through the mutual adversarial mode while acting on the generator, prompting the generator to map fake samples close to the real sample distribution. The training of the auxiliary generator is guided by the decoding reconstruction feature loss and the fake-sample scores of the dual discriminator, rather than by a single discriminator's fake-sample classification probability; these two constraints doubly guarantee the quality of the samples generated by the generator. The contributions of this paper are as follows:
(1) The AE and a multi-layer fully connected network are combined to form the auxiliary generator, and the decoding reconstruction feature loss is used to help the generator accurately map the noise distribution to the real data distribution, improving the stability of the model training process and generating high-quality fake samples.
(2) An unshared dual-discriminator structure is integrated and combined with the auxiliary generator to form the AGMAN model. The three-player game between the generator and the dual discriminator avoids the mode collapse of the traditional GAN and improves the quality and diversity of the generated samples.
(3) Experiments are performed on datasets from three different test beds. Quantitative evaluation indexes and qualitative visualization analyses both demonstrate that the proposed method can generate high-quality fake samples and expand the original class-imbalanced dataset. Compared with directly using a small-sample class-imbalanced dataset as the training set for fault diagnosis, the proposed method substantially improves the effect of fault diagnosis, with high diagnostic accuracy, good stability and strong generalization.
The rest of this article is organized as follows: The second part introduces the theoretical background of the proposed method from two aspects, the stacked auto-encoder (SAE) and the dual discriminator generative adversarial network. The third part introduces the proposed method in detail, including the motivation, the basic framework, the training steps and model evaluation. The fourth part presents the experimental verification of the proposed method, described from three aspects: datasets description, sample generation and quality evaluation, and class-imbalanced fault diagnosis. The last part summarizes the whole paper, including the advantages of the proposed method and its effect on class-imbalanced fault diagnosis under small samples, and discusses its remaining limitations.
The AGMAN method proposed in this paper is based on the stacked auto-encoder network and the dual discriminator generative adversarial network. Therefore, before the detailed introduction of the method, these two components are briefly reviewed.
The AE is a simple unsupervised neural network model.27 Owing to its unique structural design, its main function is to reproduce the input signal X, that is, to make the output X̂ equal to the input X; its basic composition is shown in Fig. 1. It consists of an encoder and a decoder. The encoder performs feature extraction on the input signal X; its encoding function is denoted as f(X) and its output feature as Y. The decoder reconstructs the feature Y; its decoding function is denoted as g(Y), and it outputs the reconstructed signal X̂. In addition, regularization terms can be added as required to make the model sparser, simplify it and prevent over-fitting. The reconstruction loss is
where N is the number of samples, μ is the regularization weight parameter, and ω is the weight parameter.
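The loss formula itself appears to have been lost in typesetting. A standard form consistent with the symbols defined above (mean squared reconstruction error plus an L2 weight-regularization term) would be, as an assumption rather than the authors' exact expression:

```latex
L_{\mathrm{AE}} = \frac{1}{N}\sum_{i=1}^{N}\left\| X_i - \hat{X}_i \right\|^2 + \mu\,\|\omega\|^2,
\qquad \hat{X}_i = g\!\left(f(X_i)\right)
```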
Owing to its structural limitations, the AE cannot extract the deep features needed to richly express input signals, so it cannot perform more complex classification tasks. To overcome these drawbacks, the SAE came into being.28 The SAE stacks multiple AEs while retaining the advantages of the AE. When multiple AEs are stacked, the encoder composed of the input layer and multiple hidden layers can mine the deep features of the original input signal X, while the decoder composed of multiple hidden layers and the output layer can better reconstruct the signal. The SAE structure is shown in Fig. 2.
Fig. 1 Basic structure of AE.
Fig. 2 Basic structure of SAE.
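As a minimal illustration of the encoder-decoder reconstruction described above (not the authors' implementation; weights are randomly initialised and untrained, and the layer sizes are assumptions), the AE forward pass and its reconstruction loss can be sketched in NumPy:

```python
import numpy as np

def sigmoid(y):
    return 1.0 / (1.0 + np.exp(-y))

rng = np.random.default_rng(0)
X = rng.random((8, 600))                          # 8 samples, 600-dim input

# Randomly initialised encoder f(X) and decoder g(Y) weights (untrained)
W_enc, b_enc = rng.normal(0, 0.01, (600, 256)), np.zeros(256)
W_dec, b_dec = rng.normal(0, 0.01, (256, 600)), np.zeros(600)

Y = sigmoid(X @ W_enc + b_enc)                    # encoder: feature extraction
X_hat = sigmoid(Y @ W_dec + b_dec)                # decoder: reconstruction

mu = 1e-4                                         # regularization weight
loss = (np.mean(np.sum((X - X_hat) ** 2, axis=1))
        + mu * (np.sum(W_enc ** 2) + np.sum(W_dec ** 2)))
print(X_hat.shape, loss >= 0)
```

Training would then minimise this loss over W_enc, b_enc, W_dec, b_dec by gradient descent.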
As an excellent generative model, GAN has the basic composition shown in Fig. 3. Its biggest advantage is that, with the help of the basic idea of game theory, it breaks the reliance of traditional generative models on prior assumptions. Through adversarial training, the generator completes the mapping from the random noise distribution to the real sample distribution, that is, the samples generated by the generator obey the real sample distribution.
In a GAN, adversarial training is carried out by the generator and the discriminator together. The goal of the discriminator is to accurately distinguish real samples from generated fake samples, while the goal of the generator is to generate fake samples whose source the discriminator cannot identify. The generator and the discriminator are trained alternately according to their respective goals, until the GAN reaches Nash equilibrium: the generator generates samples with the same distribution as the real samples, and the discriminator cannot correctly judge the source of the samples. The overall optimization objective function of GAN is
where X represents the real sample, G represents the generator, G(Z) represents the generated sample, Z represents the random noise, D represents the discriminator, D(X) and D(G(Z)) represent the discriminant probabilities of the discriminator for the real sample and the generated sample respectively, E_{X~P_data} represents the expectation over the real sample distribution, and E_{Z~P_Z} represents the expectation over the noise distribution.
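The objective function referenced above was lost in extraction; the standard GAN minimax objective, term-by-term consistent with the symbols just defined, is:

```latex
\min_{G}\max_{D} V(D,G)
 = \mathbb{E}_{X\sim P_{\mathrm{data}}}\big[\log D(X)\big]
 + \mathbb{E}_{Z\sim P_{Z}}\big[\log\big(1 - D(G(Z))\big)\big]
```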
Although the adversarial mode of GAN has obvious advantages, it also has fatal disadvantages. The adversarial training of GAN is a dynamic process; if no additional restrictions are imposed, training becomes unstable and may even suffer mode collapse. To solve the mode collapse problem of GAN, the dual discriminator generative adversarial network (D2GAN) was formed by adding, on the basis of GAN, a second discriminator whose optimization objective is completely opposite to that of the original discriminator.29 D2GAN is shown in Fig. 4. When the input sample is real, discriminator 1 awards a high score, whereas discriminator 2 gives a low score. When the input sample is generated, discriminator 2 awards a high score, whereas discriminator 1 gives a low score. Unlike GAN, the scores returned by the two discriminators are values in R+ rather than probabilities in [0, 1]. The generator tries to generate fake samples that confuse both discriminators as much as possible. This mutual adversarial mode compensates for the insufficiency of a single discriminator, generates diverse samples, and avoids mode collapse. The D2GAN objective function is as follows
where D1(X) and D2(X) respectively represent the scores of discriminator 1 and discriminator 2 on the real samples, D1(G(Z)) and D2(G(Z)) respectively represent their scores on the fake samples generated by the generator, and α and β are hyper-parameters. These hyper-parameters serve two main functions: one is to stabilize the model learning process, and the other is to control the influence of the KL and reverse KL divergences on the optimization.
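The D2GAN objective itself was dropped in extraction; the form given in the original D2GAN paper, matching the scoring behaviour and hyper-parameters described above, is:

```latex
\min_{G}\max_{D_1,D_2} J(G,D_1,D_2)
 = \alpha\,\mathbb{E}_{X\sim P_{\mathrm{data}}}\big[\log D_1(X)\big]
 + \mathbb{E}_{Z\sim P_Z}\big[-D_1(G(Z))\big]
 + \mathbb{E}_{X\sim P_{\mathrm{data}}}\big[-D_2(X)\big]
 + \beta\,\mathbb{E}_{Z\sim P_Z}\big[\log D_2(G(Z))\big]
```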
The principle of the AE can be summarized simply as extracting features from the original input through the encoder and then reconstructing the extracted features through the decoder to restore samples similar to the original input distribution. The basic principle of D2GAN can be summarized as follows: the original real samples and random noise are fed into the network simultaneously, and through the continuous confrontation between the generator and the dual discriminator, the noise is gradually reconstructed by the generator, using the real samples as a template, into fake samples with a distribution similar to that of the real samples.
Fig. 4 Basic structure of D2GAN.
From these principles, it can be seen that the AE and D2GAN have in common the ability to reconstruct samples close to the original real sample distribution. The difference is that D2GAN generates fake samples through an adversarial mechanism, while the AE restores the original real samples through self-decoding. The adversarial mechanism is a mutual incentive mechanism that gives the model the ability to learn continuously, but it also destabilizes the training process. The AE relies on self-reconstruction; although it lacks this continuous learning ability, its training process is relatively stable. Therefore, to make model training stable while retaining the continuous learning ability of the adversarial mechanism, we combine the AE, which has a similar function but relatively stable training, with D2GAN to build a novel model called AGMAN.
The method proposed in this paper mainly solves the problem that fault diagnosis is unsatisfactory and generalizes poorly when dealing with class-imbalanced small samples under real working conditions. The whole method includes three parts: data acquisition and preprocessing, data generation, and fault diagnosis. The basic framework is shown in Fig. 5.
Step 1: data acquisition and preprocessing. A vibration acceleration sensor and a data acquisition system are used to collect the time-series vibration acceleration signals of the rolling bearing in various health states, and four different types of original real samples are obtained, including one type of normal sample and three types of fault samples.
Then, the collected raw signal for each health state is divided into subsequences with a fixed length of 1200 by a sliding window, and 1000 samples are obtained for each health state.
Moreover, the features of the collected time-domain signals are inconspicuous, which may limit the effectiveness of data generation and fault diagnosis, while the frequency-domain signal contains more obvious features. The original samples are therefore transformed by the fast Fourier transform (FFT) before being input into the model.
Finally, to avoid large deviations between the frequency-domain samples, normalization is applied after the FFT so that the values of each dimension of each frequency-domain sample lie in the range [0, 1].
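The three preprocessing steps above (sliding-window segmentation, FFT, per-sample normalization) can be sketched in NumPy as follows. This is an illustrative sketch under the stated window length of 1200 and output dimension of 600 (the half spectrum), not the authors' code:

```python
import numpy as np

def preprocess(signal, win=1200, step=1200, n_keep=600):
    """Segment a raw vibration signal with a sliding window, take the
    FFT magnitude, and normalise each sample to [0, 1]."""
    # 1) sliding-window segmentation into fixed-length subsequences
    starts = range(0, len(signal) - win + 1, step)
    segments = np.stack([signal[s:s + win] for s in starts])
    # 2) FFT: keep the first n_keep magnitude bins (half spectrum)
    spectra = np.abs(np.fft.fft(segments, axis=1))[:, :n_keep]
    # 3) per-sample min-max normalisation to [0, 1]
    lo = spectra.min(axis=1, keepdims=True)
    hi = spectra.max(axis=1, keepdims=True)
    return (spectra - lo) / (hi - lo + 1e-12)

rng = np.random.default_rng(0)
samples = preprocess(rng.standard_normal(1200 * 1000))
print(samples.shape)   # 1000 samples of dimension 600
```

A non-overlapping step equal to the window length is assumed here; the paper does not state the overlap.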
Step 2: data generation. The AGMAN data generation model is established: the generator is embedded with the AE to build the auxiliary generator, the discriminator adopts the dual-discriminator structure, and the new generator and discriminator are combined to form the AGMAN model.
The auxiliary generator (AG) of the AGMAN model is mainly composed of the AE and the generator. The input layer of the encoder part of the AE includes 600 neurons, the first hidden layer of the encoder includes 512 neurons, and the feature output layer includes 256 neurons. The decoder input layer includes 256 neurons, the first hidden layer of the decoder includes 512 neurons, and the decoder reconstruction output layer includes 600 neurons. The real sample is input into the AE part of the AG to complete feature extraction and feature reduction. In addition, the input layer of the generator part includes 100 neurons, the first fully connected layer includes 128 neurons, the second fully connected layer includes 256 neurons, the third fully connected layer includes 512 neurons (shared with the first hidden layer of the decoder), and the adversarial output layer includes 600 neurons (shared with the decoder reconstruction output layer). All activation functions of the auxiliary generator use Sigmoid. The random noise is fed into the generator part of the AG and mapped into fake samples that have a distribution similar to that of the real samples. The formula is as follows
where y represents the input of the activation function.
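The activation-function formula was lost in extraction; since the text states that all AG activations are Sigmoid, the standard form is:

```latex
f(y) = \frac{1}{1 + e^{-y}}
```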
The two discriminators have the same structure but do not share weights. The input layer of each discriminator includes 600 neurons, the first fully connected layer includes 512 neurons, the second fully connected layer includes 256 neurons, and the score output layer includes 1 neuron. The activation function of the output layer uses Softplus, and the remaining layers use ReLU. The mathematical expressions are as follows
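The referenced expressions were likewise lost; the standard Softplus and ReLU definitions are given below. Note that Softplus maps the output into (0, +∞), consistent with the R+ scores required by the D2GAN scoring scheme:

```latex
\mathrm{Softplus}(y) = \ln\!\left(1 + e^{y}\right), \qquad \mathrm{ReLU}(y) = \max(0,\, y)
```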
Step 3: fault diagnosis. The generated samples are supplemented into the original imbalanced training set as required, and the resulting class-balanced dataset is used as the new training set to train the fault diagnosis model. Meanwhile, a real-sample dataset is used as the test set to verify the fault diagnosis ability of the model. The fault diagnosis model adopts classical, maturely applied methods: a deep convolutional neural network (DCNN) or SAE.
The AGMAN data generation model is used to learn the data distribution of the real samples and generate fake samples with a similar distribution. The optimization objectives of the auxiliary generator and the dual discriminator are respectively given by the following formulas
where K represents the fault type, (X_K)_feature represents the features of the real sample X_K obtained in the first hidden layer of the decoder, G_K(Z)_feature represents the features of the fake sample G_K(Z) extracted in the third fully connected layer of the auxiliary generator, and λ is the auxiliary-factor hyper-parameter whose purpose is to let the auto-encoder assist generator training; α, β and λ are set to 0.1, 0.1 and 0.2 respectively. The detailed training steps of AGMAN are as follows.
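The two objectives themselves were lost in extraction. A plausible reconstruction, combining the D2GAN losses with the decoder reconstruction feature loss described above, is given below as an assumption rather than the authors' exact formulas:

```latex
L_G = \mathbb{E}_{Z\sim P_Z}\big[-D_1(G_K(Z))\big]
    + \beta\,\mathbb{E}_{Z\sim P_Z}\big[\log D_2(G_K(Z))\big]
    + \lambda\,\big\|(X_K)_{\mathrm{feature}} - G_K(Z)_{\mathrm{feature}}\big\|^2
```

```latex
L_D = -\alpha\,\mathbb{E}\big[\log D_1(X_K)\big]
    + \mathbb{E}\big[D_1(G_K(Z))\big]
    + \mathbb{E}\big[D_2(X_K)\big]
    - \beta\,\mathbb{E}\big[\log D_2(G_K(Z))\big]
```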
Step 1: Fix the dual discriminator and train the auxiliary generator by minimizing the optimization objective L_G, which completes the accurate mapping from the random noise distribution to the real sample distribution so that the generated fake samples share the distribution of the real samples.
Fig. 5 Basic framework of proposed method.
A small number of samples with fault type K are selected as the input of the AG encoder and as the mapping template of the generated samples, denoted as real samples X_K. At the same time, random noise with the same number of samples as fault type K and a dimension of 100 is selected as the input of the AG, denoted as random noise Z. The AG generates fake samples with fault type K, denoted as G_K(Z).
Step 2: Fix the AG. The generated fake samples G_K(Z) and the original real samples X_K are input to the dual discriminator at the same time, and the dual discriminator is trained by minimizing the optimization objective L_D. The two discriminators score the fake samples and the real samples according to their respective scoring criteria.
Step 3: Alternately train the AG and the dual discriminator (Step 1 and Step 2) until Nash equilibrium is reached; the AG can then generate fake samples that confuse both discriminators simultaneously, and the training of the model is complete. The generated K-type fake samples are then used to balance the number of same-type real samples against the normal samples.
When the model training is stable, the generated samples with fault type K are automatically saved until the end of training. The generated K-type samples are supplemented into the real samples of the corresponding category according to the imbalance ratio, so as to balance the number of K-type samples and normal samples.
Step 4: Input real samples of the other fault types in turn and repeat the above steps; the auxiliary generator generates fake samples for each category of real samples, and the fake samples are supplemented into the imbalanced real samples as required until all classes reach balance.
The AGMAN method generates fake samples consistent with the distribution of the real samples and supplements them to balance the training samples, thereby solving the problem of poor diagnosis results and generalization ability under class-imbalanced small samples in actual working conditions. Hence, to ensure that the fault diagnosis model trained on the mixture of generated and original samples improves the fault diagnosis effect on the real test set, the quality of the generated samples must be evaluated.
Currently, the methods commonly used for vibration signal quality evaluation mainly include the Pearson correlation coefficient (PCC), cosine similarity (CS) and Euclidean distance (ED). The similarity between the generated samples and the real samples is measured quantitatively from a statistical perspective to prove the data generation ability of the proposed method.
ED measures the difference between two n-dimensional vectors by directly calculating the straight-line distance between them in the vector space; the result lies in [0, +∞). The smaller the distance, the more similar the two vectors. Its distance function is expressed as follows
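The distance function, lost in extraction, is the standard Euclidean distance, consistent with the symbols defined below:

```latex
d(a, b) = \sqrt{\sum_{j=1}^{n}\left(a_j - b_j\right)^2}
```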
where n represents the dimension of a vector, and a_j and b_j represent the values of the two n-dimensional vectors in the j-th dimension.
CS measures the difference between two n-dimensional vectors by calculating the angle between them in the vector space; the result lies in [-1, +1]. The larger the value, the smaller the angle between the two vectors and the higher their similarity. The formula can be expressed as follows
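The CS formula, also lost in extraction, is the standard cosine similarity:

```latex
\mathrm{CS}(a, b) = \frac{\sum_{j=1}^{n} a_j b_j}{\sqrt{\sum_{j=1}^{n} a_j^2}\,\sqrt{\sum_{j=1}^{n} b_j^2}}
```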
PCC is another way of measuring the similarity of two n-dimensional vectors in a vector space. It first centres the values of each vector on its mean and then calculates the CS of the centred results, which compensates for the insensitivity of CS to mean offsets; the result also lies in [-1, +1]. The larger the value, the stronger the correlation between the two vectors. The mathematical expression is as follows
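The PCC formula, lost in extraction, is equivalent to the CS of mean-centred vectors: PCC(a, b) = CS(a − ā, b − b̄), where ā and b̄ are the component means. All three similarity measures can be implemented in a few lines of NumPy (an illustrative sketch, not the authors' evaluation code):

```python
import numpy as np

def euclidean(a, b):
    # straight-line distance in R^n; range [0, +inf)
    return float(np.sqrt(np.sum((a - b) ** 2)))

def cosine(a, b):
    # cosine of the angle between a and b; range [-1, +1]
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def pcc(a, b):
    # Pearson correlation = cosine similarity of mean-centred vectors
    return cosine(a - a.mean(), b - b.mean())

a = np.array([1.0, 2.0, 3.0, 4.0])
b = 2.0 * a + 5.0                      # linearly related to a
print(euclidean(a, a))                 # 0.0: identical vectors
print(round(pcc(a, b), 6))             # 1.0: PCC is scale/offset invariant
```

The usage example shows why PCC is the strictest of the three indexes used later: it isolates linear correlation from scale and mean offset.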
The gearbox dataset of Southeast University, published by Southeast University, mainly includes two types of data, bearing and gear, collected on a drivetrain dynamic simulator (DDS).30 The test bench is shown in Fig. 6. In this experiment, the bearing dataset with the speed-load configuration set to 30 Hz-2 V and channel 4 is selected for experimental verification (denoted as Dataset A). There are four health states: normal, fault in roller, fault in inner race and fault in outer race, denoted as N, RF, IF and OF in turn.
The bearing dataset of the private test bench was collected on the customized test bench of our research group; the bearing model is NU205EM, and the rated power of the motor is 0.75 kW. The test bench is shown in Fig. 7. The dataset collected at a rotation speed of 1300 r/min is selected as the experimental verification data (denoted as Dataset B). This dataset also contains the four health states of normal, fault in roller, fault in inner race and fault in outer race, likewise marked as N, RF, IF and OF.
Fig. 6 Gearbox dataset of Southeast University test bench.
Fig. 7 Bearing dataset of private test bench.
For data preprocessing, for the above two datasets, the data in each health state are randomly divided by sliding windows into datasets with a sample size of 1000 and a sample length of 1200, that is, the size of the dataset in each health state is [1000, 1200]. The time-domain data are then transformed into frequency-domain data by FFT, so the size of the dataset in each health state becomes [1000, 600]. Afterwards, the frequency-domain data are normalized, and each sample is scaled to the range [0, 1]. Finally, 400 samples are randomly selected as the new dataset, which is divided into a real training set and a real test set in a 1:1 ratio, so the size of each of the real training set and the real test set is [200, 600]. The specific descriptions of the bearings in the two datasets are shown in Table 1.
The premise of using data augmentation to expand imbalanced datasets is the generation of high-quality fake samples. The quality of the generated data must therefore be evaluated before fault diagnosis, so as to ensure the effect of the subsequent class-imbalanced fault diagnosis. Hence, for each fault type, 1000 fake samples are generated from the original input samples. The average results of the similarity evaluation indexes are shown in Table 2, from which it can be seen that the proposed method outperforms the other methods on all similarity evaluation indexes on both datasets.
At the same time, to illustrate the advantages of the proposed method more clearly, taking PCC as an example, for each fault type on the two datasets we draw PCC boxplots of the 1000 generated samples against the original input samples, as shown in Fig. 8 and Fig. 9. It can be seen from Fig. 8 that for each fault type, the proposed method has a larger median, smaller dispersion and higher PCC values than the other two methods. In addition, in Fig. 9, the proposed method is also superior to GAN and D2GAN in terms of median, stability and PCC value for the vast majority of fault types. Thus, the fake samples generated by AGMAN have high similarity to the real samples and strong stability, providing a solid guarantee for class-imbalanced fault diagnosis.
Besides the above quantitative statistical indicators, to demonstrate the sample generation capability of the proposed method more intuitively, a group of corresponding real and generated samples is randomly selected for each fault type on the two datasets for qualitative visualization; the spectra are shown in Fig. 10 and Fig. 11, where blue represents the real sample and red represents the generated fake sample. Analysis of the two figures shows that the generated samples have high similarity and good spectral agreement with the corresponding real samples. Qualitative visualization thus verifies again that the proposed method can generate high-quality fake samples.
To verify that the fake samples generated by the proposed method can effectively deal with the class-imbalanced small-sample problem, four imbalance cases are set, with imbalance ratios of 1:100, 1:50, 1:20, and 1:10. The imbalance ratio is the ratio of the number of fault samples of each type to the number of normal samples. In addition, to further illustrate the fault-diagnosis performance of the proposed method, four different training sets are set up for comparison: the training set (imbalance) consists of the class-imbalanced original real samples, where under each imbalance ratio the number of normal samples is 200 and the number of each fault type is calculated from the ratio. The training set (expand-balance) consists of the class-imbalanced original real samples supplemented with fake samples generated by the proposed method until class balance is reached, i.e., 200 samples per health state. The training set (all fake) consists of fake samples generated by the proposed method for every health state except the normal one, also 200 samples per health state. The training set (all real) uses all collected real samples, also 200 samples per health state. The testing set consists of real samples randomly selected from those not used in the training set, again 200 samples per health state. Details are shown in Table 3.
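The imbalanced training-set construction described above can be sketched as follows. This is an illustration under our own assumptions (samples stored per class in a dict, a hypothetical helper name `build_imbalanced_set`); it is not the paper's code.

```python
import numpy as np

def build_imbalanced_set(samples_by_class, normal_key="NC",
                         n_normal=200, ratio=100, seed=0):
    """Subsample the 'training set (imbalance)': n_normal normal samples and
    n_normal / ratio samples for each fault class (e.g., 200 normal and 2
    fault samples per type at an imbalance ratio of 1:100)."""
    rng = np.random.default_rng(seed)
    out = {}
    for label, data in samples_by_class.items():
        k = n_normal if label == normal_key else max(1, n_normal // ratio)
        idx = rng.choice(len(data), size=k, replace=False)
        out[label] = [data[i] for i in idx]
    return out
```

The expand-balance set would then be obtained by topping each fault class back up to `n_normal` with generated samples.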
Table 1 Specific descriptions of Southeast University dataset and private dataset.
Table 2 Similarity evaluation indexes on two different datasets.
Fig.8 PCC of Southeast University dataset.
Fig.9 PCC of private dataset.
Fig.10 Spectrum diagram of real sample and fake sample in Southeast University dataset.
Fig.11 Spectrum diagram of real sample and fake sample in private dataset.
The classical intelligent diagnosis algorithms DCNN and SAE are selected for classification verification. Because the principles of these two algorithms differ considerably, the risk that the data fit only one algorithm well is avoided, which verifies the generalization ability of the proposed method. On the two datasets, the DCNN and SAE models were trained multiple times with each of the above training sets and verified with the testing sets. The test accuracies of the various data augmentation methods on the two datasets are shown in Table 4.
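The "trained multiple times" averaging reported in Table 4 amounts to repeating independent training runs and summarizing the scores. A minimal sketch, where `train_and_eval` is a hypothetical callable (not from the paper) that trains one model and returns its test accuracy:

```python
import statistics

def average_accuracy(train_and_eval, n_runs=8):
    """Mean and sample standard deviation of test accuracy over n_runs
    independent trainings, each with a different random seed."""
    scores = [train_and_eval(seed=i) for i in range(n_runs)]
    return statistics.mean(scores), statistics.stdev(scores)
```

Reporting the mean smooths out run-to-run variation from random initialization and subsampling; the standard deviation indicates stability, which the radar maps later visualize.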
A detailed analysis of Table 4 shows that the proposed method achieves the best diagnostic accuracy among all methods, especially when the original imbalanced dataset contains few fault samples. For example, when the imbalance ratio is 1:100 and DCNN is used as the diagnostic model on dataset A, the accuracy with the proposed method is 2.42%-24.79% higher than with the expand-balance datasets produced by the other methods, and 70.42% higher than with the original imbalanced dataset. With SAE as the diagnostic model on dataset B, the accuracy of the proposed method is 2.62%-38.75% higher than that of the other methods and 42.69% higher than that of the original imbalanced dataset. In addition, with DCNN as the diagnostic model on dataset A, the maximum difference in diagnostic accuracy between the expand-balance data and the all-real data is no more than 2.8%, and the average diagnostic accuracy stays at about 98.42%; on dataset B, the maximum difference is no more than 3.06% and the average accuracy is about 97.84%. Finally, the test accuracies of the expand-balance data and the all-fake data differ little; at the imbalance ratio of 1:100 they are approximately the same.
To show the diagnostic effect of the proposed method more intuitively, histograms of its test accuracy under the various training sets on the two datasets are drawn in Fig.12 and Fig.13. They show that, whether DCNN or SAE is used as the diagnostic model, supplementing the imbalanced dataset with the proposed method substantially improves the test accuracy over the original imbalanced data, and the result differs little from that obtained with the all-real dataset. In summary, the proposed method handles the small-sample class-imbalance problem well and greatly improves the corresponding fault diagnosis accuracy.
Table 3 Details of training set and testing set.
Table 4 Average testing accuracy of the various methods at various imbalance ratios on the two datasets.
Fig.12 Average testing accuracy of proposed method in various imbalance ratios of Southeast University dataset.
Fig.13 Average testing accuracy of proposed method in various imbalance ratios of private dataset.
In addition, to further prove that the proposed method can stably solve the small-sample class-imbalance problem, taking the case with the fewest fault samples, i.e., the imbalance ratio of 1:100, as an example, radar maps of the test accuracy of eight consecutive runs of DCNN and SAE under the four conditions were drawn for the two datasets, as shown in Fig.14 and Fig.15. They show that directly using the imbalanced data yields the lowest diagnostic accuracy, an irregular radar-map shape, and relatively large fluctuations in accuracy. After the proposed method is used to supplement the dataset (expand-balance), the radar map becomes regular and well saturated, the diagnostic effect is stable, and the diagnostic accuracy is greatly improved, staying at about 97% for DCNN and 95% for SAE. With the all-fake data, the saturation and shape regularity of the radar map are essentially the same as for the expand-balance data, with good fault diagnosis accuracy and stability. Moreover, the radar maps of the expand-balance and all-fake data differ little from that of the all-real data. In summary, it is proved once again that the proposed method solves the small-sample class-imbalance problem with high stability and high precision.
Finally, to show the classification effect of the diagnostic model after supplementing the fake data more intuitively, taking DCNN as an example, confusion matrices for directly using the imbalanced dataset and for supplementing it with the proposed method were plotted at an imbalance ratio of 1:100 on the two datasets, as shown in Fig.16 and Fig.17. They show that on both datasets, when the imbalanced dataset is used directly, only the normal samples are correctly classified and almost all fault samples are misclassified as normal. After the proposed method is used to restore class balance, the number of misclassified samples of each fault type drops greatly, and only a few samples are misclassified. The detailed analysis is as follows:
Fig.14 Radar map of 8 times test accuracy under imbalance ratio 1:100 in Southeast University dataset.
Fig.15 Radar map of 8 times test accuracy under imbalance ratio 1:100 in private dataset.
Fig.16 Results of confusion matrix of imbalance ratio 1:100 in Southeast University dataset.
Fig.17 Results of confusion matrix of imbalance ratio 1:100 in private dataset.
As shown in the right panel of Fig.16(a), for C1 only 5 samples are misclassified, of which 1 is misclassified as C0 and 4 as C2. For C3, 7 samples are misclassified as C1 and 7 as C2. All remaining samples are correctly classified. Therefore, on the Southeast University dataset, the fault diagnosis accuracy of C1, C2, and C3 is improved by 97%, 99.5%, and 89.5% respectively by the proposed method. Analyzing the two confusion matrices of the private dataset in the same way, after supplementing the data with the proposed method the diagnostic accuracy of C1, C2, and C3 is improved by 91%, 96.5%, and 98.5% respectively. These comparative results again prove that the proposed method solves the small-sample class-imbalance problem well with high diagnostic accuracy.
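The per-class accuracies read off the confusion matrices above can be computed directly. A minimal sketch follows; the example matrix uses the C1 and C3 rows stated in the text (5 and 14 misclassifications out of 200), while the C0 and C2 rows are assumed fully correct for illustration only.

```python
import numpy as np

def per_class_accuracy(cm):
    """Per-class accuracy (recall) from a confusion matrix whose rows are
    true labels and columns are predicted labels: diagonal / row sum."""
    cm = np.asarray(cm, dtype=float)
    return np.diag(cm) / cm.sum(axis=1)
```

For the C1 row (1 sample to C0, 195 correct, 4 to C2), this gives 195/200 = 97.5%; for the C3 row (7 to C1, 7 to C2, 186 correct), 186/200 = 93%.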
Finally, to verify the generalization of the proposed method and further expand its application scope in fault diagnosis of rotating machinery, we test it on the gear dataset of an additional private test bench. The experimental platform is shown in Fig.18. The test bench mainly includes a motor, coupling, double-disc rotor, loader, bearing seat, planetary gearbox, and brake. The acceleration sensor is placed on the bearing seat or the gearbox frame, and the sampling frequency is set to 25.6 kHz. In this experiment, the acceleration data are collected with the gearbox brake at an input load of 0-0.2 A and a speed of 1000 r/min. The gear dataset covers four health conditions: normal condition (NC), planet gear fracture (PF), planet gear pitting (PP), and planet gear wear (PW). The test results are shown in Table 5.
Fig.18 Gear dataset of additional private test bench.
The analysis of Table 5 shows that, on the gear dataset, the diagnosis accuracy of both DCNN and SAE is relatively low on the original class-imbalanced dataset, especially when fault data are scarce. For example, at an imbalance ratio of 1:100 the fault diagnosis accuracy of DCNN and SAE is only 25.38% and 66.12% respectively. When the proposed method is used to supplement the original imbalanced dataset to balance, the accuracy of both diagnosis methods is greatly improved; the improvement is most obvious at the 1:100 ratio, with increases of 67% and 25.88% respectively, reaching 92.38% and 92%. In addition, after supplementing the imbalanced gear dataset with the proposed method, the difference between its results and those on the real balanced dataset is small; the maximum difference for the two diagnosis methods is no more than 8%. Therefore, the proposed method also diagnoses the small-sample imbalanced gear dataset well, which further proves its good generalization and its applicability to fault diagnosis of other parts of rotating machinery.
Table 5 Average testing accuracy in various imbalance ratios of gear dataset.
In this paper, an auxiliary generative mutual adversarial network (AGMAN) is proposed for the small-sample class imbalance that often occurs in the working conditions of practical mechanical equipment. The main innovation is to flexibly combine the advantages of the auto-encoder network and the dual-discriminator generative adversarial network so that the generator stably achieves the goal of generating high-quality fake samples. AGMAN constructs a new generator by embedding an auto-encoder network into it, and assists the generator in accurately mapping the noise distribution to the real data distribution by minimizing the reconstructed-feature loss. In addition, a mutually adversarial dual discriminator is adopted as the new discriminator, which avoids the mode-collapse problem of the traditional generative adversarial network. Finally, the generator and the discriminator are updated alternately according to their respective optimization goals, and the generator eventually generates fake samples highly similar to the real ones. The quality evaluation and fault diagnosis results show that the proposed method generates high-quality fake samples, and that using them to supplement the original small-sample class-imbalanced dataset greatly improves the accuracy and stability of the fault diagnosis model, almost reaching the level obtained with the real dataset.
Although this method can effectively solve the small-sample class-imbalance problem, the two-step strategy (data generation, then fault diagnosis) is cumbersome and time-redundant. How to flexibly supplement the data and complete the fault diagnosis simultaneously according to the class-imbalance situation is a difficulty for further research, and also the priority for putting the proposed method into practical application.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This study was co-supported by the Special Project of the National Key Research and Development Program of China (No. 2020YFB1709801), the Postgraduate Research and Practice Innovation Program of Jiangsu Province (No. KYCX21_0230), the National Natural Science Foundation of China (No. 51975276), and the National Science and Technology Major Project (No. 2017-IV-0008-0045).
Chinese Journal of Aeronautics, September 2023 issue.