Transactions of Nanjing University of Aeronautics & Astronautics
网刊加载中。。。

使用Chrome浏览器效果最佳,继续浏览,你可能不会看到最佳的展示效果,

确定继续浏览么?

复制成功,请在其他浏览器进行阅读

Identifying Anomaly Aircraft Trajectories in Terminal Areas Based on Deep Auto‑encoder and Its Application in Trajectory Clustering  PDF

  • DONG Xinfang
  • LIU Jixin
  • ZHANG Weining
  • ZHANG Minghua
  • JIANG Hao
College of Civil Aviation/College of Flight, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, P.R. China

CLC: TN925

Updated:2020-09-10

DOI:http://dx.doi.org/10.16356/j.1005-1120.2020.04.008

  • Full Text
  • Figs & Tabs
  • References
  • Authors
  • About
OUTLINE

Abstract

Anomalous trajectory detection and traffic flow classification for complicated airspace are of vital importance to safety and efficiency analysis. Some researchers employed density-based unsupervised machine learning method to exploit these trajectories related to air traffic control (ATC) actions. However, the quality of position data and the tiny density difference between traffic flows in the terminal area make it particularly challenging. To alleviate these two challenges, this paper proposes a novel framework which combines robust deep auto-encoder (RDAE) model and density peak (DP) clustering algorithm. Specifically, the RDAE model is utilized to reconstruct denoising trajectory and identify anomaly trajectories in the terminal area by two different regularizations. Then, the nonlinear components captured by the encoder of RDAE are input in the DP algorithm to classify the global traffic flows. An experiment on a terminal airspace at Guangzhou Baiyun Airport (ZGGG) with anomaly label shows that the proposed combination can automatically capture non-conventional spatiotemporal traffic patterns in the aircraft movement. The superiority of RDAE and combination are also demonstrated by visualizing and quantitatively evaluating the experimental results.

0 Introduction

The foreseen air traffic demand[

1] of sustainable and limited capacity will impose new challenges on efficiency, bringing security risks to the already congested terminal areas, especially busy airports like Beijing and Guangzhou. The Civil Aviation Administration of China is planning to improve the current air traffic management system in an intelligent way[2]. Particularly, the rise of machine learning technology and accessible data from the automatic dependent surveillance broadcast system (ADS-B) can accelerate the improvement by data extraction of operational anomalies and air traffic control (ATC) action patterns in terminal maneuvering areas (TMAs).

Currently, several tactical initiatives on aircraft route focus on maintaining TMA situations, such as circumnavigation, holding, and direct fly. These tactical ATC actions can associate to the recorded ADS-B data. The actual trajectories that are composed of spatiotemporal points have different characteristics and meanings[

3]. In detail, trajectories with rare occurrence or a certain degree of anomaly can present significant events in TMAs. Anomalous trajectory data can be regarded as valuable analysis materials for airspace safety. And trajectories that are already patterned but different from standard routes correspond to controllers’ preference of frequent tactical ATC actions on the standard procedure[4]. The extraction of non-conventional patterns can enhance the accuracy of the controller workload assessment, design of flight procedure and calculation of airspace complexity. Both two types of traffic flows have respective data features different from standard routes, so that unsupervised machine learning algorithms are able to classify them automatically.

In previous literature, anomaly detection based on clustering (ADCluster) identified nominal trajectories and regarded the rest not belonging to any cluster as anomalous flights. There were two main methods in the research of ADCluster. The first focused on clustering through statistical or priori features, such as the local average velocity of the trajectory, the density of the aircraft, and the distance between the aircraft and the reference point[

5]. Eckstein[6] decomposed the trajectory into various segments, which was defined as the spatial measurement of significant changes in the trajectory position or direction, and then used clustering results to monitor the anomalous behaviors of the aircraft. Gariel et al.[7] summarized and improved the first methods. First, the spatial turning points were clustered to obtain discrete sequence. Second, the main traffic flow was identified by clustering the sequence contained spatial variation. Since these methods do not directly use trajectory data, they may have limited application in the development of real-time trajectory based operations (TBO) tools in the future.

Gariel proposed the second widely used method for clustering trajectories at the level of position measurement. It consists of two well-differentiated steps. Firstly, principal component analysis (PCA) was used to eliminate redundant information on the resampling data. Secondly, density-based clustering algorithm (DBSCAN) identified anomalous trajectories and clustered them at the linear dimensionality reduced vector. The position-based data can be utilized to classify traffic flows in real time[

8], which is the principle behind air traffic flow modeling[9]. Some researchers used position measurement and the differentiated method to identify the nominal trajectory pairs in en-routes or terminal areas[1012].

In spite of the good quality results, limitations in ADCluster with the position measurement have been pointed out in studies[

1314]: The poor sensitivity to short duration anomalies. These researchers utilized recurrent neural networks (RNNs) and vector auto-regressive (VAR) method to model the surveillance data and identify the anomaly trajectories. Alternatively, Olive et al.[15] utilized auto-encoder networks, a kind of deep learning which has been proved successful at anomaly detection, to identify anomalous trajectories in en-route. However, aircraft trajectories in terminal areas have shorter duration anomalies than those in en-routes. Without the enhancement in anomaly detection, the standard DAE model cannot carry out this work fairly well. And the noise in ADS-B data will further impair anomaly in trajectories. Thus, this paper introduces two regularization methods based on robust deep auto-encoder (RDAE) model to smooth the data noise and detect anomaly trajectories.

Moreover, other research direction of the trajectory clustering has not been neglected. A state-of-the-art application by trajectory clustering is extracting the non-conventional pattern trajectories from the actual aircraft operations. Conde et al.[

16] separated the weather-related non-conventional patterns from all non-conforming behaviors in en-routes, and then provided a reference for airspace resource management. However, Andrienko et al.[17] found classical DBSCAN algorithm cannot simultaneously extract nominal and non-conventional trajectories in en-routes with only one set of parameters. Theoretically, hierarchical DBSCAN-based algorithms can better cluster dataset with different densities than classical DBSCAN. But in fact, Gallego et al.[4] found that quantitative cluster values went down though hierarchical DBSCAN-based algorithms when hyper-parameters were finely tuned to enable discrimination between small clusters. He attributed this phenomenon to special nature of the data samples being a result of the PCA application to actual flight trajectories. The dimensionality reduction method was not improved. On the contrary, a parameter adjustment model has been established to customize the extraction of clusters with RDBSCAN.

Note that, there are two limitations in the trajectory clustering for the non-conventional traffic patterns: (1) Using the PCA method to reduce position data dimension will damage the necessary details. (2) DBSCAN-based methods are sensitive to the hyper-parameters when clustering the non-conventional traffic patterns. We further utilize the encoder of trained RDAE model as substitute for PCA method due to the preferable dimensionality reduction ability to extract key information. Then, a DP clustering algorithm is combining with RDAE encoder to classify the nominal trajectories and non-conventional tactical traffic flows simultaneously. The evaluation results show that the proposed framework of the combination of RDAE model and DP algorithm outperforms baseline models.

1 ADS‑B DATA

The aircraft trajectory dataset used in this paper is denoted as:τ1,,τNtraj. Each trajectory consists of a set of time and position ordered data, i.e. τ={(ti,x̂i,ŷi,ẑi)}(i=1,,Npoint), where the time is tiZ+ and the measurements are x̂i,ŷi,ẑiR.

1.1 Dataset and separation

The ADS-B data contain the following information: timestamp, flight number, departure airport, target airport, current location (using the longitude and latitude of WGS-84 coordinates), sea level altitude (m), heading, ground speed, and vertical speed. The experiment trajectory data come from the TMA of Guangzhou Baiyun Airport, and it is separated from the national flights data by the following methods.

First, the essence of the deep learning model is to construct a main traffic flow model in the terminal area through trajectory data. The actual trajectory timestamp cannot form traffic flows, so we shifttime so that the first time measurement in each trajectory is always at t=0. Second, by the Mercator projection method with the center of the ZGGG runway, the latitude and longitude coordinates are converted into east‑north‑up (ENU) coordinates (represented by xyz). Finally, we intercepte trajectory points within a rectangle of -120 kmx120 km and -120 kmy120 km (the rectangle is greater than TMA). And the target airport of all trajectories is ZGGG. However, some aircraft do not update the target airport in time after landing, and the ADS-B receiver does not revise it when recording, so the intercepted data contain aircraft trajectories in other operation states. Therefore, for the intercepted data, a heuristic method is used to further distinguish the landing, takeoff, and overflying trajectories in the terminal airspace.

For each trajectory, we set the index of the closest point to the center of the runway as c=argminipi2, and the index of the farthest point as f=argmaxipi2. The time of the last measurement in the trajectory is T=maxiti, and the average descent speed is z˙avg. The trajectory is then divided as:

Landing if (pc2<3 km)(p02>100 km)(z˙avg<-100 m/min)(tc/T>0.95)

Takeoff if (pc2<3 km)(pT2>100 km)(z˙avg>100 m/min)(tc/T<0.05)

Overflying if (pf2>100 km) (-100 m/ min < z˙avg  < 100  m/min )   (( tf/T<0.05)(tf/T>0.95))

The length of the runway at ZGGG airport is about 3 km, and the longest distance between the surrounding approach sector boundary and the runway center is about 100 km. This method works well with ADS-B data in Central South China.

For these three types of trajectories, the overflying aircraft have only a slight effect on the capacity in the terminal area. The departure aircraft use lots of performance on the initial climb process. Even if the terminal area capacity is limited due to the increased traffic complexity or severe weathers, the control methods of the departure aircraft are often limited to ground delay procedure (GDP) or pre-departure, and the probability of ocurrence of anomalous trajectories is extremely low.

In order to keep sustainable operations in the terminal area, the approaching aircraft are usually tactically controlled based on the standard procedure. The control behavior reflects the complexity of the airspace[

14]. Therefore, we choose the approach trajectory as the experimental data. Since the trajectories containing ATC actions occur in the airborne phase, the point after the approach trajectory data index c is deleted. So far, it is assumed that the experimental data has been separated independently. An example of ZGGG area navigation (RNAV) standard arrival procedure in the terminal area is shown in Fig.1 (available on the eAIP China[18]), in which the black lines with arrows represent the aircraft standard routes.

Fig.1 ZGGG’s RNAV procedure for RWY 19/20L/20R[

18]

1.2 Preprocessing

For the part of the anomalous trajectory detection, the key criterion is the reconstruction error, which is related to the number of trajectories. Therefore, the time is divided equally to get the same amount of trajectory points. The other preprocessing is to scale each measurement by a standardization. The data set is restored and rescaled separately in each experiment to eliminate the effect of noise and anomalous trajectories on the remaining data. In the experiment of trajectory clustering, in order to clarify the difference between trajectories, Gariel’s strategy is duplicated to add three dimensions to each trajectory point[

7]

Τaugm=[t  x  y  z  R  cosα  sinα]

where R=(xi-xref')2+(yi-yref')2 is the top left corner with coordinates of (xref',yref')=(-80,80) km. α=arctan(yi/xi) is the angular position in cylindrical coordinates.

2 Model Learning

This section outlines the steps of anomaly detection model and traffic flow classification. The steps are to reconstruct the trajectories, identify anomalies, and then cluster the rest trajectories.

2.1 Trajectory reconstruction

There are multi-deficiencies with the trajectories in ADS-B data. First, the measurements are noisy. Second, the trajectories can have any number (including zero) measurements at a given time range. Third, trajectories have to share the same number of points. If we used linear interpolation to solve the third problem directly, the influence of noise would be increased. The deep auto-encoder (DAE) is widely used for denoising and anomaly detection[

1920] through compressive reconstruction of raw data. A standard DAE structure is shown in Fig.2, composed of encoder and decoder, which has the same number of units in the output layer and the input layer. And the number of middle hidden layer units is less than the number of both layer units.

Fig.2 Structure of standard DAE network

In the encoding stage, the input data X is compressed into the hidden layer h to eliminate outliers. The connection function follows the standard practice[

21], which uses the logit function to connect units

h=EW,b(X)=logit(WX+bE) (1)

where W is the weight connecting the input layer and the hidden layer and bE the bias of the input layer. Similarly, the decoding function is defined as

DW,b(h)=logit(WTh+bD) (2)

where WT is the weight connecting the hidden layer and the output layer and bD the bias of the hidden layer. So, the objective function of DAE model is denoted as

minW,bX-DW,b(EW,b(X))F2 (3)

However, the DAE model needs to train clear data, which is difficult to achieve due to the error of the hardware system. Thus, the RDAE model is used to reconstruct the trajectories to smooth noisy data. In the RDAE model, the outlier S and the part of the input data LD that is well represented by the hidden layer model are separated into two parts: X=LD+S. The outlier S contains the noise and anomalous vectors that are difficult to reconstruct. Then, the RDAE model can execute denoising and anomaly detection through employing different regularization in the loss function. In order to sparse the noise as much as possible, the loss function should be set as the sum of the reconstruction error LD and the normalization term of the S0. But for the calculation reason, one can relax the combinatorial term of the optimization by replacing it with a convex relaxation S1. The objective function can be denoted as

minW,bLD-DW,b(EW,b(LD))F2+λ1S1 (4)

where F is the Frobenius norm of a matrix. As the parameter of L1 is regularized, a smaller λ1 will promote more data to be isolated into S as noise, and therefore minimize the reconstruction error.

We construct the L1 to regularize RDAE (L1-RDAE) model. The number of input layer units and output layer units is the dimension of the data. The number of units in the middle of the hidden layer is determined by the method of eigen-dimensional estimation. The same λ1 value is selected for all input trajectories to maintain the consistency of denoising. We choose appropriate parameters in RDAE model that have the lowest reconstruction loss on the test data. Fig.3 shows the L1-RDAE reconstructed trajectories that are generated by varying λ1, and the result of the standard DAE network with the same layers and units of RDAE.

Fig.3 Trajectory reconstruction

Fig.3(a) is the original noisy trajectories. A low λ1 leads to smooth trajectories (Fig.3(c)) whereas high λ1 leads to intact trajectories (Fig.3(b)). These trajectories are too smooth with λ1=0.1, and several anomaly trajectories are eliminated in the bottom right corner of Fig.3(c), which increases the reconstruction loss value to 8.85. The optimal validation loss value is 6.71, achieved by λ1=2. Fig.3(d) is the result of DAE, which faithfully, but inappropriately, reproduces the noise.

2.2 Anomaly detection

The anomalous trajectories have different formats with the noise removed in Section 2.1. In Fig.4, a few trajectories interfered by noise are represented in blue, noise position in red circles, anomalous trajectories in green, and the remaining trajectories in light gray. It can be intuitively seen that noise appears as a break point in the trajectories, and anomaly trajectories have some different segments with nominal trajectories.

Fig.4 Visualization of noise and anomaly

The reconstructed trajectory data is processed to a n×300 matrix through linear interpolation. Each row has 75 equal time position points of a flight, and each column is the measurement value of the trajectory in time and spatial dimensions. Anomalous trajectories reflect aircraft behaviors caused by unhealthy airspace[

7], corresponding to the row vector of the data matrix. And the noise is distributed in each trajectory, corresponding to the column vector of the matrix. According to this feature, this paper combine L2,1 regularization and RDAE model (L2,1-RDAE) to identify anomalous trajectories. The L2,1 regularization formula is as follows

X2,1=j=1nxj2=j=1ni=1nxij21/2 (5)

It can be regarded as a L2-norm regularizing member of each column, which amplifies the influence of anomalies in the group to make anomalous vectors obvious. Then L1 regularization is used between each row to reduce the anomalous vectors􀆳 effect on low dimensional manifold, which enhances the robustness of the deep network. The anomalous trajectory data in this paper is reflected on the row vectors, so X needs to be transposed. The objective function is as follows

minW,bLD-DW,b(EW,b(LD))F2+λ2ST2,1 (6)

However, the objective is not convex, and it is difficult to guarantee to converge the method to a global minimum. So we employ the block-wise soft-thresholding function[

20]

(proxλ2,L2,1(x))j=xgj-λ2xgjxg2    xg2>λ2            0                 xg2λ2 (7)

where g is a group index, j a within-group index, and λ2 the regularization parameter of the objective function L2,1. There are anomalous labels marked by the controller. We use false alarm rates to tune parameters. Fig.5 shows the experiment steps of anomaly detection.

Fig.5 Anomaly detection method based on RDAE

In fact, such a semi-supervised training method makes proper λ2 chosen according to the already pattern instances. Thus, the RDAE model trained on LD contains enough information to reconstruct the main traffic flows. Foreshadowing our results in Section 3.1, we note that the linear dimension reduction of PCA on the data may lead to insufficient trajectory details. Meanwhile, deep auto-encoder model can abstract low-dimensional codes that work much better than PCA as a tool to reduce the dimensionality of data[

22]. Using the trained L2,1-RDAE encoder to output nonlinear vectors may produce better result of traffic flow classification than PCA. Thus, in the following clustering section, the combination of RDAE+DBSCAN is added as a comparative experiment to examine this opinion. For the density difference in the trajectory data, we introduce the DP clustering algorithm as a substitute for DBSCAN.

2.3 Clustering by DP method

DP algorithm is a recently published method[

23] for density clustering. The density calculation of each point is related to the density of neighbors. The steps are as follows.

First, the neighbors can be recognized by a soft threshold like the Gaussian kernel function or a hard threshold as defined in Eq.(8). In order to reduce the computational complexity, we employ a hard threshold to calculate the local density

ρi=jx(Dij-dc) (8)

Suppose that a descending order {ρi}i=1N represents the subscript order {qi}i=1N, that is: ρq1ρq2ρqN.

Second, the distance parameter of each data point is calculated, which is measured by the minimum distance between the point and other high-density points. However, the distance of the trajectory point with the highest density is the maximum value of its distance from all the other high-density points, that is

δqi=minqj:j<i{Dqiqj}  i2maxj2{δqi}   i=1 (9)

Therefore, each point is given two quantities: Local density and distance. Points with high local density and distances far greater than the threshold (ρ0,δ0) can be identified as density peaks or cluster centers. The cluster centers in non-conventional pattern should correspond to a small ρ0 and a large δ0. Thus, we can set the reasonable parameters to identify the non-conventional trajectories near the edge of the nominal trajectory data. After these density peaks are found, other points are assigned to the same cluster as their nearest neighbor of higher density.

By this way, DP method can cluster the data contained various densities, which is the weakness of the DBSCAN-cored clustering algorithm[

24]. Furthermore, the qualitatively setting of local density and distance can also identify anomaly trajectories. This is a direction of future work.

It is worth mentioning that the combination of RDAE+DP could improve the performance in the trajectory clustering. Due to the presence of two regularizations, the effect of noise and anomalous trajectories is constrained. Thus, there is a trend that the trajectory data would converge in the low-dimensional manifold, so that the non-conventional trajectories have higher density locally. This makes small clusters that are finely different from the classical data easier to be identified. Fig.6 shows the traffic flow classification step of RDAE+DP algorithm.

Fig.6 Trajectory clustering method based on RDAE+DP

3 Experimental Result & Discussion

We carried out our method on 524 real trajectories at the Guangzhou terminal airspace on 2019-06-16. Twenty-seven anomalous trajectory labels, marked by licensed controllers, were collected for evaluating the performance of our method. This is typical imbalance dataset in the binary classification. We used F1-score to verify the anomaly detection performance. Its definition is as follows

F1=2precisionrecallprecision+recall (10)

where precision is the number of correct anomalous results divided by the number of all anomalous results identified by the algorithm of anomaly detection, and recall is the number of correct anomalous results divided by the number of all anomalous samples marked by controllers. A single high precision or recall cannot correctly reflect the anomaly detection performance. Thus, we visualized the optimal F1-score results of each algorithm to analyze the performance difference.

In the traffic flow classification, we used the same data set and selected the silhouette criterion (SC) value as the evaluation metrics. The SC value definition is as follows

S(i)=b(i)-a(i)max{a(i)-b(i)}i=0,1,2,,nTra (11)

where a(i) is the mean distance between i and all other data points in the same cluster, and b(i) the smallest mean distance of i to all points in any other cluster.

3.1 Anomaly detection

In order to compare the RDAE performance in anomaly detection, we repeated Gariel’s strategy (PCA+ DBSCAN) and trained a standard deep auto-encoder (DAE) model that had the same structure with RDAE. Fig.7 shows the precision, recall and F1-score with different parameters for RDAE and PCA+DBSCAN. Table 1 shows the anomaly detection among three algorithms with the optimal F1-score. The optimal F1-score of RDAE model was 0.809 by λ2=7×10-12. The F1-score of DAE model that had same structure with RDAE was 0.632. The optimal F1-score of PCA+DBSCAN model was 0.593 under the parameters of ε=1.5 and φ=11, where ε was the maximum distance between two samples, and φ the number of samples in a neighborhood. The F1-score of PCA+DBSCAN and DAE model was approximately 36.4% and 28% worse than the value achieved by the RDAE. At the respective optimal F1-score, we found that the detection results of both PCA+ DBSCAN and DAE models were a subset of RDAE.

Fig.7 Anomaly detection results

Table 1 Anomaly recognition results of three algorithms
MethodsIdentified anomalyCorrect anomalyF1‑score
PCA + DBSCAN 27 16 0.593
DAE 39 21 0.632
RDAE 32 24 0.809

Fig.8 displays the complementary set between the results of PCA+DBSCAN and RDAE. They are represented by green trajectories. These aircraft routes shared a large number of segments with standard procedure. The anomaly degree in these trajectories that PCA+DBSCAN could not recognize were relatively small. The dimension reduction process used PCA further decreased the difference between anomaly and nominal trajectories, which made DBSCAN difficult to find these anomaly trajectories. Fig.7(c) indicates that the standard DAE model have the similar F1-score with PCA+DBSCAN in this trajectory data set. The RDAE model obtained a better result than DAE by amplifying the effect of anomalous vectors through L2,1 regularization.

Fig.8 Anomaly complementary set of two algorithms

However, there were three trajectories that still cannot be detected by RDAE. These trajectories are displayed in red in Fig.8. These aircraft flied into the terminal area near the west GYA waypoint at night. The TMA was in a low-density state. The controller carried out a radar vectoring procedure for saving time and assiged a direct route for these aircraft to the final approach point. Unsuccessful identification of the three trajectories may be attributed to that there are a large number of convergence behaviors from different directions before GYA waypoint. The corresponding data vectors had a large fluctuation range, which reduced the anomaly detection accuracy of the RDAE model near this waypoint. The situation simultaneously led to the poor anomaly detection performance of the PCA+DBSCAN model on the west trajectories. The RDAE result of anomaly detection on the west trajectories was better than the PCA+DBSCAN result, which seems to prove the preferable robustness of the RDAE model. The problem could be refined by using the trajectory points before the convergence behavior. Andrienko elaborated that different interception ranges would affect the traffic flow classification[

15]. It is feasible to extract ADS-B data according to the spatial measurement boundary of the terminal area.

3.2 Trajectory clustering

The quantitative results for the dataset are shown in Table 2. The SC value which compares the ratio of intra- and inter-cluster distances for evaluating compactness and separation between them, is the most common metric for clustering validation[

4]. The SC value indicated that RDAE+DP had the best clustering performance among the three algorithms. But the SC value could not reflect the separation of non-conventional traffic flows. So we introduced visualization qualitative analysis method in Fig.9.

Table 2 Trajectory clustering results
MethodThe number of classesSilhouette coefficientF1‑score
PCA + DBSCAN(The optimal F1‑score) 5 0.68 0.593
PCA + DBSCAN(The optimal SC value) 7 0.71 0.391
RDAE + DBSCAN 8 0.80 0.726
RDAE + DP 9 0.84 0.809

Fig.9 Visualization of three algorithms

Although PCA+DBSCAN had the optimal performance in identifying anomalous trajectories (the optimal F1-score), the quantitative SC value was not the optimal result of the algorithm. The visualization is shown in Fig.9(a). The trajectories with obvious differences on the north side of the airport are identified as gray. At the optimal SC value, the north trajectories were divided into three clusters, with three standard RNAV arrival procedures in Fig.1. But a large number of trajectories were categorized as the anomaly, which penalized the F1-score value. There may be two reasons: One is that linear dimension reduction process by PCA decreased necessary details in the trajectory, and the other is that DBSCAN was difficult to precisely separate dataset with tiny density differences. The visualization of RDAE+DBSCAN is shown in Fig.9(c), where a large number of trajectories were not misidentified to anomaly (DBSCAN would re-identify the anomaly, even if RDAE had excluded the anomaly trajectories), and the north trajectory set was divided into four categories. The RDAE+DBSCAN model was able to discriminate the standard RNAV arrival route ATAGA-1B in Fig.1, which could not be separated in PCA+DBSCAN. Meanwhile, the SC value was slightly better than PCA+DBSCAN. The RDAE+DBSCAN model had better clustering performance of identifying the nominal trajectory than PCA+DBSCAN.

Compared with RDAE+DBSCAN, the RDAE+DP combination further improved the quantitative SC value, and separated non-conventional pattern (orange trajectories in Fig.9(d)) from standard RNAV route (dark blue trajectories in Fig.9(d)). This was a direct route only open at night through which these aircraft flied over a restricted area. So the number of cluster trajectories was relatively few, corresponding to low density at microscopic level. In the global trajectory clustering, DBSCAN either clustered it of the same cluster of the nominal trajectories (Figs.9(a) and (c)), or identified it as an anomaly (Fig.9(b)). The RDAE+DP combination could better capture this actual non-conventional pattern and the nominal trajectory, which would contribute to the designing and improvement of flight procedures in the terminal area and provide support to traffic classification in free airspace[

25].

4 Conclusions

(1) The combination of RDAE and L1‑regularization can well eliminate the noise in trajectories, and it is easy to train. In fact, the performance of denoising can potentially be improved by the Huber loss function or L1/2‑regularization[

26]. This is a feasible improvement in the future. And we believe that the trajectory reconstruction procedure could be useful for trajectory smoothing a wide variety of vehicle data.

(2) The method of combining L2,1‑regularization and RDAE can effectively identify the anomalous trajectories in the airspace. Compared with the PCA+DBSCAN and DAE algorithms, the method has 36% and 28% performance in F1-score.

(3) The RDAE model can simultaneously capture non-linear components containing sufficient details in trajectories, so that the combination with DBSCAN can accurately cluster all standard RNAV aircraft routes. The last but not least, the combination of RDAE+DP can identify both the non-conventional spatiotemporal patterns and the nominal trajectories, which could be further used to train machine learning techniques aiming at improving the state-of-the-art of tactical deconfliction and prediction algorithms.

Contributions Statement

Mr. DONG Xinfang complied the models, conducted the analysis, designed the study, interpreted the results and wrote the manuscript. Mr. LIU jixin contributed data, polished the manuscript. Mr. ZHANG Weining contributed model components, proposed revision opinion of manuscript and corrected structure of manuscript. Mr. ZHANG Minghua revised manuscript, contributed analysis criterion. Mr. JIANG Hao drew the Visio diagram. All authors commented on the manuscript draft and approved the submission.

Acknowledgements

This work was supported in part by the Foundation of Graduate Innovation Center in NUAA (kfjj20190707).

Conflict of Interest

The authors declare no competing interests.

References

1

International Air Transport Association. IATA forecasts passenger demand to double over 20 years‍[EB/OL]. (2016-10-18)[2020-06-12]. https://www.iata.org/en/pressroom/pr/2016-10-18-02/. [百度学术

2

FENG Z. Shaping brand new future of civil aviation with intelligence[EB/OL]. (2019-05-17)[2020-06-12]. http://www.caac.gov.cn/en/XWZX/201905/t20190517_196228.html. [百度学术

3

OLIVE X, MORIO J. Trajectory clustering of air traffic flows around airports‍[J]. Aerospace Science and Technology, 2019, 84: 776-781. [百度学术

4

GALLEGO C E V, GOMEZ COMENDADOR V F, SAEZ NIETO F J, et al. Discussion on density-based clustering methods applied for automated identification of airspace flows‍[C]//Proceedings of 2018 IEEE/AIAA 37th Digital Avionics Systems Conference (DASC). London: IEEE, 2018: 1-10. [百度学术

5

SALAUN E, GARIEL M, VELA A E, et al. Aircraft proximity maps based on data-driven flow modeling‍[J]. Journal of Guidance, Control, and Dynamics, 2012, 35(2): 563-577. [百度学术

6

ECKSTEIN A. Automated flight track taxonomy for measuring benefits from performance based navigation‍[C]//Proceedings of 2009 Integrated Communications, Navigation and Surveillance Conference. Crystal City, VA, USA: IEEE, 2009: 1-12. [百度学术

7

GARIEL M, SRIVASTAVA A N, FERON E. Trajectory clustering and an application to airspace monitoring‍[J]. IEEE Transactions on Intelligent Transportation Systems, 2011, 12(4): 1511-1524. [百度学术

8

MURCA M C R, HANSMAN R J. Identification, characterization, and prediction of traffic flow patterns in multi-airport systems‍[J]. IEEE Transactions on Intelligent Transportation Systems, 2019, 20(5): 1683-1696. [百度学术

9

BARRATT S T, KOCHENDERFER M J, BOYD S P. Learning probabilistic trajectory models of aircraft in terminal airspace from position data‍[J]. IEEE Transactions on Intelligent Transportation Systems, 2019, 20(9): 3536-3545. [百度学术

10

ARNESON H, BOMBELLI A, SEGARRA-TORNE A, et al. Analysis of convective weather impact on pre-departure routing of flights from fort worth center to New York center‍[C]//Proceedings of NASA Center for AeroSpace Information (CASI). Hampton: NASA/Langley Research Center, 2017. [百度学术

11

BOMBELLI A, SOLER L, TRUMBAUER E, et al. Strategic air traffic planning with fréchet distance aggregation and rerouting‍[J]. Journal of Guidance, Control, and Dynamics, 2017, 40(5): 1117-1129. [百度学术

12

MARZUOLI A, GARIEL M, VELA A, et al. Data-based modeling and optimization of en route traffic‍[J]. Journal of Guidance, Control, and Dynamics, 2014, 37(6): 1930-1945. [百度学术

13

NANDURI A, SHERRY L. Anomaly detection in aircraft data using recurrent neural networks (RNN)‍[C]//Proceedings of 2016 Integrated Communications Navigation and Surveillance (ICNS). [S.l.]: IEEE, 2016: 5C2-1-5C2-8. [百度学术

14

MELNYK I, BANERJEE A, MATTHEWS B, et al. Semi-Markov switching vector autoregressive model-based anomaly detection in aviation systems‍[C]// Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, California, USA: Association for Computing Machinery, 2016: 1065-1074. [百度学术

15

OLIVE X, BASORA L. Identifying anomalies in past en-route trajectories with clustering and anomaly detection methods‍[C]//Proceedings of ATM Seminar 2019. VIENNE, Austria: ATM, 2019. [百度学术

16

CONDE R M M, DELAURA R, HANSMAN R J, et al. Trajectory clustering and classification for characterization of air traffic flows‍[C]//Proceedings of the 16th AIAA Aviation Technology, Integration, and Operations Conference. Washington, DC: American Institute of Aeronautics and Astronautics, 2016. [百度学术

17

ANDRIENKO G, ANDRIENKO N, FUCHS G, et al. Clustering trajectories by relevant parts for air traffic analysis‍[J]. IEEE Transactions on Visualization and Computer Graphics, 2018, 24(1): 34-44. [百度学术

18

CAAC. eAIP China[EB/OL]. [2020-06-12]. https://www.aischina.com/EN/indexEn.aspx. [百度学术

19

ZHANG W, TAN X. Combining outlier detection and reconstruction error minimization for label noise reduction‍[C]//Proceedins of 2019 IEEE International Conference on Big Data and Smart Computing (BigComp). [S.l.]: IEEE, 2019: 1-4. [百度学术

20

ZHANG W, WANG D, TAN X. Robust class-specific autoencoder for data cleaning and classification in the presence of label noise‍[J]. Neural Processing Letters, 2019, 50(2): 1845-1860. [百度学术

21

LECUN Y, BENGIO Y, HINTON G. Deep learning‍[J]. Nature, 2015, 521(7553): 436-444. [百度学术

22

HINTON G E, SALAKHUTDINOV R R. Reducing the dimensionality of data with neural networks‍[J]. Science, 2006, 313(5786): 504-507. [百度学术

23

RODRIGUEZ A, LAIO A. Clustering by fast search and find of density peaks‍[J]. Science, 2014, 344(6191): 1492. [百度学术

24

SCHUBERT E, SANDER J, ESTER M, et al. DBSCAN revisited, revisited: Why and how you should (still) use DBSCAN‍[J]. ACM Transactions Database System, 2017, 42(3): 1-19, 21. [百度学术

25

OLIVE X, GRIGNARD J, DUBOT T, et al. Detecting controllers’ actions in past mode S data by autoencoder-based anomaly detection‍[C]//Proceedings of SESAR Innovation Days 2018. Salzburg, Austria: [s.n.], 2018. [百度学术

26

XU Z, ZHANG H, WANG Y, et al. L1/2 regularization[J]. Science China Information Sciences, 2010, 53(6): 1159-1169. [百度学术

WeChat

Mobile website