Abstract
The shortage of key materials has emerged as one of the crucial factors affecting the execution of helicopter assembly production plans. Accurate prediction of material delivery time can guide assembly production planning and reduce the frequent changes caused by material shortages. A lifelong learning-based model for predicting material delivery time is proposed on the basis of internal data sharing within the helicopter factory. The model, which consists of a gated recurrent unit (GRU) network layer, a ReLU activation layer, and fully connected layers, can quickly store new memories during real-time prediction without forgetting old ones. To prevent significant precision degradation in real-time prediction, a regularization parameter constraint method is proposed to adjust the model parameters. With this method, the root mean square error (RMSE) of the model’s prediction on the target domain data is reduced from 0.032 9 to 0.013 4. The accuracy and applicability of the model for real-time prediction in helicopter assembly are validated on 25 material orders by comparing it with methods such as L2 regularization and EWC regularization.
Due to various disturbance events, frequent adjustment of production plans on the helicopter assembly line has become a prominent problem.
According to the literature, predictive methods for material delivery time include support vector regression, decision tree regression, case-based reasoning, neural networks, etc. Lu et al
Material delivery time prediction involves predicting future delivery time using time series forecasting techniques. This can be achieved using both statistical and machine learning-based methods. Statistical-based methods, such as autoregressive model
Deep learning methods not only perform well in solving nonlinear and non-stationary data, but also handle prediction problems with large data sets and complex features. Ref.[
Although previous research provides technical support for improving the accuracy of material delivery time prediction, real-time prediction under fluctuating data distributions remains relatively underexplored. Technologies such as intelligent recognition, advanced control, and intelligent sensing support data collection in the workshop. The real-time data from the machining workshop are high-dimensional and nonlinear. This paper analyzes the factors that influence material delivery time, establishes a regression prediction model, and compares the prediction performance of popular time-series models on the dataset. To address the reduced assembly efficiency caused by inaccurate material delivery time, a material delivery time prediction model is proposed. In practice, the operating law of the workshop changes dynamically over time, so the data distribution representing the state of the material changes accordingly; a lifelong learning approach is therefore added to the model. The model can learn and accumulate knowledge over time within its neural network, overcoming the limitation of traditional prediction models, which provide accurate predictions only on similarly distributed data.
In this article, a specific Chinese helicopter manufacturing plant is studied, and

Fig.1 Diagram of material delivery in a helicopter manufacturing plant
The material remaining delivery time (MRDT) is the time from the current moment until the material is delivered to the corresponding assembly station. It is defined as
$$\mathrm{MRDT} = T_{\mathrm{r}} + T_{\mathrm{d}} + T_{\mathrm{c}} \tag{1}$$
where $T_{\mathrm{r}}$ represents the remaining time to completion, $T_{\mathrm{d}}$ the time required for delivery from the mechanical processing workshop to the warehouse, and $T_{\mathrm{c}}$ the time of transportation via conveyor belt. Because the transportation route and speed remain constant, the time required for material delivery is treated as a constant.
The warehouse system only maintains records of the current status of each inventory item and cannot accurately show the delivery time for non-inventoried materials. The precision and promptness of material delivery are vital to the efficient operation of assembly tasks. Predicting the delivery time of materials beforehand can shift the management mode from post-adjustment to pre-adjustment.
The required materials are processed by the corresponding machining workshop, which has a total of M machines. The primary objective of this article is to forecast shortages of the essential materials that, according to historical statistics, have frequently been in short supply. The time a material spends in warehouse storage after processing is not included in the remaining delivery time.
The machining process discussed in this article follows the principle of processing different materials along predetermined routes, with each machine processing only one material at a time. While a machine is operating, incoming materials queue in its input buffer area to wait for processing. Upon completion of machining, the finished parts directly enter the output buffer area, where they await transportation to the next machine. Both the input and output buffer areas operate on a first-in-first-out basis. There are five main factors that influence the delivery time of materials.
The uncertain equipment status (ES) of a machine can affect the required machining time. Because the buffer capacity is sufficient and automated guided vehicles (AGVs) are plentiful, the waiting time due to material transport is neglected. Hence, only the machine status is considered during data collection.
(2) |
where denotes equipment operating status at time T, n machine number, average machine utilization rate from the nearest completed material order to the current time T, and machine n’s continuous working time at time T.
BQS represents the storage status of the queue in the buffer area, which affects the waiting time of materials. The queue information reflects details regarding the processing route and order of materials. The BQS queue information is shown as
(3) |
where represents the queue information for the buffer area of all machine tools at time T, the type of the ith material entering buffer area queue for machine tool n at time T, and the type of the ith material leaving the buffer area queue for machine tool n at time T.
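The first-in-first-out behaviour of a machine buffer described above can be sketched in a few lines. This is a minimal illustration only: the class name `MachineBuffer`, its methods, and the material labels are hypothetical and do not come from the paper's implementation.

```python
from collections import deque


class MachineBuffer:
    """First-in-first-out input buffer for one machine (hypothetical helper)."""

    def __init__(self):
        self._queue = deque()

    def enter(self, material_type):
        # A newly arrived material joins the tail of the queue.
        self._queue.append(material_type)

    def leave(self):
        # The material that has waited longest is processed next.
        return self._queue.popleft()

    def snapshot(self):
        # Queue contents at the current time, head first.
        return list(self._queue)


buffer_m1 = MachineBuffer()
for material in ["frame", "beam", "frame"]:   # illustrative material types
    buffer_m1.enter(material)

first_out = buffer_m1.leave()
print(first_out, buffer_m1.snapshot())
```

A snapshot of every machine's queue at time T is the kind of information the BQS feature is meant to capture; how the paper encodes it numerically is not recoverable from the text here.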
The order information (OIF) describes the composition of materials involved in the processing order, including material identification and quantities. The number and type of materials in the order play a determining role in the overall processing time of the order.
(4) |
where represents the type of the ith material and X the number of material type in the workshop.
In-process information (IPI) is determined by the type of work-in-progress and the accumulated processing time for the work-in-progress on the machine, which in turn determines the remaining processing time on that machine. represents the type of material being processed on machine n at time T, while represents the duration of processing that material on machine n up to time T.
(5) |
In addition, statistics on the completion of the current order are also required, including the type of material quantities already completed and the remaining processes of unfinished processing, which is shown as
(6) |
where represents the order completion status at time T, the process of material c being completed at time T and the remaining processing steps of material c at time T.
Therefore, the feature dataset required for material delivery time prediction model can be represented by
(7) |
where FD represents the feature set used to predict material delivery time. A deep neural network is then used to perform regression prediction of the material delivery time (MDT) from the features extracted from FD.
(8) |
To address the uncertainty of material delivery time on the shop floor and the tendency of fixed models to fail in real-time prediction, we propose a lifelong learning-based framework. The material delivery time prediction model is built on the GRU network and incorporates lifelong learning to counter the loss of prediction accuracy caused by fluctuations in the data distribution over time. The model’s structure is illustrated in Fig.2.

Fig.2 Material delivery time prediction framework
The data collected before the distribution changes are defined as the source domain, and the data collected after the change as the target domain. The relevant information from the source domain is preserved to constrain the parameter changes during training on the target domain, thus enabling the model to make predictions on data with different distributions.
Feature scaling is used as a crucial data preprocessing step to normalize features with disparate scales, which helps to alleviate imbalanced sample distributions. The defining formula is
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{9}$$
where $x'$ is the normalized value of the data and $x$ the data before normalization; $x_{\max}$ and $x_{\min}$ denote the maximum and minimum values of the corresponding feature, respectively.
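The min-max scaling described above can be sketched as follows. The function name `min_max_scale` and the sample values are illustrative assumptions. The handling of a constant feature is a design choice of this sketch and is not stated in the paper.

```python
def min_max_scale(values):
    """Scale one feature column to [0, 1] using its own minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature would cause a division by zero; map it to 0.0 (assumption).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


queue_lengths = [2, 5, 8, 11]   # illustrative values, not from the paper
scaled = min_max_scale(queue_lengths)
print(scaled)
```

In a deployed model the minimum and maximum would normally be fixed from the training data and reused at prediction time, so that new samples are scaled consistently.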
Because forecasting material delivery involves regression on time series data, a time series prediction model is adopted. GRU learns quickly and is well suited to relatively short sequences. It is a type of recurrent neural network designed to mitigate vanishing and exploding gradients and to reduce the information loss of traditional RNNs. Its main characteristic is the introduction of gate mechanisms, through which GRU can selectively “retain” or “forget” input data. In GRU, the state of each unit is controlled by the update gate ($z_t$), the reset gate ($r_t$), and the candidate state ($\tilde{h}_t$).
The update gate controls the extent to which the input data modify the current state, with a higher value indicating a stronger incorporation of previous state information.
$$z_t = \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) \tag{10}$$
The reset gate controls the impact of the unit’s historical information on the current state. The smaller the reset gate value is, the less previous state information is incorporated. If the correlation between the previous state information and the current input is weak, the reset gate is triggered.
$$r_t = \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) \tag{11}$$
Once the update gate and the reset gate are computed, the GRU calculates the candidate state by using
$$\tilde{h}_t = \tanh\left(W_h x_t + U_h\left(r_t \odot h_{t-1}\right) + b_h\right) \tag{12}$$
The output of GRU at time step t is given as
$$h_t = z_t \odot h_{t-1} + \left(1 - z_t\right) \odot \tilde{h}_t \tag{13}$$
where $x_t$ represents the input vector, $h_{t-1}$ the hidden state vector from the previous time step, $z_t$ the update gate vector, $r_t$ the reset gate vector, $\tilde{h}_t$ the new candidate hidden state vector, $h_t$ the hidden state vector at the current time step, and $\sigma$ the sigmoid function, whose range is (0,1). $\odot$ denotes element-wise multiplication. $W$, $U$, and $b$ represent the input weight matrices, the weight matrices applied to the hidden state vector from the previous time step, and the bias vectors, respectively.
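To make the gate mechanism concrete, the sketch below runs a single GRU cell over a short sequence with NumPy. It assumes the common convention in which a larger update gate keeps more of the previous hidden state, which matches the description of the update gate above. The parameter names (`W_z`, `U_z`, `b_z`, and so on), the layer sizes, and the random inputs are illustrative assumptions and not the paper's trained model.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def gru_step(x_t, h_prev, params):
    """One GRU time step; params holds W_*, U_*, b_* for the z, r and h blocks."""
    z_t = sigmoid(params["W_z"] @ x_t + params["U_z"] @ h_prev + params["b_z"])
    r_t = sigmoid(params["W_r"] @ x_t + params["U_r"] @ h_prev + params["b_r"])
    # The reset gate scales how much of the previous state feeds the candidate.
    h_cand = np.tanh(params["W_h"] @ x_t + params["U_h"] @ (r_t * h_prev) + params["b_h"])
    # A larger update gate keeps more of the previous hidden state.
    return z_t * h_prev + (1.0 - z_t) * h_cand


rng = np.random.default_rng(0)
n_in, n_hidden, time_steps = 4, 3, 6   # sizes chosen for illustration only
params = {}
for gate in ("z", "r", "h"):
    params[f"W_{gate}"] = rng.normal(scale=0.1, size=(n_hidden, n_in))
    params[f"U_{gate}"] = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
    params[f"b_{gate}"] = np.zeros(n_hidden)

h = np.zeros(n_hidden)
for x_t in rng.normal(size=(time_steps, n_in)):
    h = gru_step(x_t, h, params)
print(h.shape)
```

In practice a framework layer would replace this hand-written cell. The sketch is only meant to show how the update and reset gates blend the previous state with the candidate state.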
GRU has shown great predictive performance in sequential data prediction. It is usually trained on a dataset and then its network parameters are frozen before it is deployed in the target application. As time elapses and the distribution of data changes, the performance of neural networks may deteriorate. To adapt to changes in data distribution, it is necessary to make adjustments to the parameters of the neural network for mitigating the risks of overfitting and catastrophic forgetting. If source domain data and target domain data are mixed during model adjustments, a large amount of storage space will be consumed.
Artificial intelligence predominantly relies on fixed datasets and stationary environments. In general, the input data to the model consists of production data obtained from static condition
(14) |
where H denotes all parameters in the network, the parameter matrix, and the gradient matrix of the loss function with respect to . After calculating the importance of parameters, we can use them to impose constraints on different parameters in training of the target domain. Additionally, the original loss function is replaced with a proxy loss function in the target domain shown as
(15) |
(16) |
where represents the proxy loss function, the original loss function on the target domain, and the penalty term and dynamically adjustment of the value to balance the penalty and original loss functions. is an additional damping factor that prevents the occurrence of excessively large or small values of . denotes the parameter values after training on source domain, while denotes those used for training on target domain. Using the proxy loss function and the target domain data, we retrain the model: the parameters are initialized with those trained on the source domain and then updated by gradient-based optimization. The resulting model can serve the prediction requirements of both the source and target domains. By repeating this procedure, it continually accumulates knowledge and eventually extends its predictions to the diverse data distributions within the workshop. The parameter update process of the model is illustrated in Fig.3.

Fig.3 Model parameter update process
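The idea of constraining important parameters while retraining on the target domain can be illustrated with a toy example. The sketch below is not the paper's exact formulation. It assumes a quadratic, importance-weighted penalty of the kind used by EWC, SI, and MAS, takes the importance values `omega` as given, and uses a fixed balance coefficient `c` instead of a dynamically adjusted one. All names and numbers are hypothetical.

```python
import numpy as np


def proxy_loss(target_loss, theta, theta_source, omega, c):
    """Target-domain loss plus an importance-weighted quadratic penalty.

    The quadratic form is an assumption borrowed from EWC/SI/MAS-style methods.
    """
    penalty = np.sum(omega * (theta - theta_source) ** 2)
    return target_loss + c * penalty


def proxy_grad(target_grad, theta, theta_source, omega, c):
    # Gradient of the proxy loss: the penalty pulls important parameters back.
    return target_grad + 2.0 * c * omega * (theta - theta_source)


theta_source = np.array([1.0, -2.0, 0.5])   # parameters after source-domain training
omega = np.array([10.0, 0.0, 1.0])          # illustrative importance values
theta = theta_source.copy()                 # target training starts from the source parameters

# Toy target-domain loss: squared distance to a new optimum.
target_opt = np.array([2.0, 0.0, 0.0])
for _ in range(500):
    grad = 2.0 * (theta - target_opt)
    theta -= 0.01 * proxy_grad(grad, theta, theta_source, omega, c=1.0)

print(np.round(theta, 3))
```

In this toy setting the parameter with high importance stays close to its source-domain value, while the parameter with zero importance moves freely to the new optimum. This is the trade-off between retaining old knowledge and fitting new data that the regularization is intended to control.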
As a case study, information on the materials needed for the assembly of a specific helicopter is gathered. Twenty-five material orders corresponding to the frame, beam, and horizontal components located at the mid-end of the fuselage are selected for validation. Of these, 15 material orders are selected as the source domain and 10 orders as the target domain.
The dataset is preprocessed, and comparative experiments show that the GRU neural network model performs best in predicting the delivery time of materials. The hyperparameters are optimized with a tree-structured Parzen estimator (TPE), a tree-structured Bayesian optimization algorithm for the global optimization of black-box functions. The resulting values are: epochs = 100, batch size = 256, learning rate = 0.000 1, time step = 6, and threshold = 0.000 002 84. The hidden layer structure of the neural network is 400‑215. The Adam optimizer is employed to adjust the parameters in the model. During training, the loss is measured by RMSE, which has the same unit as the original data and therefore makes the performance of the model easier to interpret. The formula for RMSE is
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \tag{17}$$
where $y_i$ represents the true values, $\hat{y}_i$ the predicted values, and $N$ the total number of data in the batch.
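To comprehensively evaluate the performance of the model, we also use R-square ($R^2$), the mean absolute error (MAE), and the symmetric mean absolute percentage error (SMAPE):
$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2} \tag{18}$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right| \tag{19}$$
$$\mathrm{SMAPE} = \frac{100\%}{N}\sum_{i=1}^{N}\frac{\left|y_i - \hat{y}_i\right|}{\left(\left|y_i\right| + \left|\hat{y}_i\right|\right)/2} \tag{20}$$
where $y_i$ and $\hat{y}_i$ are the true and predicted values, $\bar{y}$ is the mean of the true values, and $N$ is the number of samples.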
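The four evaluation metrics can be computed with a few lines of plain Python, as sketched below. The sample values are made up for illustration. The SMAPE variant used here, with the mean of the absolute true and predicted values in the denominator, is an assumption, since several definitions are in common use.

```python
import math


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_square(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def smape(y_true, y_pred):
    # Assumed variant: mean of |t - p| / ((|t| + |p|) / 2), expressed in percent.
    terms = [abs(t - p) / ((abs(t) + abs(p)) / 2.0) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(terms) / len(terms)


y_true = [0.2, 0.4, 0.6, 0.8]       # illustrative normalized delivery times
y_pred = [0.25, 0.35, 0.65, 0.75]   # illustrative predictions

print(rmse(y_true, y_pred), mae(y_true, y_pred),
      r_square(y_true, y_pred), smape(y_true, y_pred))
```

RMSE and MAE are expressed in the unit of the (normalized) target, whereas $R^2$ and SMAPE are scale-free, which is why the four are reported together.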
The advantage of GRU predictions is validated by comparing its performance to other time series forecasting models available on the dataset. The predicted results after training are presented below.

Fig.4 Comparison of predictive performance across different methods
Model | Source RMSE | Source R² | Source MAE | Source SMAPE/% | Target RMSE | Target R² | Target MAE | Target SMAPE/% |
---|---|---|---|---|---|---|---|---|
Transformer | 0.032 2 | 0.987 3 | 0.025 4 | 12.98 | 0.070 8 | 0.940 6 | 0.055 7 | 24.90 |
DNN | 0.012 2 | 0.998 5 | 0.007 7 | 4.02 | 0.033 5 | 0.986 7 | 0.024 3 | 10.82 |
TCN | 0.017 6 | 0.996 2 | 0.012 9 | 8.51 | 0.084 1 | 0.912 1 | 0.072 8 | 17.50 |
GRU | 0.006 1 | 0.999 6 | 0.005 0 | 3.22 | 0.032 9 | 0.987 0 | 0.023 3 | 10.27 |
The uniform manifold approximation and projection (UMAP) algorithm is applied to reduce the dimensionality of the source and target domain datasets so that their distributions can be compared. This analysis reveals significant differences, which are illustrated in Fig.5.

Fig.5 Data distribution of source and target domains
The concept of lifelong learning is incorporated into parameter updates to prevent catastrophic forgetting of previous data distributions. The parameters and gradients of the original model are preserved, the importance matrix of the parameters is computed, and the model is then retrained on the target domain via a proxy loss function. In
Model | Target RMSE | Target R² | Target MAE | Target SMAPE/% | Source RMSE | Source R² | Source MAE | Source SMAPE/% |
---|---|---|---|---|---|---|---|---|
GRUPGC | 0.013 4 | 0.997 8 | 0.008 4 | 5.02 | 0.013 3 | 0.997 9 | 0.010 8 | 6.34 |
GRUPG | 0.013 8 | 0.997 7 | 0.008 8 | 5.00 | 0.013 4 | 0.997 9 | 0.010 8 | 8.92 |
GRUL2 | 0.015 3 | 0.997 2 | 0.009 8 | 5.43 | 0.014 3 | 0.997 6 | 0.011 4 | 7.22 |
GRUEWC | 0.013 2 | 0.997 9 | 0.008 1 | 4.64 | 0.014 5 | 0.997 5 | 0.011 5 | 6.42 |
GRUSI | 0.014 1 | 0.997 6 | 0.009 2 | 5.37 | 0.015 2 | 0.997 2 | 0.012 2 | 8.01 |
GRUMAS | 0.014 2 | 0.997 6 | 0.009 0 | 5.32 | 0.013 4 | 0.997 9 | 0.010 8 | 6.74 |
Based on the results shown in

Fig.6 GRUPGC loss functions for training and validation sets on source and target domains
Figs.

Fig.7 Validation effect of GRUPGC on the target domain after training on the source domain

Fig.8 Validation effect of GRUPGC on the target domain after training on the target domain

Fig.9 Validation effect of GRUPGC on the source domain after training on the target domain
The following conclusions can be drawn from the aforementioned experiments:
(1) Among the time series prediction models compared, GRU yields the highest accuracy in predicting material delivery time in the studied helicopter workshop.
(2) In practice, the state data of the workshop change dynamically over time. Fixed-parameter models experience a gradual decrease in accuracy in real-time prediction as the data distribution changes.
(3) The proposed method has the advantage of using an adaptive parameter C to trade off the allowed forgetting against the new task loss. During training, the model learns the importance weights. When learning data from the new distribution, changes to important parameters are penalized.
(4) By incorporating the concept of lifelong learning, the prediction model exhibits high reliability and adaptability. This provides data support for coordination among the material processing, component assembly, and production planning departments, thereby enhancing the feasibility and achievability of helicopter assembly plans.
This paper focuses on predicting the delivery time of helicopter assembly materials with a GRU-based model incorporating lifelong learning. First, by comparing the prediction performance of different time-series prediction models, it is concluded that GRU predicts material delivery time well. Then, dimensionality reduction is used to confirm that the data distribution changes dynamically with time. Finally, different regularization-based lifelong learning methods are compared. By using regularization constraints to fine-tune the model parameters, the model can not only make predictions on new data but also prevent catastrophic forgetting. The predicted results can be used to guide the production plan in the assembly workshop, reduce the frequency of changes to the production plan to some extent, enable a pre-intervention mode, and improve production efficiency in the workshop. In future work, the model’s network structure can be pruned or expanded based on the predicted data to ensure that the model completes predictions with the smallest possible structure.
Contributions Statement
Ms. MA Lijun built the model, conducted the analysis, interpreted the results and wrote the manuscript. Mr. YANG Xianggui provided training data for the recurrent network model. Prof. GUO Yu designed and conceptualized the study. Mr. TONG Zhouqiang provided data for the analysis of lifelong learning. Dr. HUANG Shaohua conducted the experimental design. Dr. LIU Daoyuan participated in the discussion of the research background. All authors commented on the manuscript draft and approved the submission.
Conflict of Interest
The authors declare no competing interests.
References
BIN L, QIANG L. Dynamic scheduling of aircraft mobile production line considering material supply interference[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(8): 1521-1534.
HANSON R, BROLIN A. A comparison of kitting and continuous supply in in-plant materials supply[J]. International Journal of Production Research, 2013, 51(4): 979-992.
CAO Yuanchong, XIONG Hui, ZHUANG Cunbo, et al. Dynamic scheduling of complex product discrete assembly workshop based on digital twin[J]. Computer Integrated Manufacturing Systems, 2021, 27(2): 557-568. (in Chinese)
LU Bin, LU Zhiqiang, ZHANG Yongfeng. Dynamic scheduling problem of aircraft assembly based on material delivery date prediction[J]. Computer Integrated Manufacturing Systems, 2022, 28(9): 2939-2952. (in Chinese)
CHEN Y S, CHENG C H, LAI C J. Extracting performance rules of suppliers in the manufacturing industry: An empirical study[J]. Journal of Intelligent Manufacturing, 2012, 23(5): 2037‑2045.
LOUVROS P, STEFANIDIS F, BOULOUGOURIS E, et al. Machine learning and case-based reasoning for real-time onboard prediction of the survivability of ships[J]. Journal of Marine Science and Engineering, 2023, 11(5): 890.
REN Yinghui, HUANG Xiangming, MA Zhongkai, et al. Time node prediction method for material delivery based on information entropy[J]. China Mechanical Engineering, 2018, 29(22): 2725-2732. (in Chinese)
LYU D F, HU Y W. Remaining life prediction of aeroengine based on principal component analysis and one-dimensional convolutional neural network[J]. Transactions of Nanjing University of Aeronautics and Astronautics, 2021, 38(5): 867-875.
PENG Y, WANG H, MAO L M, et al. Flow prediction method for terminal area in convective weather based on deep learning[J]. Transactions of Nanjing University of Aeronautics and Astronautics, 2021, 38(4): 634-645.
ZHANG Rui, JIA Hu. Production performance forecasting method based on multivariate time series and vector autoregressive machine learning model for waterflooding reservoirs[J]. Petroleum Exploration and Development, 2021, 48(1): 175-184. (in Chinese)
BENVENUTO D, GIOVANETTI M, VASSALLO L, et al. Application of the ARIMA model on the COVID-2019 epidemic dataset[J]. Data in Brief, 2020, 29: 105340.
ZHANG F J, WANG S L, LIANG J, et al. Study on multiple linear regression prediction of bending fatigue life of steel wire rope[C]//Proceedings of Journal of Physics: Conference Series. [S.l.]: IOP, 2022, 2230(1): 012007.
WANG Lei, ZHANG Ruiqing, SHENG Wei, et al. Regression prediction and anomaly data detection based on support vector machine[J]. Chinese Journal of Electrical Engineering, 2009, 29(8): 92-96. (in Chinese)
LIU C, HU Z, LI Y, et al. Forecasting copper prices by decision tree learning[J]. Resources Policy, 2017, 52: 427-434.
WEI Qin, CHEN Shijun, HUANG Weibin, et al. Spot market clearing price prediction method using random forest regression[J]. Proceedings of the CSEE, 2021, 41(4): 1360-1367,1542. (in Chinese)
FANG W, CHEN Y, XUE Q. Survey on research of RNN-based spatio-temporal sequence prediction algorithms[J]. Journal on Big Data, 2021, 3(3): 97-110.
YU Y, SI X, HU C, et al. A review of recurrent neural networks: LSTM cells and network architectures[J]. Neural Computation, 2019, 31(7): 1235-1270.
CHO K, VAN MERRIENBOER B, GULCEHRE C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[EB/OL]. (2014-09-03). https://doi.org/10.48550/arXiv.1406.1078.
WANG Y, QIAO B. A lightweight temporal convolutional network for human motion prediction[J]. Transactions of Nanjing University of Aeronautics and Astronautics, 2022, 39(S1): 150-157.
GANG Y, XIAO L, CHANG Z, et al. Multi-objective coupling optimization of electrical cable intelligent production line driven by digital twin[J]. Robotics and Computer-Integrated Manufacturing, 2024(86): 102682.
GU S, FENG Y. Investigating catastrophic forgetting during continual training for neural machine translation[EB/OL]. (2020-11-30). https://doi.org/10.48550/arXiv.2011.00678.
ZENKE F, POOLE B, GANGULI S. Continual learning through synaptic intelligence[C]//Proceedings of International conference on machine learning. [S.l.]: PMLR, 2017: 3987-3995.
ALJUNDI R, BABILONI F, ELHOSEINY M, et al. Memory aware synapses: Learning what (not) to forget[C]//Proceedings of the European Conference on Computer Vision (ECCV). [S.l.]:[s.n.], 2018: 139-154.
Authors Ms. MA Lijun received the B.S. degree in Mechanical Engineering from Shaanxi University of Science and Technology. She is currently a graduate candidate at the College of Mechanical and Electrical Engineering, Nanjing University of Aeronautics and Astronautics (NUAA). Her research interests focus on industrial big data and digital manufacturing.
Dr. HUANG Shaohua received the B.S. and M.S. degrees in Mechanical Manufacturing and Automation from Xinjiang University in 2011 and 2014, respectively, and the Ph.D. degree in Aerospace Manufacturing Engineering from NUAA in 2019. From 2020 to 2022, he worked as a postdoctoral fellow in the Department of Industrial Engineering at Tsinghua University. He joined NUAA in July 2022, where he is an assistant professor in the Intelligent Manufacturing Laboratory. His current research interests include the Internet of things, industrial big data, and production process optimisation.