Pyramid Pooling-Based Vision Transformer for Tool Condition Recognition
Abstract:
This study focuses on data-driven tool condition recognition to raise the intelligence level of computerized numerical control (CNC) machining and improve tool utilization efficiency. Traditional tool monitoring methods rely on empirical knowledge or limited mathematical models and struggle to adapt to complex, dynamic machining environments. To address this, we implement real-time tool condition recognition using deep learning. To overcome insufficient recognition accuracy, we propose a pyramid pooling-based vision Transformer network (P2ViT-Net) for tool condition recognition. Using images as input effectively mitigates the limitation of low-dimensional signal features. We develop the P2ViT model by enhancing the vision Transformer (ViT) framework for image classification and adapt it to tool condition recognition. Experimental results show that the improved P2ViT model achieves 94.4% recognition accuracy, a 10% improvement over the conventional ViT, and outperforms all of the convolutional neural network models used for comparison.
Keywords:
Project Supported:
This work was supported by the China Postdoctoral Science Foundation (No. 2024M754122), the Postdoctoral Fellowship Program of CPSF (No. GZB20240972), the Jiangsu Funding Program for Excellent Postdoctoral Talent (No. 2024ZB194), the Natural Science Foundation of Jiangsu Province (No. BK20241389), the Basic Science Research Fund of China (No. JCKY2023203C026), and the 2024 Jiangsu Province Talent Programme Qinglan Project.