Abstract:
Excavators are key equipment in open-pit mining, and accurately analyzing and identifying their operating behaviors is important for improving loading efficiency and operational safety. Because excavator operations exhibit continuous visual feature patterns, this article proposes a machine vision-based framework for excavator behavior recognition. First, an excavator detection model based on multi-scale feature fusion, YOLOv5s–GDN, is introduced. The model integrates the Gather-and-Distribute (GD) mechanism and the Neighborhood Weighted Decomposition (NWD) loss function. Captured video is split into consecutive frames, which the model processes to obtain the bounding boxes and positions of the excavator bucket. Second, the DeepSort tracking module assigns an ID to each bucket and extracts its coordinates and trajectory. Finally, the SlowFast–SL action recognition module, which incorporates the Smooth L1 loss function, identifies the excavator's operating behaviors. Experimental results show that the proposed detection model improves mAP, precision, and recall by 0.69%, 2.3%, and 4.69%, respectively, over YOLOv5s. Compared with YOLOv8s and YOLOv10s, YOLOv5s–GDN improves mAP by 0.129% and 0.269%, respectively, while retaining advantages in inference speed and floating-point operations. For action recognition, SlowFast–SL achieves an average classification accuracy of 98.4%, exceeding C3D (92.6%), I3D (94.3%), TSN (96.4%), ResNet34+LSTM (92.3%), and TimeSformer (96.6%). The proposed model therefore predicts the different action types more accurately, enabling precise identification of excavator operating behavior and providing a foundation for equipment efficiency analysis.
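For readers unfamiliar with the Smooth L1 loss mentioned above, the following is a minimal sketch of its standard definition only. The abstract does not state how the loss is wired into SlowFast–SL, so nothing here should be read as the authors' implementation. The function name `smooth_l1` and the threshold parameter `beta` (set to 1.0 by default) are assumptions made for illustration.

```python
def smooth_l1(pred, target, beta=1.0):
    """Mean Smooth L1 loss over two equal-length sequences of floats.

    Standard definition (not taken from the paper):
      0.5 * d^2 / beta   if d < beta
      d - 0.5 * beta     otherwise
    where d = |pred - target|. `beta` is an assumed default.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equal in length")
    if beta <= 0:
        raise ValueError("beta must be positive")

    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        if d < beta:
            # Quadratic region: small errors are penalized gently.
            total += 0.5 * d * d / beta
        else:
            # Linear region: large errors grow linearly, limiting outlier influence.
            total += d - 0.5 * beta
    return total / len(pred)


if __name__ == "__main__":
    # One small error (0.5) and one large error (3.0).
    print(smooth_l1([0.0, 0.0], [0.5, 3.0]))
```

The loss is quadratic near zero and linear for large errors, so it is less sensitive to outliers than a squared-error loss while remaining smooth around zero.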