Research on a collision avoidance path planning method for mining and anchoring equipment in the narrow and restricted space of tunneling laneways

杨文娟; 张冉; 张旭辉; 田思昊; 王泽尧; 郑西利; 任志腾; 万继成; 杜昱阳; 张寒冰

doi:10.12438/cst.2024-0998

Abstract

To address collision detection and collision avoidance path planning during the collaborative operation of mining and anchoring equipment in the narrow, restricted spaces of underground coal mines, this paper proposes a collision detection and collision avoidance path planning method for mining and anchoring equipment in tunneling laneways based on Deep Reinforcement Learning (DRL). LiDAR is used to reconstruct the laneway environment in real time, and a path planning training model for the tunneling equipment and the drilling and anchoring equipment is built in a virtual environment. In the constructed three-dimensional virtual scene of the tunneling face, a hybrid hierarchical bounding box method performs virtual collision detection among the mining and anchoring equipment, the drilling and anchoring equipment, and the tunneling laneway. To suit the motion characteristics of the mining and anchoring equipment, a multi-agent experience sharing mechanism is introduced into the Soft Actor-Critic (SAC) algorithm, yielding the MAES-SAC (Multi-Agent Experience Sharing) algorithm; the agents are trained after defining their state and action spaces and designing a corresponding reward and penalty mechanism. Simulation results show that, compared with the PPO and SAC algorithms, MAES-SAC raises the average reward by 8.21% and 7.43%, raises the maximum reward by 0.25% and 0.14%, reduces the number of steps needed to reach the maximum reward by 3.06% and 6.63%, and lowers the standard deviation by 10.07% and 6.99%, respectively. Finally, an experimental platform for the collision avoidance path planning and collision perception system is built, and virtual-physical motion synchronization tests and collision avoidance trajectory planning experiments verify the feasibility and accuracy of the planned paths. The method offers a new approach to collision perception and collaborative collision avoidance path planning for tunneling equipment groups in underground coal mines and supports the intelligent construction of tunneling faces.
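The abstract does not say how the multi-agent experience sharing mechanism is implemented. One common way to realize such a mechanism with an off-policy learner like SAC is a single replay buffer that every agent writes to and samples from, so that each agent also learns from the other agents' transitions. The sketch below illustrates that idea only; `Transition`, `SharedReplayBuffer`, the buffer capacity, and the toy states and rewards are hypothetical names and values chosen for illustration, not taken from the paper.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple, Sequence


class Transition(NamedTuple):
    """One environment step recorded by one agent (hypothetical layout)."""
    agent_id: int
    state: Sequence[float]
    action: Sequence[float]
    reward: float
    next_state: Sequence[float]
    done: bool


class SharedReplayBuffer:
    """A single replay buffer shared by all agents (assumed design)."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        # deque with maxlen drops the oldest transition once full.
        self._storage: Deque[Transition] = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, transition: Transition) -> None:
        self._storage.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement over all agents' data.
        k = min(batch_size, len(self._storage))
        return self._rng.sample(list(self._storage), k)

    def __len__(self) -> int:
        return len(self._storage)


# Two agents take turns writing into the same buffer.
buffer = SharedReplayBuffer(capacity=4)
for step in range(3):
    for agent_id in (0, 1):
        buffer.add(
            Transition(agent_id, [float(step)], [0.0], -1.0, [float(step + 1)], False)
        )

# Any agent's update can now draw on experience from both agents.
batch = buffer.sample(batch_size=4)
```

In a full training loop, each agent's SAC update would draw its minibatch from this shared buffer instead of a private one. Whether the paper shares all transitions or only a selected subset is not stated in the abstract.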
