Abstract:
Under complex geological conditions, it is difficult to determine the source of mine water samples accurately using only the traditional discrimination method based on hydrochemical feature similarity. Accordingly, by integrating machine learning with an optimization algorithm, a novel discrimination model for mine water inrush sources based on the KPCA-PSO-RF framework is proposed. Firstly, based on an analysis of the hydrochemical characteristics of groundwater in the main aquifers, 11 hydrochemical indexes ($\mathrm{K^{+}+Na^{+}}$, $\mathrm{Ca^{2+}}$, $\mathrm{Mg^{2+}}$, $\mathrm{Cl^{-}}$, $\mathrm{SO_4^{2-}}$, $\mathrm{HCO_3^{-}}$, $\mathrm{CO_3^{2-}}$, TDS, TH, TA, pH) are selected as the characteristic parameters for water source discrimination. Three principal components are extracted by kernel principal component analysis (KPCA) as the discriminant factors for model identification. Then, the particle swarm optimization (PSO) algorithm is used to iteratively optimize the hyperparameters of the random forest (RF) algorithm. Finally, 96 groups of groundwater sample data are divided into training and test samples at a ratio of 7:3, the KPCA-PSO-RF model is established, and its discrimination results are compared with those of the RF, PSO-RF and KPCA-GridSearchCV-RF models. The results indicate that: after 5-fold cross validation, Min-Max normalization and the KPCA algorithm are used to reduce the dimension of the hydrochemical data and extract the first three principal components, which effectively eliminates redundancy and overlap between samples and compensates for the inability of the traditional PCA algorithm to handle complex nonlinear samples; PSO is used to iteratively optimize the RF hyperparameters, and the optimal combination (n_estimators = 54, max_depth = 11, min_samples_split = 7) is determined after 103 iterations, which improves the classification accuracy of the model and removes the blindness of manual parameter setting; compared with the other discriminant models, the KPCA-PSO-RF model performs best in terms of accuracy (96.55%), precision (97.32%), recall (98.21%), and $F_1$ score (0.9659), showing better generalization and higher accuracy. Data from 10 groups of mine water samples from Yangcheng Coal Mine were input into the trained model, and the discrimination results were consistent with the water inrush sources measured on site: the water inrush source of the 1308 working face was correctly identified as the Ordovician limestone aquifer, and that of the 3305 working face as the third limestone aquifer.
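
As a rough illustration of the PSO step described above, the following minimal sketch runs a basic global-best particle swarm over the three RF hyperparameters named in the abstract (n_estimators, max_depth, min_samples_split). It is not the authors' implementation. The search bounds, the swarm settings (w, c1, c2, swarm size) and the function `toy_error` are all assumptions introduced here. In particular, `toy_error` is a hypothetical stand-in for the cross-validated classification error of a random forest, so the values it returns say nothing about the reported results.

```python
import random

# Search bounds for the three RF hyperparameters named in the abstract.
# The ranges themselves are assumptions, not values taken from the paper.
BOUNDS = {
    "n_estimators": (10, 200),
    "max_depth": (2, 20),
    "min_samples_split": (2, 10),
}


def toy_error(params):
    """Hypothetical stand-in for the cross-validated error of a random forest.

    A real run would train RF with `params` and return 1 - CV accuracy.
    Here it is a smooth bowl whose minimum sits at an arbitrary point.
    """
    target = {"n_estimators": 60, "max_depth": 10, "min_samples_split": 5}
    return sum(
        ((params[k] - target[k]) / (BOUNDS[k][1] - BOUNDS[k][0])) ** 2
        for k in BOUNDS
    )


def decode(position):
    """Round a continuous particle position to in-bounds integer hyperparameters."""
    return {
        k: min(max(int(round(x)), BOUNDS[k][0]), BOUNDS[k][1])
        for k, x in zip(BOUNDS, position)
    }


def pso(objective, n_particles=20, n_iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` with a basic global-best PSO; returns (params, score)."""
    rng = random.Random(seed)
    keys = list(BOUNDS)
    pos = [[rng.uniform(*BOUNDS[k]) for k in keys] for _ in range(n_particles)]
    vel = [[0.0] * len(keys) for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_score = [objective(decode(p)) for p in pos]
    g = min(range(n_particles), key=pbest_score.__getitem__)
    gbest, gbest_score = pbest[g][:], pbest_score[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, k in enumerate(keys):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (
                    w * vel[i][d]
                    + c1 * r1 * (pbest[i][d] - pos[i][d])
                    + c2 * r2 * (gbest[d] - pos[i][d])
                )
                lo, hi = BOUNDS[k]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            score = objective(decode(pos[i]))
            if score < pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], score
                if score < gbest_score:
                    gbest, gbest_score = pos[i][:], score
    return decode(gbest), gbest_score


best_params, best_score = pso(toy_error)
print(best_params, best_score)
```

To apply the same loop to real data, `toy_error` would be replaced by a function that fits a random forest with the decoded hyperparameters on the KPCA-reduced training samples and returns its cross-validated error. Rounding positions inside `decode` is one simple way to handle integer-valued hyperparameters; other encodings are possible.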