Coal quality big data mining method and application based on SOM plus K-means two-stage clustering

韩东辉; 唐跃刚

doi:10.13199/j.cnki.cst.2021-1048

Abstract

Large volumes of coal quality data accumulate during coal development and utilization, and these data contain potentially valuable information. Mining this hidden information can yield new knowledge that is applicable to production and construction. Coal resource advantage areas formed under different geological conditions tend to appear as clusters in the data distribution. Four raw coal parameters of the Taiyuan Formation in the six major coalfields of Shanxi Province were selected: moisture (M_ad), ash yield (A_d), volatile matter (V_daf) and total sulfur (S_t,d). After preprocessing, the data were processed with a two-stage SOM+k-means algorithm: they were first processed with a self-organizing map (SOM) neural network, and the SOM output was then passed to k-means cluster analysis as the second stage. In accordance with the relevant national standards, the data in the two resulting clusters were plotted on a map by raw coal quality, and advantage areas were delineated. The results show that advantage areas of high-quality and medium-quality coal account for 90.1% and 24.1% of the area in the first and second clusters, respectively, indicating that raw coal quality in the first cluster is higher than in the second. This demonstrates the feasibility of applying data mining algorithms to coal quality big data, broadens the ways in which coal quality data can be used, and offers new ideas for the use and development of coal quality databases.
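The two-stage procedure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state the preprocessing method, the SOM grid size, the training schedule, or the k-means settings, so the z-score standardization, the 4x4 grid, the decay schedules, and every function name below are assumptions chosen for illustration. The value k=2 is inferred from the two clusters reported. Input is expected as one row per sample with the four columns M_ad, A_d, V_daf and S_t,d.

```python
import numpy as np

def zscore(X):
    """Assumed preprocessing: scale each column to zero mean, unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def train_som(X, rows=4, cols=4, n_iter=2000, lr0=0.5,
              sigma0=2.0, sigma_end=0.3, seed=0):
    """Stage 1: fit a small rectangular SOM and return its codebook,
    an array of shape (rows * cols, n_features)."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of every node, used for neighbourhood distances.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)],
                    dtype=float)
    # Initialise node weights from randomly drawn samples.
    W = X[rng.choice(len(X), rows * cols, replace=True)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                       # linear decay
        sigma = sigma0 * (sigma_end / sigma0) ** frac  # geometric decay
        x = X[rng.integers(len(X))]
        # Best-matching unit: node whose weights are closest to x.
        bmu = np.argmin(((W - x) ** 2).sum(axis=1))
        # Gaussian neighbourhood around the BMU on the grid.
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        W += lr * h[:, None] * (x - W)
    return W

def nearest(P, C):
    """Index of the nearest row of C for every row of P."""
    return np.argmin(((P[:, None, :] - C[None, :, :]) ** 2).sum(axis=2),
                     axis=1)

def kmeans(P, k=2, n_iter=100, seed=0):
    """Stage 2: plain Lloyd's k-means on the points P."""
    rng = np.random.default_rng(seed)
    centers = P[rng.choice(len(P), k, replace=False)].copy()
    for _ in range(n_iter):
        labels = nearest(P, centers)
        # Keep the old center if a cluster happens to be empty.
        new = np.array([P[labels == j].mean(axis=0)
                        if np.any(labels == j) else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return nearest(P, centers)

def two_stage_cluster(X, k=2):
    """SOM codebook -> k-means on codebook -> label each sample via its BMU."""
    Z = zscore(np.asarray(X, dtype=float))
    W = train_som(Z)
    node_labels = kmeans(W, k=k)
    return node_labels[nearest(Z, W)]
```

Calling `two_stage_cluster(X)` on an `(n_samples, 4)` array returns one cluster index per sample. Running k-means on the SOM codebook rather than on the raw samples is the usual motivation for this kind of two-stage scheme: the second stage works on a small, smoothed set of prototype vectors instead of the full data set.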
