Prediction of PM2.5 Concentration Level Based on Random Forest and Meteorological Parameters

doi:10.3778/j.issn.1002-8331.1709-0378

Abstract: Air pollution, especially PM2.5, harms people's physical and mental health and also restricts the economic development of cities. To forecast the PM2.5 concentration level conveniently and accurately, a prediction model based on random forest is proposed. The feature factors are the meteorological data of Taiyuan city from 2013 to 2016, the temporal pattern of PM2.5 concentration change at the prediction site, and its spatio-temporal correlation with the surrounding sites. First, the K-Means algorithm is applied to cluster the raw meteorological data in order to reduce the correlation between different classifiers. Second, undersampling is used to balance the dataset, reducing the impact of class imbalance on classifier performance. Finally, the prediction model is built with random forest, which has good generalization ability. Verification on real data shows that the method achieves good precision, recall, and F-score in predicting the PM2.5 concentration level.

Key words: PM2.5, random forest, meteorological factors, undersampling, prediction


REN Cairong, XIE Gang. Prediction of PM2.5 Concentration Level Based on Random Forest and Meteorological Parameters[J]. Computer Engineering and Applications, 2019, 55(2): 213-220.
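
The abstract names undersampling as the class-balancing step and precision, recall, and F-score as the evaluation measures, without giving details. The sketch below illustrates both ideas under stated assumptions: it uses plain random undersampling (the paper's exact variant and its combination with K-Means are not specified here), the function names `undersample` and `per_class_scores` are hypothetical, and the data are made up for illustration. It is not the authors' implementation.

```python
import random
from collections import Counter, defaultdict


def undersample(rows, labels, seed=0):
    """Randomly drop samples so every class keeps as many as the rarest class.

    Assumption: plain random undersampling; the paper may use another variant.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        for row in rng.sample(group, n_min):
            balanced.append((row, label))
    rng.shuffle(balanced)
    return [r for r, _ in balanced], [l for _, l in balanced]


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f_score)} for each label in y_true."""
    scores = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_score = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f_score)
    return scores


# Made-up toy data: (temperature, wind speed) rows with PM2.5 level labels.
rows = [(20, 3), (21, 2), (19, 4), (22, 3), (5, 1), (4, 0), (18, 5)]
labels = ["good", "good", "good", "good", "heavy", "heavy", "good"]
bal_rows, bal_labels = undersample(rows, labels)
print(Counter(bal_labels))  # both classes now have the same count

# Made-up predictions, only to exercise the metric computation.
y_true = ["good", "good", "heavy", "heavy"]
y_pred = ["good", "heavy", "heavy", "heavy"]
print(per_class_scores(y_true, y_pred))
```

In a full pipeline, the balanced rows would be passed to a random forest classifier, and the scores would be computed on held-out data rather than on the training samples.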


