Android Malware Detection Based on Ensemble Learning Voting Algorithm

Computer Engineering and Applications, 2020, Vol. 56, Issue (22): 74-82. DOI: 10.3778/j.issn.1002-8331.1910-0030

ZHAO Yuxin, Nurbol, AI Zhuang

1. College of Software, Xinjiang University, Urumqi 830046, China
2. Network Centre, Xinjiang University, Urumqi 830046, China
3. College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China

Online: 2020-11-15 Published: 2020-11-13

Abstract:

This paper proposes MASV (Soft-Voting Algorithm), an Android malware detection method based on an ensemble learning voting algorithm, to classify unknown applications effectively. The experimental data are drawn from a known open-source dataset comprising 213,256 benign applications and 18,363 malicious applications. First, the SVM-RFE feature selection algorithm reduces the dimensionality of the features. Then, an ensemble of classifiers, namely SVM (Support Vector Machine), K-NN (K-Nearest Neighbor), NB (Naïve Bayes), CART (Classification and Regression Tree), and RF (Random Forest), is used to distinguish malicious applications from benign ones. The gradient ascent algorithm determines the weights of the base classifiers in the ensemble's soft vote. Experimental results show that the method achieves an accuracy of 99.27% in malicious application detection.

Key words: Android malicious application, ensemble learning, voting algorithm
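
The soft-voting step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `soft_vote`, the probability values, and the weights below are all hypothetical, and the weights are hand-picked placeholders rather than values learned by gradient ascent as in the paper. The sketch only shows how a weighted average of per-classifier class probabilities yields the final label.

```python
from typing import Sequence


def soft_vote(probas: Sequence[Sequence[float]], weights: Sequence[float]) -> int:
    """Return the class index with the highest weighted-average probability.

    probas[i][c] is base classifier i's predicted probability for class c.
    weights[i] is the (non-negative) weight given to base classifier i.
    """
    if len(probas) != len(weights):
        raise ValueError("need exactly one weight per base classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(probas[0])
    # Weighted average probability of each class across all base classifiers.
    scores = [
        sum(w * p[c] for w, p in zip(weights, probas)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=scores.__getitem__)


# Hypothetical outputs of five base classifiers (SVM, K-NN, NB, CART, RF)
# for one app, as [P(benign), P(malicious)]. Values are made up.
example_probas = [
    [0.40, 0.60],  # SVM
    [0.55, 0.45],  # K-NN
    [0.70, 0.30],  # NB
    [0.20, 0.80],  # CART
    [0.35, 0.65],  # RF
]

equal_weights = [1, 1, 1, 1, 1]   # plain (unweighted) soft vote
skewed_weights = [1, 1, 5, 1, 1]  # placeholder weights favouring NB

print(soft_vote(example_probas, equal_weights))   # class 1 (malicious)
print(soft_vote(example_probas, skewed_weights))  # class 0 (benign)
```

With equal weights the summed probabilities are 2.20 for benign and 2.80 for malicious, so the vote is malicious. Raising the weight of the third classifier to 5 shifts the sums to 5.0 and 4.0, which flips the decision. This is why the choice of base-classifier weights matters in a soft vote.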

ZHAO Yuxin, Nurbol, AI Zhuang. Android Malware Detection Based on Ensemble Learning Voting Algorithm[J]. Computer Engineering and Applications, 2020, 56(22): 74-82.

