Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (20): 254-265. DOI: 10.3778/j.issn.1002-8331.2210-0230
• Big Data and Cloud Computing •
YIN Chunyong, CHEN Shuangshuang
Online: 2023-10-15
Published: 2023-10-15
尹春勇,陈双双
YIN Chunyong, CHEN Shuangshuang. Data Stream Classification Method Combining Micro-Clustering and Active Learning[J]. Computer Engineering and Applications, 2023, 59(20): 254-265.
尹春勇, 陈双双. 结合微聚类和主动学习的流分类方法[J]. 计算机工程与应用, 2023, 59(20): 254-265.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.2210-0230