Computer Engineering and Applications, 2024, Vol. 60, Issue (23): 155-167. DOI: 10.3778/j.issn.1002-8331.2402-0140

• Pattern Recognition and Artificial Intelligence •

Distributed IoT Device Identification Based on Federated Learning

ZOU Xuxi, ZHOU Zhongran, WANG Honglan, LI Fei, GU Yalin, WEI Xunhu, LI Jing   

  1. Nanjing Nari Information and Communication Technology Co., Ltd., Nanjing 211106, China
  2. College of Computer Science and Technology/College of Artificial Intelligence, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  • Published: 2024-12-01 Online: 2024-11-29

Abstract: Traditional IoT device identification methods usually rely on centralized training, in which private traffic from edge devices is gathered on a central server to learn fingerprint extraction and identification. Centralized training, however, raises data privacy concerns and creates a single point of failure. To address these problems, this paper proposes a distributed IoT device identification method based on federated learning. On the edge-device side, a lightweight device fingerprint identification model extracts temporal information and inter-feature information from network traffic sessions to generate identifiable fingerprints, and trains an efficient classifier to identify them. On the central-server side, a heterogeneous federated learning algorithm based on generative knowledge distillation trains a variational generator to aggregate local information without proxy data and uses the aggregated knowledge to guide the local models, thereby addressing statistical heterogeneity in distributed scenarios. Extensive experiments on four public benchmark datasets, with comparisons against state-of-the-art federated learning methods and device fingerprinting methods, verify that the proposed method improves both the accuracy and the efficiency of distributed IoT device identification.

Key words: IoT device identification, device fingerprint, federated learning, knowledge distillation
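
As a rough illustration of the two generic building blocks named in the abstract, federated aggregation and knowledge distillation, the following minimal sketch may help. It is not the paper's algorithm: the abstract does not give the model, the variational generator, or the loss functions, so the sketch substitutes plain sample-weighted parameter averaging (FedAvg-style) and a standard KL-divergence distillation loss. The function names `fedavg`, `softmax`, and `kd_loss` and all numbers are hypothetical.

```python
import math

def fedavg(client_params, client_sizes):
    """Sample-weighted average of client parameter vectors.

    Assumption: plain FedAvg-style aggregation, used here only as a
    stand-in for the server-side step; the paper's generator-based
    aggregation is not specified in the abstract.
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(n * p[i] for p, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kd_loss(student_logits, teacher_probs):
    """KL(teacher || student): a standard distillation loss.

    Assumption: the teacher distribution stands in for the aggregated
    knowledge that guides a local model.
    """
    student_probs = softmax(student_logits)
    return sum(
        q * math.log(q / p)
        for q, p in zip(teacher_probs, student_probs)
        if q > 0
    )

# Hypothetical usage: two edge clients holding 1 and 3 local samples.
global_params = fedavg([[1.0, 1.0], [3.0, 3.0]], [1, 3])
local_penalty = kd_loss([2.0, 0.0], [0.5, 0.5])
print(global_params, local_penalty)
```

Weighting by local sample count keeps clients with more traffic from being under-represented in the global model. The distillation term is zero when the local prediction matches the guiding distribution and grows as the two diverge.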