Computer Engineering and Applications (计算机工程与应用), 2023, Vol. 59, Issue 4: 130-138. DOI: 10.3778/j.issn.1002-8331.2108-0375

• Pattern Recognition and Artificial Intelligence •

面向长短期混合数据的MOOC辍学预测策略研究

杨坤融,熊余,张健,储雯   

  1. 重庆邮电大学 通信与信息工程学院,重庆 400065
  2. 重庆邮电大学 教育信息化办公室,重庆 400065
  • 出版日期:2023-02-15 发布日期:2023-02-15

Research on MOOC Dropout Prediction Strategy for Long- and Short-Term Mixed Data

YANG Kunrong, XIONG Yu, ZHANG Jian, CHU Wen   

  1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  2. Office of Educational Information, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Online: 2023-02-15 Published: 2023-02-15

摘要: 针对MOOC中学生行为数据的长短期混合特性,为解决辍学预测中的动态类别不平衡问题,提出一种基于深度学习的辍学预测策略。首先建立以天为时间步长、周为学习周期的新型学生行为时间序列,以捕捉每一时间步长下时间序列数据的短期依赖关系和相邻学习周期之间的长期模式和趋势。然后结合辍学定义的两种不同表达揭示MOOC辍学预测的动态类别不平衡现象。接着引入基于代价敏感的长短期时间序列深度学习模型,以实现对高辍学风险学生的精准预测。最后在KDD Cup 2015数据集上的实验证明,所提策略能够有效帮助MOOC课程教师和教学管理者追踪课程学生在不同时间步长的学习状态,从而动态监控不同学习阶段的辍学行为。

关键词: 大规模开放式在线课程(MOOC), 深度学习, 辍学预测, 时间序列模型, 代价敏感性学习

Abstract: To address the long- and short-term mixed characteristics of student behavior data in MOOCs and the dynamic class imbalance that arises in dropout prediction, a dropout prediction strategy based on deep learning is proposed. First, a new student behavior time series is constructed with days as time steps and weeks as learning cycles, capturing both the short-term dependencies at each time step and the long-term patterns and trends between adjacent learning cycles. Then, two different formulations of the dropout definition are combined to reveal the dynamic class imbalance in MOOC dropout prediction. Next, a cost-sensitive long- and short-term time series deep learning model is introduced to accurately identify students at high risk of dropping out. Finally, experiments on the KDD Cup 2015 dataset show that the proposed strategy can help MOOC instructors and administrators track students' learning status at different time steps and thereby dynamically monitor dropout behavior at different learning stages.

Key words: massive open online courses (MOOC), deep learning, dropout prediction, time series model, cost-sensitive learning
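
The abstract names two ingredients that a small example can make concrete: a behavior series indexed by day (time step) and week (learning cycle), and a cost-sensitive loss that counters class imbalance. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names (`weekly_series`, `class_weights`, `weighted_bce`), the use of raw daily event counts as the feature, the label convention (1 = dropout), and the inverse-frequency weighting scheme are all illustrative choices that do not come from the paper.

```python
import math
from collections import Counter

DAYS_PER_WEEK = 7  # assumption: one learning cycle = 7 daily time steps


def weekly_series(event_days, n_weeks):
    """Count events per day and arrange them as n_weeks rows of 7 daily counts.

    event_days: iterable of zero-based day indices, one entry per logged event.
    Events outside the first n_weeks * 7 days are ignored.
    """
    total_days = n_weeks * DAYS_PER_WEEK
    counts = [0] * total_days
    for day in event_days:
        if 0 <= day < total_days:
            counts[day] += 1
    return [counts[w * DAYS_PER_WEEK:(w + 1) * DAYS_PER_WEEK]
            for w in range(n_weeks)]


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Only classes that actually occur in `labels` receive a weight.
    """
    n_samples = len(labels)
    freq = Counter(labels)
    return {cls: n_samples / (len(freq) * count) for cls, count in freq.items()}


def weighted_bce(y_true, p_pred, weights, eps=1e-12):
    """Mean binary cross-entropy where each sample is scaled by its class weight."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        loss = -math.log(p) if y == 1 else -math.log(1.0 - p)
        total += weights[y] * loss
    return total / len(y_true)


# Hypothetical usage: one student's event days, and labels for four students.
series = weekly_series([0, 0, 1, 7, 13], n_weeks=2)
weights = class_weights([1, 1, 1, 0])
loss = weighted_bce([1, 0], [0.9, 0.2], weights)
```

Because the abstract describes the imbalance as dynamic, one plausible use of this sketch is to recompute `class_weights` from the labels available at each time step rather than once for the whole course; whether the paper does this is not stated in the abstract.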