Depression Group's Internet Social Interaction and Preliminary Screening Algorithm for Weibo with Suspected Depression

doi:10.3778/j.issn.1002-8331.2007-0186

Abstract

Abstract: Research on depression using social network data often requires manually labeling users as depressed or non-depressed, which is time-consuming and laborious. Based on the collection and analysis of college students' Weibo social data, this paper proposes a preliminary screening algorithm for suspected-depression Weibo posts by college students, built on depression keywords and semantic expansion: the comprehensive depression keyword method. The method constructs a basic keyword table and expands it semantically with the word embedding model WORD2VEC to form a comprehensive depression keyword table. It then uses this table to compute the semantic similarity of each Weibo post under test, and uses that similarity to decide whether the post is a suspected-depression post. Experimental results on a Weibo dataset of college students at universities in the capital show that the comprehensive depression keyword method outperforms the SDS questionnaire segmentation method and the expert keyword method in screening accuracy. The method can quickly and automatically screen out suspected-depression posts, which make up a very small share of the total, from a massive volume of college students' Weibo posts. This reduces the experts' annotation workload, improves annotation efficiency, and provides a sound data-processing basis for the subsequent accurate identification of patients with depression (a classification problem).

Key words: depression, social media, topic model, social behavior analysis, Weibo recognition
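
The screening pipeline summarized in the abstract (basic keyword table, embedding-based expansion, similarity-based screening) can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the paper: the toy 3-dimensional vectors stand in for embeddings learned by a word2vec-style model, the English tokens stand in for segmented Weibo text, and the thresholds, the function names, and the scoring rule (highest cosine similarity between any token and any keyword) are hypothetical.

```python
import math

# Toy embeddings standing in for vectors from a word2vec-style model.
# All values are invented for illustration only.
TOY_VECTORS = {
    "sad": (0.9, 0.1, 0.0),
    "hopeless": (0.8, 0.2, 0.1),
    "insomnia": (0.7, 0.3, 0.0),
    "tired": (0.6, 0.4, 0.1),
    "exam": (0.1, 0.9, 0.2),
    "basketball": (0.0, 0.2, 0.9),
    "dinner": (0.1, 0.3, 0.8),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_keywords(base, vectors, min_sim=0.95):
    """Add every vocabulary word whose vector is close to some base keyword.

    min_sim is a hypothetical expansion threshold.
    """
    expanded = set(base)
    known_base = [k for k in base if k in vectors]
    for word, vec in vectors.items():
        if any(cosine(vec, vectors[k]) >= min_sim for k in known_base):
            expanded.add(word)
    return expanded


def post_score(tokens, keywords, vectors):
    """Highest similarity between any post token and any keyword.

    Tokens and keywords without a vector are skipped.
    """
    best = 0.0
    for token in tokens:
        if token not in vectors:
            continue
        for keyword in keywords:
            if keyword in vectors:
                best = max(best, cosine(vectors[token], vectors[keyword]))
    return best


def is_suspected(tokens, keywords, vectors, threshold=0.8):
    """Flag a post for expert review when its score reaches the threshold."""
    return post_score(tokens, keywords, vectors) >= threshold


# Usage: expand a two-word base table, then screen two already-segmented posts.
keywords = expand_keywords({"sad", "hopeless"}, TOY_VECTORS)
flag_a = is_suspected(["tired", "insomnia", "exam"], keywords, TOY_VECTORS)
flag_b = is_suspected(["basketball", "dinner"], keywords, TOY_VECTORS)
```

In this toy setup the expansion step adds "insomnia" to the table, the first post is flagged, and the second is not. With real data the vectors would come from a trained embedding model and the posts from a Chinese word segmenter, and flagged posts would still go to experts for labeling, since the step is only a preliminary screen.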

ZHA Guoqing, HU Chaoran, SUN Mingtao, WANG Deqing. Depression Group's Internet Social Interaction and Preliminary Screening Algorithm for Weibo with Suspected Depression[J]. Computer Engineering and Applications, 2022, 58(1): 158-164.
