Statistics and auto-retrieving of discourse markers

Abstract: Discourse markers (DMs) have recently attracted growing attention in natural language processing. This research aims to sort out DMs systematically, top-down, on the basis of a large-scale corpus. Two genre-specific corpora are built, each containing 5 million characters. Software tools such as UltraEdit are used to retrieve and count the DMs. A detailed analysis of their usage shows that DMs are not confined to spoken discourse and that each genre has its own usage characteristics. An extraction algorithm is given and implemented in C#, and a test shows that it is effective.

Key words: computer-assisted, discourse markers (DMs), calculation, filtration

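Abstract (Chinese version): Discourse markers in text are gradually gaining attention in natural language processing. The goal of this research is to sort out discourse markers top-down on the basis of a large-scale corpus. Two genre corpora of 5 million characters each were built, and software such as UltraEdit was used to extract and count the discourse markers. A detailed analysis of their usage found that discourse markers are not used only in spoken language and that each genre has its own usage characteristics. Based on the discourse markers obtained, an algorithm for extracting them from a large-scale corpus is given and implemented in code, which reduces manual work and improves recognition efficiency.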

Key words (Chinese version): computer-assisted, discourse markers, quantification, filtering

KAN Minggang. Statistics and auto-retrieving of discourse markers[J]. Computer Engineering and Applications, 2012, 48(12): 19-23.

阚明刚. 话语标记的计量与自动过滤提取[J]. 计算机工程与应用, 2012, 48(12): 19-23.
