Computer Engineering and Applications (计算机工程与应用), 2014, Vol. 50, Issue 11: 21-27.

• Doctoral Forum •

基于广义话题结构语料库的语体对比研究——以报告体与小说体为例

尚  英,宋  柔   

  1. 北京语言大学 语言信息处理研究所,北京 100083
  • 出版日期:2014-06-01 发布日期:2015-04-08

Comparative study on different styles based on generalized topic structure corpus——case study of report style and novel style

SHANG Ying, SONG Rou   

  1. Centre of Language Information Processing, Beijing Language and Culture University, Beijing 100083, China
  • Online: 2014-06-01 Published: 2015-04-08

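摘要 (Abstract): Traditional stylistics has mostly described styles in terms of vocabulary, sentence patterns, and rhetoric. In recent years, scholars have begun to give more weight to stylistic factors in grammar research, but current studies are mostly micro-level analyses that lack the support of a macro-level theoretical system, which makes it difficult to reach the deeper problems of style. Generalized topic theory, following the characteristics of Chinese discourse and taking the punctuation clause with its clearly defined boundaries as its basis, proposes the concepts of generalized topic and topic structure. From the perspective of generalized topics, the paper compares the differences between the work report style and the novel style, covering named-entity topics, adverbial topics, predicate topics, logical topics, and relational topics, among others, and gives a reasonable explanation of the causes of these differences. Although work reports and novels differ markedly in style, no one has compared them from the topic-comment perspective, and there has never been a statistical analysis on a large-scale corpus. This work enriches the theory of statistical stylistics and lays a solid foundation for applications such as automatic computer analysis of topic structure, automatic assessment of composition level, and classification of texts by style.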

关键词 (Keywords): style, punctuation clause, generalized topic, topic structure

Abstract: Traditional stylistics describes style mainly in terms of vocabulary, sentence patterns, and rhetoric. In recent years, scholars have begun to pay more attention to stylistic factors in grammar research. However, most current studies are micro-level analyses without a macro-level theoretical framework, which makes it hard for them to uncover deeper stylistic issues. Based on the features of Chinese discourse, and taking punctuation clauses with definite boundaries as its basic unit, generalized topic theory proposes the concepts of generalized topic and topic structure. From the perspective of generalized topics, this paper compares the stylistic differences between work reports and novels, covering named-entity topics, adverbial topics, predicate topics, logical topics, and relational topics, among others. It also proposes a reasonable explanation for these differences. Although work reports and novels differ markedly in style, no one has previously compared them from the topic-comment perspective, nor has there been any statistical analysis on a large-scale corpus. This work enriches the theory of statistical stylistics and lays a solid foundation for applications such as automatic computer analysis of topic structure, automatic assessment of composition level, and text classification by style.

Key words: style, punctuation clause, generalized topic, topic structure