Computer Engineering and Applications ›› 2014, Vol. 50 ›› Issue (13): 247-251.

• Engineering and Applications •

Text generation on weather falling area description

WU Huanping1, LV Zhongliang2, ZHANG Huaping3, LUO Bing2, GAO Jian3, LI Xiaokang3, HE Guohao2, WANG Yongchao4

  1. National Climate Center, Beijing 100081, China
  2. National Meteorological Center, Beijing 100081, China
  3. Beijing Institute of Technology, Beijing 100081, China
  4. China University of Geosciences, Beijing 100083, China
  • Online: 2014-07-01 Published: 2015-05-12

Abstract: Automatic or semi-automatic generation of text for weather forecasting and meteorological services places high demands on output quality: the text must be accurate, efficient to produce and logically sound, and it must also read as natural language. Mature techniques that meet all of these requirements are still lacking. Based on an in-depth analysis of the “Weather Bulletin” texts issued daily by the Central Meteorological Observatory, this paper combines methods from Geographic Information Science (GIS) and Natural Language Processing (NLP) to propose a basic framework and workflow for generating text descriptions of weather falling areas. It discusses four key technical issues in detail: analysis of historical texts and feature extraction, geographic area partitioning, spatial analysis of meteorological elements, and text organization and generation. A technical implementation is given for each. A comparison between the automatically generated text and text written manually by forecasters indicates that text generation targeted at a specific domain has good application prospects in weather forecasting and meteorological services.

Key words: Natural Language Processing (NLP), text feature extraction, meteorological data spatial analysis, text auto-generation
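
The area partitioning, spatial analysis and text generation stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rectangular regions, the `min_share` threshold, the grid of cells and the sentence template are all assumptions made for illustration, and the feature extraction from historical bulletins is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A named rectangular region (hypothetical, not a real boundary)."""
    name: str
    lon_min: float
    lon_max: float
    lat_min: float
    lat_max: float

    def contains(self, lon: float, lat: float) -> bool:
        # Half-open intervals so a cell never belongs to two adjacent regions.
        return (self.lon_min <= lon < self.lon_max
                and self.lat_min <= lat < self.lat_max)


# Area partitioning: three illustrative boxes side by side.
REGIONS = [
    Region("Region A", 100.0, 105.0, 25.0, 30.0),
    Region("Region B", 105.0, 110.0, 25.0, 30.0),
    Region("Region C", 110.0, 115.0, 25.0, 30.0),
]


def covered_regions(cells, regions, min_share=0.5):
    """Spatial analysis: keep regions where the share of grid cells
    flagged with the phenomenon is at least `min_share` (assumed rule).

    `cells` is a list of (lon, lat, hit) tuples, `hit` being a bool.
    """
    names = []
    for region in regions:
        inside = [hit for lon, lat, hit in cells if region.contains(lon, lat)]
        if inside and sum(inside) / len(inside) >= min_share:
            names.append(region.name)
    return names


def describe(phenomenon, names):
    """Text generation: fill a fixed sentence template (assumed wording)."""
    if not names:
        return f"No {phenomenon} is expected in the regions considered."
    if len(names) == 1:
        place = names[0]
    else:
        place = ", ".join(names[:-1]) + " and " + names[-1]
    return f"{phenomenon.capitalize()} is expected in {place}."


# Synthetic 1-degree grid of cell centres; the phenomenon covers lon > 104.
CELLS = [
    (100.5 + i, 25.5 + j, (100.5 + i) > 104.0)
    for i in range(15)
    for j in range(5)
]

if __name__ == "__main__":
    print(describe("heavy rain", covered_regions(CELLS, REGIONS)))
```

In this toy setup only one fifth of the cells in Region A are flagged, so it falls below the threshold and is left out of the sentence. A real system would replace the boxes with actual geographic units and the template with phrasing learned from historical bulletins.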