Study on model for plagiarism-detection of scientific papers based on sentence similarity

Computer Engineering and Applications, 2011, Vol. 47, Issue 24: 199-201.

• Graphics, Images, and Pattern Recognition •


LENG Qiangkui¹, QIN Yuping¹, WANG Chunli²

1. College of Information Science and Engineering, Bohai University, Jinzhou, Liaoning 121000, China
2. College of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning 116026, China

Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-08-21 Published: 2011-08-21


Abstract

A new model for plagiarism detection in scientific papers, based on sentence similarity, is presented. Large-scale text collections are first screened quickly with a Local Word-Frequency Fingerprint (LWFF) algorithm to find documents suspected of plagiarism. Sentence similarity between source and target texts is then computed from their Longest Sorted Common Subsequence (LSCS), which lets the algorithm mark the plagiarized passages and present them as evidence. Experiments on SOGOU-T, a standard Chinese dataset, show that the model has a strong capacity for mining local information and partly overcomes the low precision of existing plagiarism-detection methods for scientific papers.

Key words: sentence similarity, plagiarism detection, local word frequency, Longest Sorted Common Subsequence (LSCS)
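
The two stages described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define LWFF or LSCS precisely, so everything below is an assumption rather than the authors' method: the coarse filter compares word counts over fixed-size windows with cosine similarity, and the sentence score uses the standard longest common subsequence (LCS) over tokens, normalized by the longer sentence. The names `window_fingerprints`, `is_suspect`, and `sentence_similarity`, along with the window size and threshold, are hypothetical.

```python
from collections import Counter
from math import sqrt


def lcs_length(a, b):
    """Length of the longest common subsequence of token lists a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def sentence_similarity(src, dst):
    """Assumed normalization: LCS length divided by the longer sentence length."""
    if not src or not dst:
        return 0.0
    return lcs_length(src, dst) / max(len(src), len(dst))


def window_fingerprints(tokens, size=8):
    """Hypothetical local fingerprint: word counts over non-overlapping windows."""
    return [Counter(tokens[i:i + size]) for i in range(0, len(tokens), size)]


def cosine(c1, c2):
    """Cosine similarity between two word-count vectors."""
    dot = sum(c1[w] * c2[w] for w in c1)
    norm = sqrt(sum(v * v for v in c1.values())) * sqrt(sum(v * v for v in c2.values()))
    return dot / norm if norm else 0.0


def is_suspect(src_tokens, dst_tokens, threshold=0.8):
    """Coarse filter (assumed): flag the pair if any two local windows look alike."""
    return any(
        cosine(f1, f2) >= threshold
        for f1 in window_fingerprints(src_tokens)
        for f2 in window_fingerprints(dst_tokens)
    )
```

For example, `sentence_similarity("a b c d".split(), "a x c d".split())` evaluates to 0.75, because three of the four tokens survive in order. In a pipeline of this shape, only document pairs that pass `is_suspect` would go on to the more expensive sentence-by-sentence comparison, and the matched subsequence could then be highlighted as evidence.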


LENG Qiangkui, QIN Yuping, WANG Chunli. Study on model for plagiarism-detection of scientific papers based on sentence similarity[J]. Computer Engineering and Applications, 2011, 47(24): 199-201.


