Supervised Ranking for Plagiarism Source Retrieval

Kyle Williams; Hung-Hsuan Chen; C. Lee Giles

CLEF 2014 Evaluation Labs and Workshop Working Notes Papers | January 2014

Source retrieval uses a search engine to retrieve candidate sources of plagiarism for a given suspicious document so that more accurate comparisons can be made. We describe a source retrieval strategy that uses a supervised method to classify and rank search engine results as potential sources of plagiarism without retrieving the documents themselves. In the 2014 PAN Source Retrieval task, our approach achieved the highest precision (0.57) and F1 score (0.47).
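
The idea of classifying and ranking search results from their metadata alone can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two features (word overlap between a chunk of the suspicious document and the result snippet, and the reciprocal of the search rank), the toy training data, the logistic regression model, and all function names are hypothetical.

```python
import math

def features(chunk, snippet, rank):
    """Build a feature vector from result metadata only (no document download).

    Hypothetical features: bias term, Jaccard word overlap, reciprocal rank.
    """
    a, b = set(chunk.lower().split()), set(snippet.lower().split())
    jaccard = len(a & b) / len(a | b) if (a | b) else 0.0
    return [1.0, jaccard, 1.0 / rank]

def score(w, x):
    """Logistic score in (0, 1): estimated probability of being a source."""
    return 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))

def train(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights with plain stochastic gradient steps."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for x, label in zip(X, y):
            p = score(w, x)
            w = [wi + lr * (label - p) * xi for wi, xi in zip(w, x)]
    return w

def rank_results(w, chunk, results, threshold=0.5):
    """Keep results classified as likely sources, ordered best first."""
    scored = [(score(w, features(chunk, r["snippet"], r["rank"])), r["url"])
              for r in results]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)

# Toy labeled data (invented): [bias, overlap, 1/rank] -> 1 if a true source.
X_TRAIN = [[1.0, 0.8, 1.0], [1.0, 0.6, 0.5], [1.0, 0.1, 1.0], [1.0, 0.05, 0.2]]
Y_TRAIN = [1, 1, 0, 0]
WEIGHTS = train(X_TRAIN, Y_TRAIN)

CHUNK = "the quick brown fox jumps over the lazy dog"
RESULTS = [
    {"url": "http://example.org/b", "rank": 1,
     "snippet": "stock markets fell sharply on monday"},
    {"url": "http://example.org/a", "rank": 3,
     "snippet": "the quick brown fox jumps over the lazy dog today"},
]

for prob, url in rank_results(WEIGHTS, CHUNK, RESULTS):
    print(f"{prob:.2f}  {url}")
```

In this sketch only the results that pass the classifier threshold would be downloaded for detailed comparison, which is one plausible way such filtering could favor precision.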