Method for paraphrase extraction from the news text corpus


The paper discusses the automatic extraction of paraphrases used in rewriting. The authors propose a method for extracting paraphrases from English news text corpora. The method combines syntactic rules, developed to define phrases, with synsets that identify synonymous words, and is applied to a purpose-built corpus of BBC news texts. The method is implemented with the Natural Language Toolkit, a Universal Dependencies parser, and WordNet.
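The general idea of pairing a syntactic rule with synset lookup can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the actual rules, so the ADJ + NOUN pattern, the function names, and the small `TOY_SYNSETS` table (a stand-in for WordNet) are all assumptions made for illustration. Input sentences are assumed to be already lemmatized and tagged with Universal Dependencies part-of-speech tags.

```python
from itertools import product

# Hypothetical stand-in for WordNet: lemma -> set of synset identifiers.
# A real pipeline would query WordNet instead of this hand-made table.
TOY_SYNSETS = {
    "big": {"large.a.01"},
    "large": {"large.a.01"},
    "firm": {"company.n.01"},
    "company": {"company.n.01"},
    "new": {"new.a.01"},
    "law": {"law.n.01"},
}


def extract_phrases(tagged, pattern=("ADJ", "NOUN")):
    """Return each contiguous run of lemmas whose POS tags match `pattern`."""
    n = len(pattern)
    return [
        tuple(lemma for lemma, _ in tagged[i:i + n])
        for i in range(len(tagged) - n + 1)
        if tuple(pos for _, pos in tagged[i:i + n]) == pattern
    ]


def are_synonyms(a, b):
    """Two lemmas match if they are identical or share at least one synset."""
    return a == b or bool(TOY_SYNSETS.get(a, set()) & TOY_SYNSETS.get(b, set()))


def is_paraphrase(p, q):
    """Phrases are paraphrases if they differ but align word by word."""
    return (
        p != q
        and len(p) == len(q)
        and all(are_synonyms(a, b) for a, b in zip(p, q))
    )


def find_paraphrases(tagged_a, tagged_b, pattern=("ADJ", "NOUN")):
    """Compare every phrase of one sentence with every phrase of the other."""
    phrases_a = extract_phrases(tagged_a, pattern)
    phrases_b = extract_phrases(tagged_b, pattern)
    return [(p, q) for p, q in product(phrases_a, phrases_b) if is_paraphrase(p, q)]


# Illustrative, hand-tagged input (lemma, UD POS tag); not taken from the paper's corpus.
sent_a = [("the", "DET"), ("big", "ADJ"), ("firm", "NOUN"), ("back", "VERB"),
          ("the", "DET"), ("new", "ADJ"), ("law", "NOUN")]
sent_b = [("a", "DET"), ("large", "ADJ"), ("company", "NOUN"), ("oppose", "VERB"),
          ("the", "DET"), ("new", "ADJ"), ("law", "NOUN")]

pairs = find_paraphrases(sent_a, sent_b)
print(pairs)
```

Identical phrases are excluded on purpose, since a phrase repeated verbatim is not a paraphrase. In a fuller pipeline the tagged input would come from a Universal Dependencies parser and the synonym check from WordNet synsets.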



paraphrase extraction, news text corpus, syntactic rules, synsets, Universal Dependencies, WordNet


Manuilov I. Method for paraphrase extraction from the news text corpus / Illia Manuilov, Svitlana Petrasova // Computational Linguistics and Intelligent Systems. — Lviv : Lviv Politechnic Publishing House, 2019. — Vol 2 : Proceedings of the 3rd International conference, COLINS 2019. Workshop, Kharkiv, Ukraine, April 18-19, 2019. — P. 69–70. — (Student section).