Semantic similarity identification for short text fragments

Abstract

The paper reviews existing methods for semantic similarity identification, such as methods based on the distance between concepts and methods based on lexical intersection. We propose a method for measuring the semantic similarity of short text fragments, namely pairs of sentences. We also created a corpus of mass-media texts. It contains Kharkiv news articles sorted by source and date. We then annotated the texts, defining the semantic similarity of sentences manually. In this way, we created a training corpus for our future system.
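
For illustration only, the lexical-intersection family of methods mentioned above can be sketched with a Jaccard word-overlap score. This is a generic textbook measure, not the method proposed in the paper; the function names `tokenize` and `jaccard_similarity` and the example sentences are hypothetical.

```python
import re


def tokenize(sentence: str) -> set[str]:
    """Lowercase a sentence and return its set of distinct word tokens."""
    return set(re.findall(r"\w+", sentence.lower()))


def jaccard_similarity(a: str, b: str) -> float:
    """Ratio of shared distinct words to all distinct words in both sentences."""
    words_a, words_b = tokenize(a), tokenize(b)
    if not words_a and not words_b:
        # Assumption: two empty sentences are treated as identical.
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


# Hypothetical sentence pair, not taken from the authors' corpus.
s1 = "The city council opened a new park"
s2 = "A new park was opened by the city council"
print(jaccard_similarity(s1, s2))
```

A score like this ignores word order and synonyms, which is one reason concept-distance methods are often considered alongside it.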

Keywords

semantic similarity, short text fragments, corpus of mass-media text, automatic identification

Citation

Chuiko V. Semantic similarity identification for short text fragments / Viktoriia Chuiko, Nina Khairova // Computational Linguistics and Intelligent Systems. — Lviv : Lviv Polytechnic Publishing House, 2019. — Vol 2 : Proceedings of the 3rd International conference, COLINS 2019. Workshop, Kharkiv, Ukraine, April 18-19, 2019. — P. 57–59. — (Student section).