Unsupervised acquisition of morphological resources for Ukrainian
Date
2017
Authors
Thierry Hamon, Natalia Grabar
Publisher
National Technical University «KhPI»
Abstract
The availability of morphological resources is an important and recurrent need because such resources enable the development of NLP tools and applications for a given language. They provide the basic information these tools need to perform more sophisticated processing (information retrieval, morphosyntactic tagging, etc.). We propose to acquire morphological resources for the Ukrainian language. The proposed method exploits corpora to extract morphologically related words. It has two versions: without and with processing of prefixes. The association strength between two words indicates how likely they are to be morphologically and semantically related. We use three corpora (literary, medical and general-language) and evaluate the results obtained.
Depending on the corpus, precision varies between 67% and 86%. A comparison of the results from the different corpora shows that there is little redundancy between them. The currently available resource contains 3,315 fully validated pairs of words.
Citation
Hamon T. Unsupervised acquisition of morphological resources for Ukrainian / Thierry Hamon, Natalia Grabar // Computational linguistics and intelligent systems (COLINS 2017) : proceedings of the 1st International conference, Kharkiv, Ukraine, 21 April 2017 / National Technical University «KhPI», Lviv Polytechnic National University. – Kharkiv, 2017. – P. 20–30. – Bibliography: 36 titles.