WikiWars-UA: Ukrainian corpus annotated with temporal expressions
Date
2019-04-18
Authors
Natalia Grabar, Thierry Hamon
Publisher
Lviv Politechnic Publishing House
Abstract
The reliability of tools and the reproducibility of study results are important features of modern Natural Language Processing (NLP) tools and methods. Scientific research is increasingly criticized for the lack of reproducibility of its results. A first step towards reproducibility is the availability of freely usable tools and corpora. In our work, we are interested in the automatic processing of unstructured documents for the extraction of temporal information. Our main objective is to create a reference corpus in Ukrainian annotated with temporal information related to dates (absolute and relative), periods, times, etc., together with their normalization. The approach relies on the adaptation of an existing application, the automatic pre-annotation of the WikiWars corpus in Ukrainian, and its manual correction. The reference corpus makes it possible to reliably evaluate the current version of the automatic temporal annotator and to prepare future work on these topics.
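To illustrate what annotating and normalizing an absolute date can look like, here is a minimal sketch. It is not the authors' method and not HeidelTime's rule set: the function name `annotate_absolute_dates`, the regular expression, and the month table are assumptions made for this example. It only handles the pattern "day, month name in the genitive, four-digit year", does not check that the day is valid, and omits TIMEX3 attributes such as `tid`.

```python
import re

# Ukrainian month names in the genitive case, as used in full dates
# such as "24 серпня 1991". (Illustrative table, not taken from HeidelTime.)
MONTHS = {
    "січня": 1, "лютого": 2, "березня": 3, "квітня": 4,
    "травня": 5, "червня": 6, "липня": 7, "серпня": 8,
    "вересня": 9, "жовтня": 10, "листопада": 11, "грудня": 12,
}

# Hypothetical pattern: day, month name, four-digit year.
DATE_RE = re.compile(
    r"\b(\d{1,2})\s+(" + "|".join(MONTHS) + r")\s+(\d{4})\b"
)


def annotate_absolute_dates(text):
    """Wrap absolute dates in TIMEX3-style tags with an ISO 8601 value."""

    def to_timex(match):
        day = int(match.group(1))
        month = MONTHS[match.group(2)]
        year = int(match.group(3))
        # Normalization step: surface form -> YYYY-MM-DD.
        value = f"{year:04d}-{month:02d}-{day:02d}"
        return f'<TIMEX3 type="DATE" value="{value}">{match.group(0)}</TIMEX3>'

    return DATE_RE.sub(to_timex, text)


if __name__ == "__main__":
    print(annotate_absolute_dates("Незалежність проголошено 24 серпня 1991 року."))
```

Relative dates and periods, which the corpus also covers, need a reference date and further rules, so they are outside the scope of this sketch.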
Keywords
Temporality, Information Extraction, Ukrainian, WikiWars, HeidelTime, Reference Corpus
Citation
Grabar N. WikiWars-UA: Ukrainian corpus annotated with temporal expressions / Natalia Grabar, Thierry Hamon // Computational Linguistics and Intelligent Systems. — Lviv : Lviv Politechnic Publishing House, 2019. — Vol. 2 : Proceedings of the 3rd International Conference, COLINS 2019. Workshop, Kharkiv, Ukraine, April 18-19, 2019. — P. 22–31. — (Paper presentations).