Computational linguistics and intelligent systems. – 2019

Permanent URI for this collection: https://ena.lpnu.ua/handle/ntb/45481

Periodical publication based on conference proceedings

This volume contains the proceedings of the Workshop Conference, with a Posters and Demonstrations track, of the 3rd International Conference on Computational Linguistics and Intelligent Systems, held in Kharkiv, Ukraine, in April 2019. It comprises 13 contributed papers that were carefully peer-reviewed and selected from 27 submissions. The volume opens with the abstracts of the keynote talks. The rest of the collection is organized in two parts. Part II contains the contributions to the Main COLINS Conference tracks, structured in two topical sections: (I) Computational Linguistics; (II) Intelligent Systems.

Computational Linguistics and Intelligent Systems. – Lviv : Lviv Politechnic Publishing House, 2019. – Volume 2 : Proceedings of the 3rd International conference, COLINS 2019. Workshop, Kharkiv, Ukraine, April 18–19, 2019. – 78 p.

Computational Linguistics and Intelligent Systems

Content (Vol. 2 : Proceedings of the 3rd International conference, COLINS 2019. Workshop, Kharkiv, Ukraine, April 18-19, 2019)

PAPER PRESENTATIONS
STUDENT SECTION

  • Automated building and analysis of Ukrainian Twitter corpus for toxic text detection
    (Lviv Politechnic Publishing House, 2019-04-18) Bobrovnyk, Kateryna; Taras Shevchenko National University of Kyiv
    Toxic text detection is an emerging area of study in Internet linguistics and corpus linguistics. The topic is relevant because publicly available Ukrainian social media text corpora are lacking. The research involves building a Ukrainian Twitter corpus by means of scraping; collective annotation of texts as 'toxic/non-toxic'; construction of an obscene-word dictionary for future feature engineering; and training models for text classification (comparing Logistic Regression, Support Vector Machine, and Deep Neural Network).
  • Semantic similarity identification for short text fragments
    (Lviv Politechnic Publishing House, 2019-04-18) Chuiko, Viktoriia; Khairova, Nina; National Technical University «Kharkiv Polytechnic Institute»
    The paper reviews existing methods for semantic similarity identification, such as methods based on the distance between concepts and methods based on lexical intersection. We propose a method for measuring the semantic similarity of short text fragments, i.e. two sentences. We also created a corpus of mass-media texts. It contains Kharkiv news articles sorted by source and date. We then annotated the texts, defining the semantic similarity of sentences manually. In this way, we created a training corpus for our future system.
  • Intelligence knowledge-based system based on multilingual dictionaries
    (Lviv Politechnic Publishing House, 2019-04-18) Puzik, Oleksii; Kharkiv National University of Radio Electronics
    Intelligence knowledge-based systems are an important part of natural language processing research. Appropriate formal models simplify the development of such systems and open new ways to improve their quality. This work is devoted to developing an intelligence knowledge-based system using a model based on the algebra of finite predicates. The model is also based on a lexicographical computer system that consists of trilingual and explanatory dictionaries. The algebra of finite predicates is used as the formalization tool. The problem of distinguishing semantic entities is investigated. A method for resolving homonymy ambiguities is used to extract separate entities, which allows semantic relationships to be formalized. As a result, a formal model of an intelligence knowledge-based system was developed, and a way to extend the model to other languages was shown.
  • Study of software systems usability used for customers loyalty identification
    (Lviv Politechnic Publishing House, 2019-04-18) Bilova, Mariia; Trehubenko, Oleksandr; National Technical University «Kharkiv Polytechnic Institute»
    As software (SW) grows in quantity and complexity and versions change, a friendly interface enhances SW competitiveness, reduces development costs, increases the number and satisfaction of users, and reduces the cost of user training and support. A software product is deemed user-friendly if users can achieve their goals and solve various problems with it efficiently. The purpose of the article is to study existing methods for assessing application usability and to analyze how the main software usability indicators are applied, using the customer loyalty software of the 'Infotech' consumer society as an example.
  • WikiWars-UA: Ukrainian corpus annotated with temporal expressions
    (Lviv Politechnic Publishing House, 2019-04-18) Grabar, Natalia; Hamon, Thierry; CNRS, Univ. Lille, UMR 8163 - STL - Savoirs Textes Langage, F-59000 Lille, France; LIMSI, CNRS, Université Paris-Saclay, F-91405 Orsay, France; Université Paris 13, Sorbonne Paris Cité, F-93430 Villetaneuse, France
    Reliability of tools and reproducibility of study results are important features of modern Natural Language Processing (NLP) tools and methods. Scientific research is increasingly criticized for the lack of reproducibility of results. A first step towards reproducibility is the availability of freely usable tools and corpora. In our work, we are interested in the automatic processing of unstructured documents for the extraction of temporal information. Our main objective is to create a reference corpus in Ukrainian annotated with temporal information related to dates (absolute and relative), periods, times, etc., and with their normalization. The approach relies on the adaptation of an existing application, automatic pre-annotation of the WikiWars corpus in Ukrainian, and its manual correction. The reference corpus makes it possible to reliably evaluate the current version of the automatic temporal annotator and to prepare future work on these topics.
  • A(n) Assumption in machine learning
    (Lviv Politechnic Publishing House, 2019-04-18) Klyushin, Dmitry; Lyashko, Sergey; Zub, Stanislav; Taras Shevchenko National University of Kyiv
    Two-sample tests for verifying hypotheses of homogeneity are commonly used statistical tools in machine learning, for example, for estimating corpus homogeneity, testing text authorship, and so on. Often, they are effective only for sufficiently large samples (n > 100) and have limited application when the samples are small (n < 30). To solve the problem for small samples, resampling methods such as the jackknife and bootstrap are often used. We propose and investigate a family of homogeneity measures based on the A(n) assumption that are effective for both small and large samples.
  • Contents of "Computational Linguistics and Intelligent Systems"
    (Lviv Politechnic Publishing House, 2019-04-18)
  • Knowledge-based Big Data Cleanup method
    (Lviv Politechnic Publishing House, 2019-04-18) Berko, Andrii; Lviv Polytechnic National University
    Unlike traditional databases, Big Data is stored as NoSQL data resources. In most cases, such resources are therefore not ready for efficient use in their original form because of various kinds of data anomalies, such as duplication, ambiguity, inaccuracy, contradiction, absence, and incompleteness of data. Eliminating these anomalies requires special data source cleanup procedures. The data cleanup process requires additional information about the composition, content, meaning, and function of the Big Data resource. A special knowledge base can help resolve this problem.
  • Author index, Reviewers
    (Lviv Politechnic Publishing House, 2019-04-18)
  • Extraction of semantic relations from Wikipedia text corpus
    (Lviv Politechnic Publishing House, 2019-04-18) Shanidze, Olexandr; Petrasova, Svitlana; National Technical University "Kharkiv Polytechnic Institute"
    This paper proposes an algorithm for the automatic extraction of semantic relations using a rule-based approach. The authors suggest identifying certain verbs (predicates) between the subject and the object of an expression to obtain a sequence of semantic relations in the designed text corpus of Wikipedia articles. Synsets from WordNet are applied to extract semantic relations between concepts and their synonyms from the text corpus.
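
The rule-based idea in the last abstract above (a predicate verb found between a subject and an object yields a relation) can be illustrated with a minimal sketch. This is not the authors' algorithm: the `PREDICATES` table, the `SYNONYMS` dictionary (a hand-written stand-in for WordNet synsets), and the function names are all assumptions made for illustration only.

```python
import re

# Hypothetical predicate table: surface verb phrase -> relation label.
PREDICATES = {
    "is a": "IS_A",
    "is part of": "PART_OF",
    "was founded by": "FOUNDED_BY",
    "is located in": "LOCATED_IN",
}

# Hypothetical stand-in for WordNet synsets: synonym -> canonical concept.
SYNONYMS = {"automobile": "car", "motorcar": "car"}


def _normalize(phrase):
    """Lowercase, drop a leading article, and map synonyms to one concept."""
    phrase = re.sub(r"^(the|a|an)\s+", "", phrase.strip().lower())
    return SYNONYMS.get(phrase, phrase)


def extract_relations(text):
    """Return (subject, relation, object) triples, one per matching sentence."""
    triples = []
    # Try longer verb phrases first so a short one cannot shadow a long one.
    verbs = sorted(PREDICATES, key=len, reverse=True)
    for sentence in re.split(r"[.!?]+", text):
        sentence = sentence.strip()
        if not sentence:
            continue
        for verb in verbs:
            pattern = rf"^(.+?)\s+{re.escape(verb)}\s+(.+)$"
            match = re.match(pattern, sentence, flags=re.IGNORECASE)
            if match:
                subject, obj = match.group(1), match.group(2)
                triples.append((_normalize(subject), PREDICATES[verb], _normalize(obj)))
                break  # keep only the first (longest) predicate per sentence
    return triples


if __name__ == "__main__":
    sample = "An automobile is a vehicle. The engine is part of the motorcar."
    for triple in extract_relations(sample):
        print(triple)
```

A real system would likely replace the regular expressions with syntactic parsing and the synonym dictionary with an actual WordNet lookup; the sketch only shows how a verb pattern and a synonym mapping combine into relation triples.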