A(n) Assumption in machine learning
Date
2019-04-18
Publisher
Lviv Politechnic Publishing House
Abstract
Two-sample tests of homogeneity are commonly used statistical tools in machine learning, for example for estimating corpus homogeneity or testing text authorship. They are often effective only for sufficiently large samples (n > 100) and have limited application when samples are small (n < 30). For small samples, resampling methods such as the jackknife and the bootstrap are often used instead. We propose and investigate a family of homogeneity measures based on the A(n) assumption that are effective for both small and large samples.
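The abstract does not define the A(n) assumption. Assuming it refers to Hill's assumption, it says that for exchangeable observations a new observation falls between the i-th and j-th order statistics of a sample of size n with probability (j - i)/(n + 1). The sketch below is only an illustration under that assumption, not the authors' homogeneity measure; the function names `interval_probability` and `empirical_frequency` are hypothetical, and uniform random data is used purely for convenience.

```python
import random


def interval_probability(i, j, n):
    """Probability, under the assumed A(n) statement, that a new observation
    lies between the i-th and j-th order statistics (1-based, i < j)."""
    return (j - i) / (n + 1)


def empirical_frequency(i, j, n, trials=20000, seed=0):
    """Monte Carlo estimate of the same probability for i.i.d. uniform data."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw a sample of size n and sort it to obtain the order statistics.
        sample = sorted(rng.random() for _ in range(n))
        # Draw one further observation from the same distribution.
        new_value = rng.random()
        if sample[i - 1] < new_value < sample[j - 1]:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n, i, j = 9, 2, 5
    print("theoretical:", interval_probability(i, j, n))
    print("simulated:  ", empirical_frequency(i, j, n))
```

When both samples come from the same distribution, the simulated frequency should be close to the theoretical value even for small n, which is the property a homogeneity measure built on this assumption could exploit.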
Keywords
machine learning, sample homogeneity, confidence interval, order statistics, variational series
Citation
Klyushin D. A(n) Assumption in machine learning / Dmitry Klyushin, Sergey Lyashko, Stanislav Zub // Computational Linguistics and Intelligent Systems. — Lviv : Lviv Politechnic Publishing House, 2019. — Vol 2 : Proceedings of the 3rd International conference, COLINS 2019. Workshop, Kharkiv, Ukraine, April 18-19, 2019. — P. 32–38. — (Paper presentations).