A(n) Assumption in machine learning

dc.citation.epage38
dc.citation.journalTitleComputational Linguistics and Intelligent Systems
dc.citation.spage32
dc.citation.volume2 : Proceedings of the 3rd International conference, COLINS 2019. Workshop, Kharkiv, Ukraine, April 18-19, 2019
dc.contributor.affiliationTaras Shevchenko National University of Kyiv
dc.contributor.authorKlyushin, Dmitry
dc.contributor.authorLyashko, Sergey
dc.contributor.authorZub, Stanislav
dc.coverage.placenameLviv
dc.date.accessioned2019-10-31T13:21:05Z
dc.date.available2019-10-31T13:21:05Z
dc.date.created2019-04-18
dc.date.issued2019-04-18
dc.description.abstractTwo-sample tests for verifying hypotheses of homogeneity are commonly used statistical tools in machine learning, for example for estimating corpus homogeneity or testing text authorship. Often they are effective only for sufficiently large samples (n > 100) and have limited application when the samples are small (n < 30). To address the small-sample problem, resampling methods such as the jackknife and the bootstrap are often used. We propose and investigate a family of homogeneity measures based on the A(n) assumption that are effective for both small and large samples.
dc.format.extent32-38
dc.format.pages7
dc.identifier.citationKlyushin D. A(n) Assumption in machine learning / Dmitry Klyushin, Sergey Lyashko, Stanislav Zub // Computational Linguistics and Intelligent Systems. – Lviv : Lviv Politechnic Publishing House, 2019. – Vol 2 : Proceedings of the 3rd International conference, COLINS 2019. Workshop, Kharkiv, Ukraine, April 18-19, 2019. – P. 32–38. – (Paper presentations).
dc.identifier.citationenKlyushin D. A(n) Assumption in machine learning / Dmitry Klyushin, Sergey Lyashko, Stanislav Zub // Computational Linguistics and Intelligent Systems. – Lviv Politechnic Publishing House, 2019. – Vol 2 : Proceedings of the 3rd International conference, COLINS 2019. Workshop, Kharkiv, Ukraine, April 18-19, 2019. – P. 32–38. – (Paper presentations).
dc.identifier.issn2523-4013
dc.identifier.urihttps://ena.lpnu.ua/handle/ntb/45493
dc.language.isoen
dc.publisherLviv Politechnic Publishing House
dc.relation.ispartofComputational Linguistics and Intelligent Systems (2), 2019
dc.relation.referencesen1. Granichin, O., Kizhaeva, N., Shalymov, D., Volkovich, Z.: Writing style determination using the KNNtext model. In: Proceedings of the 2015 IEEE International Symposium on Intelligent Control, pp. 900–905. IEEE, Sydney (2015).
dc.relation.referencesen2. Zenkov, A., Sazanova, L.: A New Stylometry Method Basing on the Numerals Statistic. International Journal of Data Science and Technology 3(2), 16–23 (2017).
dc.relation.referencesen3. Kilgarriff, A.: Comparing corpora. International Journal of Corpus Linguistics 6(1), 97–133 (2001).
dc.relation.referencesen4. Kilgarriff, A.: Language is never, ever, ever, random. Corpus Linguistics and Linguistic Theory 1(2), 263–276 (2005).
dc.relation.referencesen5. Eder, M., Piasecki, M., Walkowiak, T.: An open stylometric system based on multilevel text analysis. Cognitive Studies | Études cognitives, 17 (2017).
dc.relation.referencesen6. Eder, M., Rybicki, J., Kestemont, M.: Stylometry with R: a package for computational text analysis. R Journal 8(1), 107–121 (2016).
dc.relation.referencesen7. Hill, B.: Posterior distribution of percentiles: Bayes’ theorem for sampling from a population. Journal of the American Statistical Association 63(322), 677–691 (1968).
dc.relation.referencesen8. Klyushin, D., Petunin, Yu.: A Nonparametric Test for the Equivalence of Populations Based on a Measure of Proximity of Samples. Ukrainian Mathematical Journal 55(2), 181–198 (2003).
dc.relation.referencesen9. Pires, A.: Confidence intervals for a binomial proportion: comparison of methods and software evaluation. In: Klinke, S., Ahrend, P., Richter, L. (eds.) Proceedings of the Conference CompStat 2002, Short Communications and Posters (2002).
dc.rights.holder© 2019 for the individual papers by the papers’ authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by its editors.
dc.subjectmachine learning
dc.subjectsample homogeneity
dc.subjectconfidence interval
dc.subjectorder statistics
dc.subjectvariational series
dc.titleA(n) Assumption in machine learning
dc.typeArticle
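
The abstract names a family of homogeneity measures built on the A(n) assumption but does not spell one out. As a minimal sketch only: Hill's A(n) assumption (reference 7) says that, for a sample of size n from a continuous population, a further draw from the same population falls between the order statistics x_(i) and x_(j) with probability (j - i)/(n + 1). One way to turn this into a two-sample measure, in the spirit of the proximity measure of reference 8, is to check how often a confidence interval around the observed frequency of the second sample covers that probability. The function name `proximity`, the Wilson interval, and the multiplier `g = 3` are illustrative assumptions, not details taken from this record, and the authors' measures may differ.

```python
from math import sqrt


def proximity(x, y, g=3.0):
    """Share of order-statistic pairs whose A(n) probability is covered
    by a Wilson interval built from the second sample (illustrative)."""
    xs = sorted(x)
    n, m = len(xs), len(y)
    if n < 2 or m < 1:
        raise ValueError("need at least two x-values and one y-value")
    covered = 0
    total = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            # Under A(n), a new draw from the same population falls in
            # (x_(i), x_(j)) with probability (j - i) / (n + 1).
            p = (j - i) / (n + 1)
            # Observed frequency of y-values inside that interval.
            h = sum(xs[i] < v < xs[j] for v in y) / m
            # Wilson score interval for the frequency, with multiplier g.
            centre = h * m + g * g / 2
            half = g * sqrt(h * (1 - h) * m + g * g / 4)
            lo = (centre - half) / (m + g * g)
            hi = (centre + half) / (m + g * g)
            covered += lo < p < hi
            total += 1
    return covered / total


# Hypothetical toy data: one interleaved sample, one far-shifted sample.
x = list(range(1, 21))
y_close = [v + 0.5 for v in x]
y_far = [v + 100 for v in x]
print(proximity(x, y_close), proximity(x, y_far))
```

A value near 1 suggests the two samples are compatible with a common population, and lower values suggest heterogeneity. With small samples and g = 3 the intervals are wide, so even a clearly shifted sample keeps a sizeable share of covered pairs. A decision threshold would have to be calibrated separately.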

Files

Original bundle
Name:
2019v2___Proceedings_of_the_3nd_International_conference_COLINS_2019_Workshop_Kharkiv_Ukraine_April_18-19_2019_Klyushin_D-A_n_Assumption_in_machine_32-38.pdf
Size:
1.22 MB
Format:
Adobe Portable Document Format
Name:
2019v2___Proceedings_of_the_3nd_International_conference_COLINS_2019_Workshop_Kharkiv_Ukraine_April_18-19_2019_Klyushin_D-A_n_Assumption_in_machine_32-38__COVER.png
Size:
270.68 KB
Format:
Portable Network Graphics
License bundle
Name:
license.txt
Size:
2.97 KB
Format:
Plain Text