Feature screening algorithm for high dimensional data

dc.citation.epage: 711
dc.citation.issue: 3
dc.citation.journalTitle: Математичне моделювання та комп'ютинг
dc.citation.spage: 703
dc.contributor.affiliation: Університет Хасана ІІ Касабланки
dc.contributor.affiliation: Hassan II University of Casablanca
dc.contributor.author: Чамлал, Х.
dc.contributor.author: Бенцмане, А.
dc.contributor.author: Уадерман, Т.
dc.contributor.author: Chamlal, H.
dc.contributor.author: Benzmane, A.
dc.contributor.author: Ouaderhman, T.
dc.coverage.placename: Львів
dc.coverage.placename: Lviv
dc.date.accessioned: 2025-03-04T12:17:22Z
dc.date.created: 2023-02-28
dc.date.issued: 2023-02-28
dc.description.abstract: На даний час скринінг ознак стає важливою темою в галузі машинного навчання й аналізу багатовимірних даних. Відфільтрування нерелевантних ознак із набору змінних вважається важливим попереднім кроком, який слід виконувати перед будь-яким аналізом даних. Багато дослідників запропонували нові підходи до цієї теми після того, як Фан та Лв (J. Royal Stat. Soc. 70 (5), 849–911 (2008)) ввели властивість надійного скринінгу. Однак продуктивність цих підходів відрізняється від методу до методу. У запропонованій роботі є намагання додати до цього списку новий алгоритм, який виконує скринінг ознак на основі фільтра взаємодії Кендалла (J. Appl. Stat. 50 (7), 1496–1514 (2020)), коли змінна відповідь є неперервною. Добра поведінка нашого алгоритму доводиться за декількома сценаріями моделювання через порівняння з існуючим методом.
dc.description.abstract: Feature screening is becoming an important topic in machine learning and high-dimensional data analysis. Filtering out irrelevant features from a set of variables is considered an important preliminary step before any data analysis. Many approaches to this problem have been proposed since Fan and Lv (J. Royal Stat. Soc., Ser. B. 70 (5), 849–911 (2008)) introduced the sure screening property. However, the performance of these approaches differs from one method to another. In this work, we add to this list a new algorithm that performs feature screening, inspired by the Kendall interaction filter (J. Appl. Stat. 50 (7), 1496–1514 (2020)), for the case where the response variable is continuous. The good behavior of our algorithm is demonstrated through a comparison with an existing method under several simulation scenarios.
dc.format.extent: 703-711
dc.format.pages: 9
dc.identifier.citation: Chamlal H. Feature screening algorithm for high dimensional data / H. Chamlal, A. Benzmane, T. Ouaderhman // Mathematical Modeling and Computing. — Lviv : Lviv Politechnic Publishing House, 2023. — Vol 10. — No 3. — P. 703–711.
dc.identifier.doi: doi.org/10.23939/mmc2023.03.703
dc.identifier.uri: https://ena.lpnu.ua/handle/ntb/63507
dc.language.iso: en
dc.publisher: Видавництво Львівської політехніки
dc.publisher: Lviv Politechnic Publishing House
dc.relation.ispartof: Математичне моделювання та комп'ютинг, 3 (10), 2023
dc.relation.ispartof: Mathematical Modeling and Computing, 3 (10), 2023
dc.relation.references: [1] Mai Q., Zou H. The fused Kolmogorov filter: A nonparametric model-free screening method. The Annals of Statistics. 43 (4), 1471–1497 (2015).
dc.relation.references: [2] Fan J., Song R. Sure Independence Screening in Generalized Linear Models With NP-Dimensionality. The Annals of Statistics. 38 (6), 3567–3604 (2010).
dc.relation.references: [3] Huang D., Li R., Wang H. Feature Screening for Ultrahigh Dimensional Categorical Data with Applications. Journal of Business & Economic Statistics. 32 (2), 237–244 (2014).
dc.relation.references: [4] Fan Y., Kong Y., Li D., Lv J. Interaction pursuit with feature screening and selection. Preprint arXiv:1605.08933 (2016).
dc.relation.references: [5] Fan J., Lv J. Sure independence screening for ultrahigh dimensional feature space. Journal of the Royal Statistical Society, Series B: Statistical Methodology. 70 (5), 849–911 (2008).
dc.relation.references: [6] Anzarmou Y., Mkhadri A., Oualkacha K. The Kendall interaction filter for variable interaction screening in ultra high dimensional classification problems. Journal of Applied Statistics. 50 (7), 1496–1514 (2020).
dc.relation.references: [7] Reese R., Dai X., Fu G. Strong Sure Screening of Ultra-high Dimensional Data with Interaction Effects. Preprint arXiv:1801.07785 (2018).
dc.relation.references: [8] Hao N., Zhang H. H. Interaction Screening for Ultrahigh-Dimensional Data. Journal of the American Statistical Association. 109 (507), 1285–1301 (2014).
dc.relation.references: [9] Niu Y. S., Hao N., Zhang H. H. Interaction screening by partial correlation. Statistics and Its Interface. 11 (2), 317–325 (2018).
dc.relation.references: [10] Moore J. H. The ubiquitous nature of epistasis in determining susceptibility to common human diseases. Human Heredity. 56 (1–3), 73–82 (2003).
dc.relation.references: [11] Cordell H. J. Detecting gene-gene interactions that underlie human diseases. Nature Reviews Genetics. 10 (6), 392–404 (2009).
dc.relation.references: [12] Cook R. D., Zhang X. Fused estimators of the central subspace in sufficient dimension reduction. Journal of the American Statistical Association. 109 (506), 815–827 (2014).
dc.rights.holder: © Національний університет “Львівська політехніка”, 2023
dc.subject: скринінг ознак
dc.subject: дискретизація
dc.subject: багатовимірні дані
dc.subject: регресія
dc.subject: feature screening
dc.subject: discretization
dc.subject: high dimensional data
dc.subject: regression
dc.title: Feature screening algorithm for high dimensional data
dc.title.alternative: Алгоритм скринінгу ознак для багатовимірних даних
dc.type: Article

Files

Original bundle

Name: 2023v10n3_Chamlal_H-Feature_screening_algorithm_703-711.pdf
Size: 1.34 MB
Format: Adobe Portable Document Format

Name: 2023v10n3_Chamlal_H-Feature_screening_algorithm_703-711__COVER.png
Size: 431.65 KB
Format: Portable Network Graphics

License bundle

Name: license.txt
Size: 1.83 KB
Format: Plain Text