Comprehensive analysis of few-shot image classification method using triplet loss

Баранов, Микола; Щербина, Юрій; Baranov, Mykola; Shcherbyna, Yurii

dc.citation.epage	109
dc.citation.issue	11
dc.citation.journalTitle	Вісник Національного університету "Львівська політехніка". Інформаційні системи та мережі
dc.citation.spage	103
dc.contributor.affiliation	Львівський національний університет ім. Івана Франка
dc.contributor.affiliation	Ivan Franko National University of Lviv
dc.contributor.author	Баранов, Микола
dc.contributor.author	Щербина, Юрій
dc.contributor.author	Baranov, Mykola
dc.contributor.author	Shcherbyna, Yurii
dc.coverage.placename	Львів
dc.coverage.placename	Lviv
dc.date.accessioned	2023-08-17T06:36:11Z
dc.date.available	2023-08-17T06:36:11Z
dc.date.created	2022-03-01
dc.date.issued	2022-03-01
dc.description.abstract	Задача класифікації зображень є дуже важливою сучасною проблемою в області комп’ютерного зору. Перші підходи до розв’язання цієї задачі полягали у використанні класичних алгоритмів. Незважаючи на певний прогрес, отриманий класичними підходами, більшість складніших задач класифікації зображень залишались нерозв’язаними до початку використання алгоритмів машинного навчання. Перші спроби застосування машинного навчання до задачі розпізнавання зображень допомогли класифікувати набори ознак, які опрацювати прямими алгоритмами не вдавалось. Проте видобування множини ознак залишалося за прямими алгоритмами тривалий час. Нещодавний прогрес у сфері глибокого навчання відкрив можливість побудови систем автоматичного видобування множини ознак. Це зумовило значний прогрес у області комп’ютерного бачення і не тільки. Обробка великомасштабних наборів даних призвела до прориву у задачах розпізнавання зображень. Проте з’явилося нове обмеження– залежність від кількості наявних проанотованих даних. Методи глибинного навчання для задачі класифікації зображення зазвичай потребують великої кількості проанотованих зображень. І більше, сучасні моделі схильні до неочікуваної поведінки на наборах даних з іншого домена (нових класів у випадку розпізнавання зображень). Методи навчання на малому наборі даних дозволяють під час тренування глибоких нейронних мереж використовувати значно менше даних, зберігаючи таку саму точність розпізнавання. Незважаючи на це, залишається компроміс між кількістю наявних даних та точністю моделі. В цій роботі ми побудували сіамську нейронну мережу на основі функції втрат трійки і дослідили, як наявна кількість даних впливає на точність розпізнавання сіамської нейронної мережі. Ми порівняли моделі, отримані навчанням на основі метрик, та базову модель, натреновану на великомасштабних наборах даних.
dc.description.abstract	Image classification is a central problem in computer vision. The first approaches to it relied on classical, hand-designed algorithms. Despite some successful applications of such algorithms, many image classification tasks remained unsolved until machine learning was applied to computer vision. Early machine learning models made it possible to classify sets of extracted features that direct algorithms could not handle. However, the features themselves still had to be handcrafted, which left the most complicated classification tasks out of reach. Recent progress in deep learning allows researchers to build automatic, trainable feature extractors. This brought significant progress in computer vision and beyond. Processing large-scale datasets advanced automatic feature extraction, and combining such features with previous approaches led to breakthroughs in computer vision. But a new limitation appeared: dependency on large amounts of annotated data. Deep learning approaches to image classification usually require large-scale annotated datasets. Moreover, modern models tend to behave unexpectedly on out-of-distribution data (new classes, in the case of image recognition). Few-shot learning allows deep neural networks to be trained with far less data while keeping the same recognition accuracy. Even so, there remains a tradeoff between the amount of available data and the performance of the trained model. In this paper, we implement a siamese network based on triplet loss and investigate how the amount of available data affects the performance of the few-shot model. We compare the models obtained by metric learning with baseline models trained on large-scale datasets.
dc.format.extent	103-109
dc.format.pages	7
dc.identifier.citation	Baranov M. Comprehensive analysis of few-shot image classification method using triplet loss / Mykola Baranov, Yurii Shcherbyna // Вісник Національного університету "Львівська політехніка". Інформаційні системи та мережі. — Lviv : Lviv Politechnic Publishing House, 2022. — No 11. — P. 103–109.
dc.identifier.citationen	Baranov M. Comprehensive analysis of few-shot image classification method using triplet loss / Mykola Baranov, Yurii Shcherbyna // Visnyk Natsionalnoho universytetu "Lvivska politekhnika". Informatsiini systemy ta merezhi. — Lviv : Lviv Politechnic Publishing House, 2022. — No 11. — P. 103–109.
dc.identifier.doi	doi.org/10.23939/sisn2022.11.103
dc.identifier.uri	https://ena.lpnu.ua/handle/ntb/59492
dc.language.iso	en
dc.publisher	Видавництво Львівської політехніки
dc.publisher	Lviv Politechnic Publishing House
dc.relation.ispartof	Вісник Національного університету "Львівська політехніка". Інформаційні системи та мережі, 11, 2022
dc.relation.references	1. Canny, J. (1986). A computational approach to edge detection. IEEE Transactions on pattern analysis and machine intelligence, (6), 679–698. https://doi.org/10.1109/TPAMI.1986.4767851.
dc.relation.references	2. Said, K. A. M., Jambek, A. B., & Sulaiman, N. (2016). A study of image processing using morphological opening and closing processes. International Journal of Control Theory and Applications, 9(31), 15–21. https://doi.org/10.1109/ICED.2016.7804697.
dc.relation.references	3. Ye, H. J., Ming, L., Zhan, D. C., & Chao, W. L. (2021). Few-shot learning with a strong teacher. arXiv preprint arXiv:2107.00197. https://doi.org/10.1109/TPAMI.2022.3160362.
dc.relation.references	4. Hoffer, E., & Ailon, N. (2015, October). Deep metric learning using triplet network. In International workshop on similarity-based pattern recognition, 84–92. Springer, Cham. https://doi.org/10.1007/978-3-319-24261-3_7.
dc.relation.references	5. Li, X., Wei, T., Chen, Y. P., Tai, Y. W., & Tang, C. K. (2020). Fss-1000: A 1000-class dataset for few-shot segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2869–2878. https://doi.org/10.1109/CVPR42600.2020.00294.
dc.relation.references	6. Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009, June). Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, 248–255. Ieee. https://doi.org/10.1109/CVPR.2009.5206848.
dc.relation.references	7. Xuan, H., Stylianou, A., & Pless, R. (2020). Improved embeddings with easy positive triplet mining. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2474–2482. https://doi.org/10.1109/WACV45572.2020.9093432.
dc.relation.uri	https://doi.org/10.1109/TPAMI.1986.4767851
dc.relation.uri	https://doi.org/10.1109/ICED.2016.7804697
dc.relation.uri	https://doi.org/10.1109/TPAMI.2022.3160362
dc.relation.uri	https://doi.org/10.1007/978-3-319-24261-3_7
dc.relation.uri	https://doi.org/10.1109/CVPR42600.2020.00294
dc.relation.uri	https://doi.org/10.1109/CVPR.2009.5206848
dc.relation.uri	https://doi.org/10.1109/WACV45572.2020.9093432
dc.rights.holder	© Національний університет “Львівська політехніка”, 2022
dc.rights.holder	© Baranov M., Shcherbyna Y., 2022
dc.subject	методи навчання на малому наборі даних
dc.subject	навчання на одному прикладі
dc.subject	навчання на основі метрик
dc.subject	комп’ютерний зір
dc.subject	класифікація зображень
dc.subject	few-shot learning
dc.subject	zero-shot learning
dc.subject	metric learning
dc.subject	computer vision
dc.subject	deep learning
dc.subject	image classification
dc.subject.udc	004.89
dc.title	Comprehensive analysis of few-shot image classification method using triplet loss
dc.title.alternative	Комплексний аналіз техніки навчання на малому наборі даних для задачі класифікації методом оптимізації трійок
dc.type	Article
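
The abstract describes a siamese network trained with triplet loss and evaluated on few-shot classification. This record does not give the paper's exact formulation, so the following is only a minimal sketch of the commonly used margin-based triplet loss, together with nearest-neighbour classification over a small support set of embeddings. The function names, the Euclidean distance, the margin value, and the toy embeddings are all assumptions made for illustration; in practice the embeddings would come from the trained network.

```python
import math
from typing import Dict, List, Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor: Sequence[float],
                 positive: Sequence[float],
                 negative: Sequence[float],
                 margin: float = 1.0) -> float:
    """Margin-based triplet loss (assumed form): max(d(a, p) - d(a, n) + margin, 0).

    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin`.
    """
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


def classify(query: Sequence[float],
             support: Dict[str, List[Sequence[float]]]) -> str:
    """Few-shot prediction: return the label of the closest support embedding."""
    return min(support,
               key=lambda label: min(euclidean(query, e) for e in support[label]))


# Hypothetical 2-D embeddings, used only to exercise the functions.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])  # negative already far away
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])  # negative closer than positive
label = classify([0.1, 0.0], {"cat": [[0.0, 0.0]], "dog": [[1.0, 1.0]]})
```

With only a few labelled embeddings per class in `support`, a new class can be added without retraining, which is what makes metric learning attractive in the few-shot setting the abstract describes.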

Files

Original bundle

Name: 2022n11_Baranov_M-Comprehensive_analysis_103-109.pdf
Size: 327.22 KB
Format: Adobe Portable Document Format

Name: 2022n11_Baranov_M-Comprehensive_analysis_103-109__COVER.png
Size: 437.91 KB
Format: Portable Network Graphics

Collections

Вісник Національного університету "Львівська політехніка". Інформаційні системи та мережі. – 2022. – Випуск 11