ML models and optimization strategies for enhancing the performance of classification on mobile devices

Чорненький, В. Я.; Казимира, І. Я.; Chornenkyi, V. Y.; Kazymyra, I. Y.

dc.citation.epage	82
dc.citation.issue	2
dc.citation.journalTitle	Ukrainian Journal of Information Technology
dc.citation.spage	74
dc.citation.volume	6
dc.contributor.affiliation	Національний університет “Львівська політехніка”
dc.contributor.affiliation	Lviv Polytechnic National University
dc.contributor.author	Чорненький, В. Я.
dc.contributor.author	Казимира, І. Я.
dc.contributor.author	Chornenkyi, V. Y.
dc.contributor.author	Kazymyra, I. Y.
dc.coverage.placename	Львів
dc.coverage.placename	Lviv
dc.date.accessioned	2025-11-19T08:26:02Z
dc.date.created	2024-02-27
dc.date.issued	2024-02-27
dc.description.abstract	The paper examines approaches to improving machine learning models used for classification in mobile applications, and the effect of optimization techniques on the efficiency of real-time classification on mobile devices. The study focuses on comparing MobileNetV2, a convolutional neural network designed for mobile applications, with Vision Transformers (ViT), which have proven successful in image recognition tasks. Instead of standard convolutions, MobileNetV2 uses depthwise separable convolutions, which substantially reduce the number of computations and model parameters, and residual connections, which preserve information from earlier layers and improve training. ViT uses a self-attention mechanism to capture global dependencies between parts of an image, which lets it take both local and global features into account without convolutional layers. In ViT, an image is split into fixed-size patches, and each patch is processed like a "word" in the text handled by ordinary transformers, which simplifies work with large images. The paper evaluates the performance of both models before and after applying optimization techniques such as quantization, a process that lowers the precision of the model coefficients from 32 bits to 8 bits, which substantially reduces model size and increases classification speed. Quantization was found to be one of the most effective optimization strategies for mobile environments, since it reduces model size by up to 74 % and increases classification speed by up to 44 % in ViT. In addition, the paper examines the role of graph optimization techniques, such as operator fusion, pruning, and reordering of operations, in reducing computational complexity and improving performance on resource-constrained devices. 
These techniques optimize the execution of operations within the computational graph, minimizing memory use and increasing parallelism, which is critical for real-time applications on mobile devices. Experiments were carried out on several datasets, including MNIST and ASL Alphabet, and the results show a substantial performance gain achieved through optimization. The study shows that post-training quantization and graph optimization can reduce model size, classification time, and CPU usage, making machine learning models more suitable for mobile applications. Experiments conducted on a Xiaomi Redmi Note 8 Pro running Android, with TensorFlow Lite used for integration, demonstrated the practical benefits of such optimization in real mobile environments. Optimization techniques such as quantization and graph optimization were found to be important for deploying machine learning models on mobile devices, where resource constraints and real-time performance are decisive. These techniques make it possible to substantially reduce model size and classification time without sacrificing accuracy, which enables practical use of deep learning models in mobile applications.
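The abstract notes that depthwise separable convolutions cut the number of model parameters relative to standard convolutions. The sketch below works through that arithmetic for a single layer. The layer shape (3 x 3 kernel, 32 input channels, 64 output channels) and the function names are illustrative assumptions, not values or code taken from the paper.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k standard convolution (bias terms omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer shape chosen only for illustration.
    k, c_in, c_out = 3, 32, 64
    std = standard_conv_params(k, c_in, c_out)
    sep = depthwise_separable_params(k, c_in, c_out)
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
```

For this assumed layer the separable form needs 2,336 weights instead of 18,432, roughly 7.9 times fewer. The ratio depends on the kernel size and channel counts of each layer.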
dc.description.abstract	The paper addresses the growing role of machine learning (ML) in mobile applications as mobile devices become ubiquitous thanks to their accessibility and functionality. ML models, including Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), are explored for real-time classification on mobile devices. The paper identifies key challenges in deploying these models: limited computational resources, battery consumption, and the need for real-time performance. Central to the research is a comparison of MobileNetV2, a lightweight CNN designed for mobile applications, with ViTs, which have shown success in image recognition tasks. MobileNetV2, with its depthwise separable convolutions and residual connections, is optimized for resource efficiency, while ViTs apply self-attention mechanisms to achieve competitive performance in image classification. The study evaluates both models before and after applying optimization techniques such as quantization and graph optimization. Quantization proved to be one of the most effective optimization strategies for mobile environments, reducing model size by up to 74 % and improving inference speed by up to 44 % in ViTs. Graph optimization techniques, such as operator fusion, pruning, and node reordering, are also examined for their role in reducing computational complexity and improving performance on resource-constrained devices. Experimental results on several datasets, including MNIST and the ASL Alphabet dataset, demonstrate significant performance improvements achieved through optimization. The study shows that post-training quantization and graph optimization can reduce model size, inference time, and CPU usage, making ML models more suitable for mobile applications. 
The experiments were conducted on a Xiaomi Redmi Note 8 Pro device, demonstrating the practical benefits of these optimizations in real-world mobile deployments. The research concludes that optimization techniques such as quantization and graph optimization are essential for deploying ML models on mobile devices, where resource constraints and real-time performance are critical. It also offers insights into how ML architectures can be optimized for mobile environments, contributing to more efficient AI-driven mobile applications.
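The record describes post-training quantization as lowering the precision of model coefficients from 32 bits to 8 bits. The sketch below shows one common way to do this, affine (scale and zero-point) quantization of a weight list to int8. It is a plain-Python illustration under stated assumptions, not the authors' pipeline and not the TensorFlow Lite implementation; all function names are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float, int]:
    """Affine quantization of float weights to int8: returns (q, scale, zero_point)."""
    lo = min(weights + [0.0])            # the represented range must contain 0.0
    hi = max(weights + [0.0])
    scale = (hi - lo) / 255.0 or 1.0     # fall back to 1.0 if all weights are zero
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q: List[int], scale: float, zero_point: int) -> List[float]:
    """Map int8 codes back to approximate float values."""
    return [(v - zero_point) * scale for v in q]


def storage_saving(bits_before: int = 32, bits_after: int = 8) -> float:
    """Fraction of weight storage saved when the bit width is reduced."""
    return 1.0 - bits_after / bits_before


if __name__ == "__main__":
    w = [-0.5, 0.0, 0.25, 1.0]           # made-up weights, for illustration only
    q, scale, zp = quantize_int8(w)
    print(q, dequantize(q, scale, zp), storage_saving())
```

Storing each weight in 8 bits instead of 32 saves 75 % of the weight storage, which is consistent in magnitude with the size reduction of up to 74 % reported in the abstract. The small remaining gap would plausibly come from parts of the model file that are not quantized weights, though the record does not say.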
dc.format.extent	74-82
dc.format.pages	9
dc.identifier.citation	Chornenkyi V. Y. ML models and optimization strategies for enhancing the performance of classification on mobile devices / V. Y. Chornenkyi, I. Y. Kazymyra // Ukrainian Journal of Information Technology. – Lviv : Lviv Polytechnic Publishing House, 2024. – Vol 6. – No 2. – P. 74–82.
dc.identifier.citationen	Chornenkyi V. Y. ML models and optimization strategies for enhancing the performance of classification on mobile devices / V. Y. Chornenkyi, I. Y. Kazymyra // Ukrainian Journal of Information Technology. – Lviv : Lviv Polytechnic Publishing House, 2024. – Vol 6. – No 2. – P. 74–82.
dc.identifier.doi	https://doi.org/10.23939/ujit2024.02.074
dc.identifier.uri	https://ena.lpnu.ua/handle/ntb/120435
dc.language.iso	en
dc.publisher	Видавництво Львівської політехніки
dc.publisher	Lviv Polytechnic Publishing House
dc.relation.ispartof	Український журнал інформаційних технологій, 2 (6), 2024
dc.relation.ispartof	Ukrainian Journal of Information Technology, 2 (6), 2024
dc.rights.holder	© Lviv Polytechnic National University, 2024
dc.subject	глибинне навчання
dc.subject	згорткова нейронна мережа
dc.subject	візуальні трансформери
dc.subject	мобільні застосунки
dc.subject	deep learning
dc.subject	convolutional neural networks
dc.subject	vision transformers
dc.subject	mobile applications
dc.title	ML models and optimization strategies for enhancing the performance of classification on mobile devices
dc.title.alternative	Моделі машинного навчання та оптимізаційні стратегії для підвищення ефективності класифікації на мобільних пристроях
dc.type	Article

Collections

Ukrainian Journal of Information Technology. – 2024. – Vol. 6, No. 2