Method of synthesis of devices for parallel stream calculation of scalar product in real time

Цмоць, Іван; Опотяк, Юрій; Штогрінець, Богдан; Tsmots, Ivan; Opotyak, Yurii; Shtohrinets, Bohdan

doi:10.23939/sisn2023.14.248

dc.citation.epage	266
dc.citation.issue	14
dc.citation.journalTitle	Вісник Національного університету “Львівська політехніка”. Серія: Інформаційні системи та мережі
dc.citation.spage	248
dc.contributor.affiliation	Національний університет “Львівська політехніка”
dc.contributor.affiliation	Lviv Polytechnic National University
dc.contributor.author	Цмоць, Іван
dc.contributor.author	Опотяк, Юрій
dc.contributor.author	Штогрінець, Богдан
dc.contributor.author	Tsmots, Ivan
dc.contributor.author	Opotyak, Yurii
dc.contributor.author	Shtohrinets, Bohdan
dc.coverage.placename	Львів
dc.coverage.placename	Lviv
dc.date.accessioned	2025-09-12T07:21:56Z
dc.date.created	2023-02-28
dc.date.issued	2023-02-28
dc.description.abstract	A graph scheme of a generalized algorithm for parallel-stream calculation of the scalar product has been developed. Its distinctive feature is the use of uniform operations: forming partial products starting from the least significant bits of the multipliers, calculating the macro-partial product, and adding it to the partial result shifted right by the number of bits used to form the partial products. It is proposed that FPGA structures of devices for parallel-stream calculation of the scalar product be developed according to the following principles: use of uniform pipeline stages; performing calculations using only addition, inversion, and shift operations; calculating the scalar product as a single operation; regularity and locality of connections between pipeline stages; matching the pipeline cycle duration to the data input time and the result output time; and space-time parallelization of the scalar product calculation. Four algorithms and corresponding device structures have been developed: (1) direct formation of partial products by analyzing one multiplier bit at a time, which gives the shortest pipeline cycle; (2) formation of partial products for the sum of two pairs of products, analyzing one multiplier bit at a time, which is advisable for a small number of operands; (3) formation of partial products by the modified Booth algorithm, which reduces hardware cost for operands of width n≥24 bits; and (4) formation of group partial products, which gives the lowest hardware cost for n=8 and N>8. A method for synthesizing FPGA devices for real-time parallel-stream calculation of the scalar product has been developed. The method achieves high hardware utilization efficiency by selecting the partial-product formation algorithm and a device structure from those developed, and by matching the pipeline cycle of the selected structure to the arrival time of the input data.
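The accumulation scheme named in the abstract (partial products formed from the least significant bits upward, summed into a macro-partial product, and added to a right-shifted partial result) can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the authors' device structure: the function name `scalar_product_lsb_first`, the parameters `n` (multiplier width in bits) and `k` (bits analyzed per step), and the restriction to unsigned operands are all hypothetical choices made for illustration. The record does not describe the handling of signed operands or the pipeline registers, so they are not modeled here.

```python
def scalar_product_lsb_first(xs, ys, n, k=1):
    """Software model of an LSB-first scalar product (unsigned operands only).

    xs : multiplicands (non-negative integers)
    ys : multipliers (non-negative integers that fit in n bits)
    n  : multiplier width in bits (assumed name, not from the source)
    k  : number of multiplier bits analyzed per step (assumed name)
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if k <= 0 or n % k != 0:
        raise ValueError("n must be a positive multiple of k")
    if any(x < 0 for x in xs) or any(y < 0 or y >= (1 << n) for y in ys):
        raise ValueError("operands must be unsigned and multipliers must fit in n bits")

    mask = (1 << k) - 1
    acc = 0   # partial result, already shifted right after each step
    low = 0   # result bits that have been shifted out on the right

    for step in range(n // k):
        # Partial products: each multiplicand times the current k-bit
        # group of its multiplier, starting from the least significant bits.
        # Their sum is the macro-partial product for this step.
        macro = sum(x * ((y >> (step * k)) & mask) for x, y in zip(xs, ys))

        # Add the macro-partial product to the shifted partial result.
        acc += macro

        # The k lowest bits are final result bits; keep them, then shift right.
        low |= (acc & mask) << (step * k)
        acc >>= k

    # Reassemble the full-width result from the high and low parts.
    return (acc << n) | low


if __name__ == "__main__":
    print(scalar_product_lsb_first([3, 5, 7], [2, 4, 6], n=4))
```

With `k=1` the model analyzes one multiplier bit per step; a larger `k` processes a group of bits per step and needs fewer steps. In hardware the multiplication by a k-bit group would itself be built from additions and shifts, which this model does not attempt to show.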
dc.format.extent	248-266
dc.format.pages	19
dc.identifier.citation	Цмоць І. Метод синтезу пристроїв паралельно-потокового обчислення скалярного добутку у реальному часі / Іван Цмоць, Юрій Опотяк, Богдан Штогрінець // Вісник Національного університету “Львівська політехніка”. Серія: Інформаційні системи та мережі. — Львів : Видавництво Львівської політехніки, 2023. — № 14. — С. 248–266.
dc.identifier.citationen	Tsmots I. Method of synthesis of devices for parallel stream calculation of scalar product in real time / Ivan Tsmots, Yurii Opotyak, Bohdan Shtohrinets // Information Systems and Networks. — Lviv : Lviv Polytechnic Publishing House, 2023. — No 14. — P. 248–266.
dc.identifier.doi	10.23939/sisn2023.14.248
dc.identifier.uri	https://ena.lpnu.ua/handle/ntb/111708
dc.language.iso	uk
dc.publisher	Видавництво Львівської політехніки
dc.publisher	Lviv Polytechnic Publishing House
dc.relation.ispartof	Вісник Національного університету “Львівська політехніка”. Серія: Інформаційні системи та мережі, 14, 2023
dc.relation.ispartof	Information Systems and Networks, 14, 2023
dc.relation.references	1. Sogi, N., Souza, L. S., Gatto, B. B., Fukui, K. (2020). Metric Learning with A-based Scalar Product for Image-set Recognition. In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA. DOI: 10.1109/CVPRW50498.2020.00433.
dc.relation.references	2. Ludeno, G. (2018). Normalized Scalar Product Approach for Nearshore Bathymetric Estimation From X-Band Radar Images: An Assessment Based on Simulated and Measured Data. IEEE Journal of Oceanic Engineering, Vol. 43, No. 1, 221–237. DOI: 10.1109/JOE.2017.2758118.
dc.relation.references	3. Hong, S., Lee, I., Park, Y. (2018). Optimizing a FPGA-based neural accelerator for small IoT devices. In 2018 International Conference on Electronics, Information, and Communication (ICEIC), Honolulu, HI, USA. DOI: 10.23919/ELINFOCOM.2018.8330546.
dc.relation.references	4. Tsmots, I., Rabyk, V., Teslyuk, V., Opotyak, Yu. (2023). Floating-Point Number Scalar Product Hardware Implementation for Embedded Systems. In 17th International Conference on the Experience of Designing and Application of CAD Systems (CADSM), Jaroslaw, Poland. DOI: 10.1109/CADSM58174.2023.10076502.
dc.relation.references	5. Drozd, J., Drozd, O., Nikul, V., Sulima, J. (2018). FPGA implementation of vertical addition with a bitwise pipeline of calculations. In 2018 IEEE 9th International Conference on Dependable Systems, Services and Technologies (DESSERT), Kyiv, Ukraine. DOI: 10.1109/DESSERT.2018.8409136.
dc.relation.references	6. Zhang, W., Zhang, C., Niu, L., Din, F. U., Farrukh, Jiang, H. (2022). An Efficient FPGA Design for Fixed point Exponential Calculation. In IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA), Xi'an, China. DOI: 10.1109/ICTA56932.2022.9963050.
dc.relation.references	7. Tsmots, I. (2005). Information technologies and specialized tools for processing signals and images in real time. Lviv: UAP.
dc.relation.references	8. Rashkevych, Yu. M., Tkachenko, R. O., Tsmots, I. H., Peleshko, D. D. (2014). Neuro-like methods, algorithms and structures of real-time signal and image processing. Lviv Polytechnic Publishing House.
dc.relation.references	9. Tsmots, I. H., Tkachenko, R. O., Teslyuk, V. M., Riznyk, O. Ya., Kazymira, I. Ya. (2022). Smart systems: technologies, architectures, data processing, protection and coding. Lviv: SPOLOM.
dc.relation.references	10. Zong, P., Wang, Y., Xie, F. (2018). Embedded Software Fault Prediction Based on Back Propagation Neural Network. In IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C), Lisbon, Portugal. DOI: 10.1109/QRS-C.2018.00098.
dc.relation.references	11. Kalichanin-Balich, I., Lopez-Martin, C. (2010). Applying a Feedforward Neural Network for Predicting Software Development Effort of Short-Scale Projects. In Eighth ACIS International Conference on Software Engineering Research, Management and Applications, Montreal, QC, Canada. DOI: 10.1109/SERA.2010.41.
dc.relation.references	12. Tsmots, I., Skorokhoda, O., Rabyk, V. (2018). Parallel algorithms and matrix structures for scalar product calculation. In 14th International Conference on Advanced Trends in Radioelectronics, Telecommunications and Computer Engineering (TCSET), Lviv-Slavske, Ukraine. DOI: 10.1109/TCSET.2018.8336347.
dc.relation.references	13. Nguyen, D. T., Nguyen, T. N., Kim, H., Lee, H.-J. (2019). A High-Throughput and Power-Efficient FPGA Implementation of YOLO CNN for Object Detection. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 27, No. 8, 1861–1873. DOI: 10.1109/TVLSI.2019.2905242.
dc.relation.references	14. Chan, D. (2023). The Next Frontier: From SoC to Heterogenous Integration of Chiplets. In International VLSI Symposium on Technology, Systems and Applications (VLSI-TSA/VLSI-DAT), Hsinchu, Taiwan. DOI: 10.1109/VLSI-TSA/VLSI-DAT57221.2023.10134113.
dc.relation.references	15. Lu, L., Liang, Y., Xiao, Q., Yan, S. (2017). Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs. In IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), Napa, CA, USA. DOI: 10.1109/FCCM.2017.64.
dc.relation.references	16. Rekha, R., Menon, K. P. (2018). FPGA implementation of exponential function using cordic IP core for extended input range. In 3rd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), Bangalore, India. DOI: 10.1109/RTEICT42901.2018.9012611.
dc.relation.references	17. Pandey, J. G., Gurawa, A., Nehra, H., Karmakar, A. (2016). An efficient VLSI architecture for data encryption standard and its FPGA implementation. In 2016 International Conference on VLSI Systems, Architectures, Technology and Applications (VLSI-SATA), Bengaluru, India. DOI: 10.1109/VLSI-SATA.2016.7593054.
dc.relation.references	18. Shrestha, R. (2017). High-speed and low-power VLSI-architecture for inexact speculative adder. In 2017 International Symposium on VLSI Design, Automation and Test (VLSI-DAT), Hsinchu, Taiwan. DOI: 10.1109/VLSI-DAT.2017.7939644.
dc.relation.references	19. Yu, Hao. (2017). Energy efficient VLSI circuits for machine learning on-chip. In International Symposium on VLSI Design, Automation and Test (VLSI-DAT), Hsinchu, Taiwan. DOI: 10.1109/VLSI-DAT.2017.7939671.
dc.relation.references	20. Nguyen, D. T., Kim, H., Lee, H.-J., Chang, I.-J. (2018). An Approximate Memory Architecture for a Reduction of Refresh Power Consumption in Deep Learning Applications. In IEEE International Symposium on Circuits and Systems (ISCAS), Florence, Italy. DOI: 10.1109/ISCAS.2018.8351021.
dc.relation.references	21. Tsmots, I. H., Skorokhoda, O. V. (2011). Device for calculating the scalar product. Ukrainian patent for a utility model, No. 66138, Bulletin 24.
dc.relation.references	22. Tsmots, I. H., Skorokhoda, O. V., Teslyuk, V. M. (2013). Device for calculating the scalar product. Patent of Ukraine for the invention, No. 101922, 13.05.2013, Bulletin No. 9.
dc.relation.references	23. Tsmots, I. H., Skorokhoda, O. V., Medykovskyi, M. O. (2019). Device for calculating the scalar product. Patent of Ukraine for the invention, No. 118596, 11.02.2019, Bulletin No. 3.
dc.relation.references	24. Tsmots, I., Rabyk, V., Kryvinska, N., Yatsymirskyy, M., Teslyuk, V. (2022). Design of the Processors for Fast Cosine and Sine Fourier Transforms. Circuits, Systems, and Signal Processing, 41(9), 4928–4951.
dc.rights.holder	© Lviv Polytechnic National University, 2023
dc.rights.holder	© Tsmots I. H., Opotyak Yu. V., Shtohrinets B. V., 2023
dc.subject	space-time parallelization
dc.subject	graph scheme of the generalized algorithm
dc.subject	equipment costs
dc.subject	coordination of the conveyor cycle
dc.subject.udc	004.94
dc.title	Метод синтезу пристроїв паралельно-потокового обчислення скалярного добутку у реальному часі
dc.title.alternative	Method of synthesis of devices for parallel stream calculation of scalar product in real time
dc.type	Article
