Method of synthesis of devices for parallel stream calculation of scalar product in real time

dc.citation.epage: 266
dc.citation.issue: 14
dc.citation.journalTitle: Вісник Національного університету “Львівська політехніка”. Серія: Інформаційні системи та мережі
dc.citation.spage: 248
dc.contributor.affiliation: Національний університет “Львівська політехніка”
dc.contributor.affiliation: Lviv Polytechnic National University
dc.contributor.author: Цмоць, Іван
dc.contributor.author: Опотяк, Юрій
dc.contributor.author: Штогрінець, Богдан
dc.contributor.author: Tsmots, Ivan
dc.contributor.author: Opotyak, Yurii
dc.contributor.author: Shtohrinets, Bohdan
dc.coverage.placename: Львів
dc.coverage.placename: Lviv
dc.date.accessioned: 2025-09-12T07:21:56Z
dc.date.created: 2023-02-28
dc.date.issued: 2023-02-28
dc.description.abstract: A graph scheme of a generalized algorithm for parallel-stream calculation of the scalar product has been developed. Its distinguishing feature is the use of uniform operations: forming partial products starting from the least significant bits of the multipliers, calculating the macro-partial product, and adding it to the partial result shifted right by the number of bits used in forming the partial products. It is proposed that FPGA structures of devices for parallel-stream calculation of the scalar product be developed according to the following principles: use of uniform pipeline stages; calculations based on addition, inversion, and shift operations; calculation of the scalar product as a single operation; regular and localized connections between pipeline stages; matching of the pipeline cycle duration to the data input time and the result output time; and space-time parallelization of the scalar product calculation. An algorithm and structure have been developed for a parallel-stream scalar product device with direct formation of partial products based on the analysis of one bit of the multipliers; this structure operates with the shortest pipeline cycle. An algorithm and structure have been developed for a parallel-stream scalar product device that forms partial products for the sum of two pairs of products while analyzing one bit of the multipliers; this structure is advisable for a small number of operands. An algorithm and structure have been developed for a parallel-stream scalar product device that forms partial products using the modified Booth algorithm, which reduces hardware costs when processing operands of n≥24 bits. An algorithm and structure have been developed for a scalar product device that forms group partial products, which gives the lowest hardware costs when n=8 and N>8. A method for synthesizing FPGA devices for parallel-stream calculation of the scalar product in real time has been developed. By selecting the partial-product formation algorithm and the device structure from those developed, and by matching the pipeline cycle of the selected structure to the arrival time of the input data, the method ensures highly efficient use of hardware.
dc.format.extent: 248-266
dc.format.pages: 19
dc.identifier.citation: Цмоць І. Метод синтезу пристроїв паралельно-потокового обчислення скалярного добутку у реальному часі / Іван Цмоць, Юрій Опотяк, Богдан Штогрінець // Вісник Національного університету “Львівська політехніка”. Серія: Інформаційні системи та мережі. Львів : Видавництво Львівської політехніки, 2023. № 14. С. 248–266.
dc.identifier.citationen: Tsmots I. Method of synthesis of devices for parallel stream calculation of scalar product in real time / Ivan Tsmots, Yurii Opotyak, Bohdan Shtohrinets // Information Systems and Networks. Lviv : Lviv Polytechnic Publishing House, 2023. No 14. P. 248–266.
dc.identifier.doi: doi.org/10.23939/sisn2023.14.248
dc.identifier.uri: https://ena.lpnu.ua/handle/ntb/111708
dc.language.iso: uk
dc.publisher: Видавництво Львівської політехніки
dc.publisher: Lviv Polytechnic Publishing House
dc.relation.ispartof: Вісник Національного університету “Львівська політехніка”. Серія: Інформаційні системи та мережі, 14, 2023
dc.relation.ispartof: Information Systems and Networks, 14, 2023
dc.relation.references: 1. Sogi, N., Souza, L. S., Gatto, B. B., Fukui, K. (2020). Metric Learning with A-based Scalar Product for Image-set Recognition. In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA. DOI: 10.1109/CVPRW50498.2020.00433.
dc.relation.references: 2. Ludeno, G. (2018). Normalized Scalar Product Approach for Nearshore Bathymetric Estimation From X-Band Radar Images: An Assessment Based on Simulated and Measured Data. IEEE Journal of Oceanic Engineering, Vol. 43, No. 1, 221–237. DOI: 10.1109/JOE.2017.2758118.
dc.relation.references: 3. Hong, S., Lee, I., Park, Y. (2018). Optimizing a FPGA-based neural accelerator for small IoT devices. In 2018 International Conference on Electronics, Information, and Communication (ICEIC), Honolulu, HI, USA. DOI: 10.23919/ELINFOCOM.2018.8330546.
dc.relation.references: 4. Tsmots, I., Rabyk, V., Teslyuk, V., Opotyak, Yu. (2023). Floating-Point Number Scalar Product Hardware Implementation for Embedded Systems. In 17th International Conference on the Experience of Designing and Application of CAD Systems (CADSM), Jaroslaw, Poland. DOI: 10.1109/CADSM58174.2023.10076502.
dc.relation.references: 5. Drozd, J., Drozd, O., Nikul, V., Sulima, J. (2018). FPGA implementation of vertical addition with a bitwise pipeline of calculations. In 2018 IEEE 9th International Conference on Dependable Systems, Services and Technologies (DESSERT), Kyiv, Ukraine. DOI: 10.1109/DESSERT.2018.8409136.
dc.relation.references: 6. Zhang, W., Zhang, C., Niu, L., Din, F. U., Farrukh, Jiang, H. (2022). An Efficient FPGA Design for Fixed point Exponential Calculation. In IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA), Xi'an, China. DOI: 10.1109/ICTA56932.2022.9963050.
dc.relation.references: 7. Tsmots, I. (2005). Information technologies and specialized tools for processing signals and images in real time. Lviv: UAP.
dc.relation.references: 8. Rashkevych, Yu. M., Tkachenko, R. O., Tsmots, I. H., Peleshko, D. D. (2014). Neuro-like methods, algorithms and structures of real-time signal and image processing. Lviv Polytechnic Publishing House.
dc.relation.references: 9. Tsmots, I. H., Tkachenko, R. O., Teslyuk, V. M., Riznyk, O. Ya., Kazymira, I. Ya. (2022). Smart systems: technologies, architectures, data processing, protection and coding. Lviv: SPOLOM.
dc.relation.references: 10. Zong, P., Wang, Y., Xie, F. (2018). Embedded Software Fault Prediction Based on Back Propagation Neural Network. In IEEE International Conference on Software Quality, Reliability and Security Companion (QRSC), Lisbon, Portugal. DOI: 10.1109/QRS-C.2018.00098.
dc.relation.references: 11. Kalichanin-Balich, I., Lopez-Martin, C. (2010). Applying a Feedforward Neural Network for Predicting Software Development Effort of Short-Scale Projects. In Eighth ACIS International Conference on Software Engineering Research, Management and Applications, Montreal, QC, Canada. DOI: 10.1109/SERA.2010.41.
dc.relation.references: 12. Tsmots, I., Skorokhoda, O., Rabyk, V. (2018). Parallel algorithms and matrix structures for scalar product calculation. In 14th International Conference on Advanced Trends in Radioelectronics, Telecommunications and Computer Engineering (TCSET), Lviv-Slavske, Ukraine. DOI: 10.1109/TCSET.2018.8336347.
dc.relation.references: 13. Nguyen, D. T., Nguyen, T. N., Kim, H., Lee, H.-J. (2019). A High-Throughput and Power-Efficient FPGA Implementation of YOLO CNN for Object Detection. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 27, No. 8, 1861–1873. DOI: 10.1109/TVLSI.2019.2905242.
dc.relation.references: 14. Chan, D. (2023). The Next Frontier: From SoC to Heterogenous Integration of Chiplets. In International VLSI Symposium on Technology, Systems and Applications (VLSI-TSA/VLSI-DAT), Hsinchu, Taiwan, 2023. DOI: 10.1109/VLSI-TSA/VLSI-DAT57221.2023.10134113.
dc.relation.references: 15. Lu, L., Liang, Y., Xiao, Q., Yan, S. (2017). Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs. In IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), Napa, CA, USA. DOI: 10.1109/FCCM.2017.64.
dc.relation.references: 16. Rekha, R., Menon, K. P. (2018). FPGA implementation of exponential function using cordic IP core for extended input range. In 3rd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), Bangalore, India. DOI: 10.1109/RTEICT42901.2018.9012611.
dc.relation.references: 17. Pandey, J. G., Gurawa, A., Nehra, H., Karmakar, A. (2016). An efficient VLSI architecture for data encryption standard and its FPGA implementation. In 2016 International Conference on VLSI Systems, Architectures, Technology and Applications (VLSI-SATA), Bengaluru, India. DOI: 10.1109/VLSI-SATA.2016.7593054.
dc.relation.references: 18. Shrestha, R. (2017). High-speed and low-power VLSI-architecture for inexact speculative adder. In 2017 International Symposium on VLSI Design, Automation and Test (VLSI-DAT), Hsinchu, Taiwan. DOI: 10.1109/VLSI-DAT.2017.7939644.
dc.relation.references: 19. Yu, H. (2017). Energy efficient VLSI circuits for machine learning on-chip. In International Symposium on VLSI Design, Automation and Test (VLSI-DAT), Hsinchu, Taiwan. DOI: 10.1109/VLSI-DAT.2017.7939671.
dc.relation.references: 20. Nguyen, D. T., Kim, H., Lee, H.-J., Chang, I.-J. (2018). An Approximate Memory Architecture for a Reduction of Refresh Power Consumption in Deep Learning Applications. In IEEE International Symposium on Circuits and Systems (ISCAS), Florence, Italy. DOI: 10.1109/ISCAS.2018.8351021.
dc.relation.references: 21. Tsmots, I. H., Skorokhoda, O. V. (2011). Device for calculating the scalar product. Ukrainian patent for a utility model, No. 66138, Bulletin 24.
dc.relation.references: 22. Tsmots, I. H., Skorokhoda, O. V., Teslyuk, V. M. Device for calculating the scalar product. Patent of Ukraine for the invention, No. 101922, 13.05.2013, Bulletin No. 9.
dc.relation.references: 23. Tsmots, I. H., Skorokhoda, O. V., Medykovskyi, M. O. Device for calculating the scalar product. Patent of Ukraine for the invention, No. 118596, 11.02.2019, Bulletin No. 3.
dc.relation.references: 24. Tsmots, I., Rabyk, V., Kryvinska, N., Yatsymirskyy, M., Teslyuk, V. (2022). Design of the Processors for Fast Cosine and Sine Fourier Transforms. Circuits, Systems, and Signal Processing, 41(9), 4928–4951.
dc.rights.holder: © Lviv Polytechnic National University, 2023
dc.rights.holder: © Tsmots I. H., Opotyak Yu. V., Shtohrinets B. V., 2023
dc.subject: space-time parallelization
dc.subject: graph scheme of the generalized algorithm
dc.subject: equipment costs
dc.subject: coordination of the conveyor cycle
dc.subject.udc: 004.94
dc.title: Метод синтезу пристроїв паралельно-потокового обчислення скалярного добутку у реальному часі
dc.title.alternative: Method of synthesis of devices for parallel stream calculation of scalar product in real time
dc.type: Article
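
The generalized algorithm summarized in the abstract (form partial products from the least significant multiplier bits, sum them into a macro-partial product, add that to the right-shifted partial result) can be illustrated with a small software model. The sketch below is not the authors' FPGA structure. The function name `dot_product_lsb_first`, the parameter `k` (bits analysed per step), and the restriction to unsigned n-bit multipliers are assumptions made for illustration only.

```python
from typing import Sequence


def dot_product_lsb_first(xs: Sequence[int], ys: Sequence[int], n: int, k: int = 1) -> int:
    """Compute sum(x * y), consuming k multiplier bits per step, LSB first.

    Assumptions (not taken from the source): ys are unsigned n-bit
    multipliers, xs are arbitrary Python ints, k >= 1 bits per step.
    """
    if n < 1 or k < 1:
        raise ValueError("n and k must be positive")
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if any(not 0 <= y < (1 << n) for y in ys):
        raise ValueError("each multiplier must fit in n unsigned bits")

    mask = (1 << k) - 1
    steps = -(-n // k)  # ceil(n / k) steps cover all n multiplier bits
    acc = 0             # partial result, already shifted right by k each step
    low = 0             # result bits shifted out of acc so far
    for i in range(steps):
        # Macro-partial product: sum over all pairs of x times the
        # current k-bit group of the corresponding multiplier.
        macro = sum(x * ((y >> (i * k)) & mask) for x, y in zip(xs, ys))
        acc += macro
        # The low k bits of acc are final; keep them, then shift right.
        low |= (acc & mask) << (i * k)
        acc >>= k
    return (acc << (steps * k)) + low


if __name__ == "__main__":
    xs, ys = [3, -5, 7], [10, 200, 33]
    reference = sum(x * y for x, y in zip(xs, ys))
    for k in (1, 2, 4):
        assert dot_product_lsb_first(xs, ys, n=8, k=k) == reference
    print(reference)  # -739
```

With `k=1` the model analyses one multiplier bit per step. Larger `k` takes fewer steps, but each step forms wider partial products, which is the kind of trade-off the synthesis method weighs when matching the pipeline cycle to the data arrival time.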

Files

Original bundle

Name: 2023n14_Tsmots_I-Method_of_synthesis_of_devices_248-266.pdf
Size: 12.32 MB
Format: Adobe Portable Document Format

Name: 2023n14_Tsmots_I-Method_of_synthesis_of_devices_248-266__COVER.png
Size: 428.95 KB
Format: Portable Network Graphics

License bundle

Name: license.txt
Size: 1.85 KB
Format: Plain Text