Models and Methods for Speech Separation in Digital Systems

Tsemko, Andrii; Karbovnyk, Ivan

doi:10.23939/acps2024.02.121

Files

Primary 2024v9n2_Tsemko_A-Models_and_Methods_for_Speech_121-127.pdf (1.07 MB)

2024v9n2_Tsemko_A-Models_and_Methods_for_Speech_121-127__COVER.png (543.03 KB)

Date

2024-02-27

Authors

Tsemko, Andrii

Karbovnyk, Ivan

Publisher

Видавництво Львівської політехніки
Lviv Politechnic Publishing House

Abstract

The main purpose of the article is to describe state-of-the-art approaches to speech separation and to demonstrate the structures and challenges of building and training such systems. Designing an efficient, optimized neural network model for speech recognition requires an encoder-decoder model structure with a mask estimation flow. The fully convolutional SuDoRM-Rf model demonstrates high efficiency with a relatively small number of parameters and can be boosted with accelerators that support convolutional operations. The highest separation performance has been shown by the SepTDA model, with 24 dB in SI-SNR and 21.2 million trainable parameters, while SuDoRM-Rf, with only 2.66 million, has demonstrated 12.02 dB. Other transformer-based neural network approaches have demonstrated almost the same performance as the SepTDA model but require more trainable parameters.
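
The abstract reports its results in SI-SNR (scale-invariant signal-to-noise ratio). As a rough illustration of how such a figure can be computed, here is a minimal sketch in plain Python. It assumes the commonly used zero-mean definition, in which the estimate is projected onto the reference signal and the residual is treated as noise. The function name `si_snr` and the `eps` stabilizer are illustrative choices, not taken from the article, and the article's exact evaluation protocol may differ.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences.

    Illustrative sketch only: `si_snr` and `eps` are assumed names,
    not part of the article or of any particular library.
    """
    if len(estimate) != len(target):
        raise ValueError("signals must have the same length")
    n = len(target)

    # Remove the mean so that a constant offset does not affect the score.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [x - est_mean for x in estimate]
    tgt = [x - tgt_mean for x in target]

    # Project the estimate onto the target to get the scaled target component.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    s_target = [scale * t for t in tgt]

    # Whatever the target does not explain is counted as noise.
    e_noise = [e - s for e, s in zip(est, s_target)]

    target_power = sum(s * s for s in s_target)
    noise_power = sum(x * x for x in e_noise)
    return 10.0 * math.log10((target_power + eps) / (noise_power + eps))


if __name__ == "__main__":
    # Synthetic signals, used only to exercise the function.
    reference = [math.sin(0.1 * i) for i in range(200)]
    degraded = [r + 0.1 * math.cos(1.7 * i) for i, r in enumerate(reference)]
    print(f"SI-SNR of degraded signal: {si_snr(degraded, reference):.2f} dB")
```

Because the estimate is rescaled by the projection before the ratio is taken, multiplying the estimate by a constant gain leaves the score essentially unchanged, which is the property that makes the metric convenient for comparing separation models whose outputs differ in level.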

Keywords

Speech Separation, Speech Enhancement, Audio Processing, Neural Networks

Citation

Tsemko A. Models and Methods for Speech Separation in Digital Systems / Andrii Tsemko, Ivan Karbovnyk // Advances in Cyber-Physical Systems. — Lviv : Lviv Politechnic Publishing House, 2024. — Vol 9. — No 2. — P. 121–127.

URI

https://ena.lpnu.ua/handle/ntb/117385

Collections

Advances In Cyber-Physical Systems. – 2024. – Vol. 9, No. 2
