Embedding speech recognition tools for custom software: Engines Overview
dc.citation.epage | 121 | |
dc.citation.spage | 114 | |
dc.contributor.affiliation | Department of Applied Mathematics, Lviv Polytechnic National University | |
dc.contributor.author | Dovbysh, Arthur | |
dc.contributor.author | Alieksieiev, Vladyslav | |
dc.coverage.placename | Lviv | |
dc.coverage.temporal | 25-27 June 2018 | |
dc.date.accessioned | 2018-09-03T11:41:08Z | |
dc.date.available | 2018-09-03T11:41:08Z | |
dc.date.created | 2018-06-25 | |
dc.date.issued | 2018-06-25 | |
dc.description.abstract | Many solutions and tools for speech recognition are now available. Nevertheless, implementing natural language processing remains an open problem. Developing custom software with good UI/UX requires integrating speech recognition, and the most common solution is to embed an existing engine as a standard component. In this paper we present an overview and analysis of several popular speech recognition engines: Google Speech Recognition API, Microsoft Speech API, Yandex Speech Kit and Julius. These tools are ready to use and suitable for supplementing your own software with reliable voice command detection or voice control. The results of our analysis come from an experiment in voice recognition using these tools as embedded components in custom software. | |
dc.format.extent | 114-121 | |
dc.format.pages | 8 | |
dc.identifier.citation | Dovbysh A. Embedding speech recognition tools for custom software: Engines Overview / Arthur Dovbysh, Vladyslav Alieksieiev // Computational linguistics and intelligent systems, 25-27 June 2018. — Lviv : Lviv Polytechnic National University, 2018. — Vol 2 : Workshop. — P. 114–121. — (Part 2. Workshop conference tracks. Section I. Computational Linguistics). | |
dc.identifier.citationen | Dovbysh A. Embedding speech recognition tools for custom software: Engines Overview / Arthur Dovbysh, Vladyslav Alieksieiev // Computational linguistics and intelligent systems, 25-27 June 2018. — Lviv : Lviv Polytechnic National University, 2018. — Vol 2 : Workshop. — P. 114–121. — (Part 2. Workshop conference tracks. Section I. Computational Linguistics). | |
dc.identifier.issn | 2523-4013 | |
dc.identifier.uri | https://ena.lpnu.ua/handle/ntb/42557 | |
dc.language.iso | en | |
dc.publisher | Lviv Polytechnic National University | |
dc.relation.ispartof | Computational linguistics and intelligent systems (2), 2018 | |
dc.relation.references | 1. Lawrence R. Rabiner and Bernard Gold. Theory and Application of Digital Signal Processing — Prentice Hall, 1975 – 762 p. | |
dc.relation.references | 2. Plotnikov V., Sukhanov V., Zhigulevtsev Yu. Rechevoy dialog v sistemah upravleniya. – Moscow: Mashinostroenie, 1988 – 223 p. – In Russian [Речевой диалог в системах управления / В.Н.Плотников, В.А.Суханов, Ю.Н.Жигулевцев. – М.: Машиностроение, 1988. – 223 с. – ISBN 5-217-00148-8] | |
dc.relation.references | 3. Yandex SpeechKit // Yandex – https://tech.yandex.ru/speechkit/ (Retrieved on May 2018) | |
dc.relation.references | 4. Yandex.SpeechKit // Wikipedia.org, 18.05.2018 – https://ru.wikipedia.org/wiki/Yandex.SpeechKit (Retrieved on May 2018) | |
dc.relation.references | 5. Minimum Prediction Residual Principle Applied to Speech Recognition / Itakura F. // IEEE Transactions on Acoustics, Speech, and Signal processing. – February 1975. – Vol. 23, No. 1. – P.67–72. | |
dc.relation.references | 6. Cloud Speech-to-Text – Speech Recognition // Google Cloud – https://cloud.google.com/speech-to-text/ (Retrieved in May 2018) | |
dc.relation.references | 7. Google launches an improved speech-to-text service for developers / F. Lardinois // Techcrunch.com – April 9, 2018. – https://techcrunch.com/2018/04/09/google-launches-an-improved-speech-to-text-service-for-developers/ | |
dc.relation.references | 8. Microsoft Speech API // Wikipedia.org, 08.11.2017 – https://en.wikipedia.org/wiki/Microsoft_Speech_API (Retrieved on May 2018) | |
dc.relation.references | 9. Microsoft Speech Platform SDK 11 Requirements and Installation // Microsoft Developer Network – https://msdn.microsoft.com/ru-ru/library/hh362873(v=office.14).aspx (Retrieved on May, 2018) | |
dc.relation.references | 10. Julius: Open-Source Large Vocabulary Continuous Speech Recognition Engine / Akinobu Lee // GitHub, April 2018 — https://github.com/julius-speech/julius (Retrieved on May 2018) | |
dc.relation.references | 11. T. Kawahara, A. Lee, T. Kobayashi, K. Takeda, N. Minematsu, S. Sagayama, K. Itou, A. Ito, M. Yamamoto, A. Yamada, T. Utsuro and K. Shikano. "Free software toolkit for Japanese large vocabulary continuous speech recognition." In Proc. Int'l Conf. on Spoken Language Processing (ICSLP), Vol. 4, pp. 476-479, 2000. | |
dc.relation.references | 12. A. Lee, T. Kawahara and K. Shikano. "Julius – an open source real-time large vocabulary recognition engine." In Proc. European Conference on Speech Communication and Technology (EUROSPEECH), pp. 1691-1694, 2001. | |
dc.relation.references | 13. A. Lee and T. Kawahara. "Recent Development of Open-Source Speech Recognition Engine Julius" Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2009. | |
dc.relation.referencesen | 1. Lawrence R. Rabiner and Bernard Gold. Theory and Application of Digital Signal Processing - Prentice Hall, 1975 – 762 p. | |
dc.relation.referencesen | 2. Plotnikov V., Sukhanov V., Zhigulevtsev Yu. Rechevoy dialog v sistemah upravleniya, Moscow: Mashinostroenie, 1988 – 223 p, In Russian [Rechevoi dialog v sistemakh upravleniia, V.N.Plotnikov, V.A.Sukhanov, Iu.N.Zhigulevtsev, M., Mashinostroenie, 1988, 223 p, ISBN 5-217-00148-8] | |
dc.relation.referencesen | 3. Yandex SpeechKit, Yandex – https://tech.yandex.ru/speechkit/ (Retrieved on May 2018) | |
dc.relation.referencesen | 4. Yandex.SpeechKit, Wikipedia.org, 18.05.2018 – https://ru.wikipedia.org/wiki/Yandex.SpeechKit (Retrieved on May 2018) | |
dc.relation.referencesen | 5. Minimum Prediction Residual Principle Applied to Speech Recognition, Itakura F., IEEE Transactions on Acoustics, Speech, and Signal processing, February 1975, Vol. 23, No. 1, P.67–72. | |
dc.relation.referencesen | 6. Cloud Speech-to-Text – Speech Recognition, Google Cloud – https://cloud.google.com/speech-to-text/ (Retrieved in May 2018) | |
dc.relation.referencesen | 7. Google launches an improved speech-to-text service for developers, F. Lardinois, Techcrunch.com – April 9, 2018, https://techcrunch.com/2018/04/09/google-launches-an-improved-speech-to-text-service-for-developers/ | |
dc.relation.referencesen | 8. Microsoft Speech API, Wikipedia.org, 08.11.2017 – https://en.wikipedia.org/wiki/Microsoft_Speech_API (Retrieved on May 2018) | |
dc.relation.referencesen | 9. Microsoft Speech Platform SDK 11 Requirements and Installation, Microsoft Developer Network – https://msdn.microsoft.com/ru-ru/library/hh362873(v=office.14).aspx (Retrieved on May, 2018) | |
dc.relation.referencesen | 10. Julius: Open-Source Large Vocabulary Continuous Speech Recognition Engine, Akinobu Lee, GitHub, April 2018 - https://github.com/julius-speech/julius (Retrieved on May 2018) | |
dc.relation.referencesen | 11. T. Kawahara, A. Lee, T. Kobayashi, K. Takeda, N. Minematsu, S. Sagayama, K. Itou, A. Ito, M. Yamamoto, A. Yamada, T. Utsuro and K. Shikano. "Free software toolkit for Japanese large vocabulary continuous speech recognition." In Proc. Int'l Conf. on Spoken Language Processing (ICSLP), Vol. 4, pp. 476-479, 2000. | |
dc.relation.referencesen | 12. A. Lee, T. Kawahara and K. Shikano. "Julius – an open source real-time large vocabulary recognition engine." In Proc. European Conference on Speech Communication and Technology (EUROSPEECH), pp. 1691-1694, 2001. | |
dc.relation.referencesen | 13. A. Lee and T. Kawahara. "Recent Development of Open-Source Speech Recognition Engine Julius" Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2009. | |
dc.relation.uri | https://tech.yandex.ru/speechkit/ | |
dc.relation.uri | https://ru.wikipedia.org/wiki/Yandex.SpeechKit | |
dc.relation.uri | https://cloud.google.com/speech-to-text/ | |
dc.relation.uri | https://techcrunch.com/2018/04/09/google-launchesan- | |
dc.relation.uri | https://en.wikipedia.org/wiki/Microsoft_Speech_API | |
dc.relation.uri | https://msdn.microsoft.com/ru-ru/library/hh362873(v=office.14).aspx | |
dc.relation.uri | https://github.com/julius-speech/julius | |
dc.rights.holder | © 2018 for the individual papers by the papers’ authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by its editors. | |
dc.subject | speech recognition | |
dc.subject | speech engine | |
dc.subject | API | |
dc.subject | voice command detection | |
dc.subject | voice control | |
dc.subject | ||
dc.subject | Microsoft | |
dc.subject | Yandex | |
dc.subject | Julius | |
dc.subject | overview and analysis | |
dc.title | Embedding speech recognition tools for custom software: Engines Overview | |
dc.type | Conference Abstract |