Embedding speech recognition tools for custom software: Engines Overview

Dovbysh, Arthur; Alieksieiev, Vladyslav

dc.citation.epage	121
dc.citation.spage	114
dc.contributor.affiliation	Department of Applied Mathematics, Lviv Polytechnic National University
dc.contributor.author	Dovbysh, Arthur
dc.contributor.author	Alieksieiev, Vladyslav
dc.coverage.placename	Lviv
dc.coverage.temporal	25-27 June 2018
dc.date.accessioned	2018-09-03T11:41:08Z
dc.date.available	2018-09-03T11:41:08Z
dc.date.created	2018-06-25
dc.date.issued	2018-06-25
dc.description.abstract	Many solutions and tools for speech recognition are now available. Nevertheless, implementing natural language processing remains an open problem. Developing custom software with good UI/UX requires integrating speech recognition. The most common solution is to embed an existing engine as a standard component. In this paper we present an overview and analysis of several popular speech recognition engines: Google Speech Recognition API, Microsoft Speech API, Yandex Speech Kit and Julius. These tools are ready to use and suitable for supplementing your own software with reliable voice command detection or voice control. The results of our analysis come from an experiment in voice recognition using these tools as embedded components in custom software.
dc.format.extent	114-121
dc.format.pages	8
dc.identifier.citation	Dovbysh A. Embedding speech recognition tools for custom software: Engines Overview / Arthur Dovbysh, Vladyslav Alieksieiev // Computational linguistics and intelligent systems, 25-27 June 2018. — Lviv : Lviv Polytechnic National University, 2018. — Vol 2 : Workshop. — P. 114–121. — (Part 2. Workshop conference tracks. Section I. Computational Linguistics).
dc.identifier.issn	2523-4013
dc.identifier.uri	https://ena.lpnu.ua/handle/ntb/42557
dc.language.iso	en
dc.publisher	Lviv Polytechnic National University
dc.relation.ispartof	Computational linguistics and intelligent systems (2), 2018
dc.relation.references	1. Lawrence R. Rabiner and Bernard Gold. Theory and Application of Digital Signal Processing — Prentice Hall, 1975 – 762 p.
dc.relation.references	2. Plotnikov V., Sukhanov V., Zhigulevtsev Yu. Rechevoy dialog v sistemakh upravleniya [Speech dialogue in control systems]. – Moscow: Mashinostroenie, 1988. – 223 p. – In Russian [Речевой диалог в системах управления / В.Н.Плотников, В.А.Суханов, Ю.Н.Жигулевцев. – М.: Машиностроение, 1988. – 223 с. – ISBN 5-217-00148-8]
dc.relation.references	3. Yandex SpeechKit // Yandex – https://tech.yandex.ru/speechkit/ (Retrieved on May 2018)
dc.relation.references	4. Yandex.SpeechKit // Wikipedia.org, 18.05.2018 – https://ru.wikipedia.org/wiki/Yandex.SpeechKit (Retrieved on May 2018)
dc.relation.references	5. Minimum Prediction Residual Principle Applied to Speech Recognition / Itakura F. // IEEE Transactions on Acoustics, Speech, and Signal processing. – February 1975. – Vol. 23, No. 1. – P.67–72.
dc.relation.references	6. Cloud Speech-to-Text – Speech Recognition // Google Cloud – https://cloud.google.com/speech-to-text/ (Retrieved in May 2018)
dc.relation.references	7. Google launches an improved speech-to-text service for developers / F. Lardinois // Techcrunch.com – April 9, 2018. – https://techcrunch.com/2018/04/09/google-launches-an-improved-speech-to-text-service-for-developers/
dc.relation.references	8. Microsoft Speech API // Wikipedia.org, 08.11.2017 – https://en.wikipedia.org/wiki/Microsoft_Speech_API (Retrieved on May 2018)
dc.relation.references	9. Microsoft Speech Platform SDK 11 Requirements and Installation // Microsoft Developer Network – https://msdn.microsoft.com/ru-ru/library/hh362873(v=office.14).aspx (Retrieved on May, 2018)
dc.relation.references	10. Julius: Open-Source Large Vocabulary Continuous Speech Recognition Engine / Akinobu Lee // GitHub, April 2018 — https://github.com/julius-speech/julius (Retrieved on May 2018)
dc.relation.references	11. T. Kawahara, A. Lee, T. Kobayashi, K. Takeda, N. Minematsu, S. Sagayama, K. Itou, A. Ito, M. Yamamoto, A. Yamada, T. Utsuro and K. Shikano. "Free software toolkit for Japanese large vocabulary continuous speech recognition." In Proc. Int'l Conf. on Spoken Language Processing (ICSLP), Vol. 4, pp. 476-479, 2000.
dc.relation.references	12. A. Lee, T. Kawahara and K. Shikano. "Julius – an open source real-time large vocabulary recognition engine." In Proc. European Conference on Speech Communication and Technology (EUROSPEECH), pp. 1691-1694, 2001.
dc.relation.references	13. A. Lee and T. Kawahara. "Recent Development of Open-Source Speech Recognition Engine Julius" Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2009.
dc.relation.uri	https://tech.yandex.ru/speechkit/
dc.relation.uri	https://ru.wikipedia.org/wiki/Yandex.SpeechKit
dc.relation.uri	https://cloud.google.com/speech-to-text/
dc.relation.uri	https://techcrunch.com/2018/04/09/google-launches-an-improved-speech-to-text-service-for-developers/
dc.relation.uri	https://en.wikipedia.org/wiki/Microsoft_Speech_API
dc.relation.uri	https://msdn.microsoft.com/ru-ru/library/hh362873(v=office.14).aspx
dc.relation.uri	https://github.com/julius-speech/julius
dc.rights.holder	© 2018 for the individual papers by the papers’ authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by its editors.
dc.subject	speech recognition
dc.subject	speech engine
dc.subject	API
dc.subject	voice command detection
dc.subject	voice control
dc.subject	Google
dc.subject	Microsoft
dc.subject	Yandex
dc.subject	Julius
dc.subject	overview and analysis
dc.title	Embedding speech recognition tools for custom software: Engines Overview
dc.type	Conference Abstract

Files

Original bundle

Name: COLINS_2018_2018v2_Dovbysh_A-Embedding_speech_recognition_114-121.pdf
Size: 2.24 MB
Format: Adobe Portable Document Format

Name: COLINS_2018_2018v2_Dovbysh_A-Embedding_speech_recognition_114-121__COVER.png
Size: 281.87 KB
Format: Portable Network Graphics

Collections

Computational linguistics and intelligent systems. – 2018