Many large companies run call centers where customers phone in to place orders or file complaints. Operators are expensive, so companies try to cut these costs with chatbots, but some people would still rather call than chat with a virtual assistant. Speech recognition can also serve as a trigger for other processes.
What it does
This script recognizes speech from either a microphone or an audio file (AudioFile) and converts it to a string. Depending on your preferences, it can use several speech recognition engines, such as Google Cloud, Wit, or Bing. Thanks to these engines, it can recognize more than 100 languages.
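A minimal sketch of this flow, assuming the Python library is the SpeechRecognition package (imported as `speech_recognition`). The names `ENGINE_METHODS`, `resolve_engine`, and `recognize_speech` are hypothetical and used only for illustration. Which engine methods exist, and which options they accept, depends on the installed version of the package.

```python
# Assumed mapping from a friendly engine name to a Recognizer method name.
# Extend it only with methods your installed SpeechRecognition version provides.
ENGINE_METHODS = {
    "google": "recognize_google",
    "google_cloud": "recognize_google_cloud",
    "wit": "recognize_wit",
}


def resolve_engine(engine: str) -> str:
    """Return the Recognizer method name for an engine, case-insensitively."""
    key = engine.strip().lower()
    if key not in ENGINE_METHODS:
        raise ValueError(f"Unsupported engine: {engine!r}")
    return ENGINE_METHODS[key]


def recognize_speech(audio_path=None, engine="google", **engine_options):
    """Transcribe an audio file, or the default microphone if no path is given.

    engine_options are passed straight to the engine method, for example
    language="en-US" for Google or key="..." for Wit.
    """
    # Imported here so the helpers above work without the third-party package.
    import speech_recognition as sr  # pip install SpeechRecognition

    recognizer = sr.Recognizer()
    if audio_path is not None:
        with sr.AudioFile(audio_path) as source:
            audio = recognizer.record(source)  # read the whole file
    else:
        with sr.Microphone() as source:  # needs PyAudio installed
            audio = recognizer.listen(source)  # capture a single phrase

    recognize = getattr(recognizer, resolve_engine(engine))
    try:
        return recognize(audio, **engine_options)
    except sr.UnknownValueError:
        return ""  # the engine could not understand the audio
```

For example, `recognize_speech("call.wav", engine="google", language="en-US")` would transcribe a hypothetical file `call.wav`, while `recognize_speech()` would listen on the microphone. Network or API errors (`sr.RequestError`) are deliberately left to propagate to the caller.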
How I built it
I used a Python library for speech recognition, because UiPath offers a pretty good integration with Python. A single activity sets up the parameters for speech recognition. This activity was programmed in C#, packaged as a NuGet package, and implemented in UiPath.
Challenges I ran into
Because voice / speech is really a type of unstructured data, NLP algorithms are necessary to understand the recognized string. This can be the next step, since with Python we are practically unlimited :).
Accomplishments that I'm proud of
Speech recognition works perfectly for me as long as I pronounce the words correctly. I was also able to fully understand how the Python activity package works in UiPath, so as a Python developer I'm almost unlimited in what I can program in UiPath.
What I learned
How to use the Python activity package and how easy it is to use Python libraries. Also, as I have some C# experience, I can prepare plenty of custom activities inside our company that will meet our requirements perfectly.
What's next for Speech Recognition
From my perspective, the next step is to implement NLP algorithms for the recognized text. This can't be something generic, because it really depends on the use case. My own next step will be to implement an NLP algorithm for a sorting and translation activity.
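To give an idea of what the sorting step could look like, here is a rough sketch that routes a recognized string to a category by counting keywords. It is a simple stand-in rather than a real NLP algorithm, and the categories, the keyword lists, and the function name `sort_transcript` are all hypothetical.

```python
import re

# Hypothetical keyword lists; a real use case would tune or replace them.
CATEGORY_KEYWORDS = {
    "order": {"order", "buy", "purchase", "delivery"},
    "complaint": {"complaint", "complain", "broken", "refund", "problem"},
}


def sort_transcript(text: str) -> str:
    """Return the category whose keywords occur most often, else 'other'."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    # Count keyword hits per category.
    scores = {
        category: sum(word in keywords for word in words)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    # On a tie, the category listed first in CATEGORY_KEYWORDS wins.
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"
```

The returned category could then decide which process is triggered next. A trained classifier would replace the keyword counting once the use case is known.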