The inspiration for this project was my own laziness. Almost every developer has fantasized about speaking to a machine and having it type the code for them. The challenge to use Azure AI for this could not have been a better fit. Imagine riding in an Uber, writing your code by speaking to your phone, and by the time you reach your office the .py file has already been exported to your system.
What it does
Our service, Speech to Code, lets you write code by speaking to the machine. The demo video shows a simple website where you speak commands and the code is written automatically.
The system identifies the type of command given, then extracts important information such as keywords and entities to build the line of code. It currently supports Python syntax. The following commands are ready for you to test -
- Create an integer.
- Create a string.
- Create a list.
- Create a for loop.
- Print/Display any variable.
- Delete a line.
- Delete everything.
- Add indents (adds \t so you can write blocks inside for loops).
- Remove indents.
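As a rough illustration of how the commands above could map onto generated Python, here is a minimal sketch. It is not the project's actual code: the `CodeBuffer` class, the command labels, and the slot names are all hypothetical, and it assumes the command type and its values have already been recognised.

```python
class CodeBuffer:
    """Holds the generated Python lines and the current indent level."""

    def __init__(self):
        self.lines = []
        self.indent = 0  # number of leading tabs for new lines

    def _add(self, text):
        self.lines.append("\t" * self.indent + text)

    def apply(self, command, **slots):
        # Command labels and slot names are hypothetical, chosen for this sketch.
        if command == "create_integer":
            self._add(f"{slots['name']} = {int(slots['value'])}")
        elif command == "create_string":
            self._add(f"{slots['name']} = {slots['value']!r}")
        elif command == "create_list":
            self._add(f"{slots['name']} = {list(slots['items'])!r}")
        elif command == "for_loop":
            # The loop body is indented only after an explicit "add_indent".
            self._add(f"for {slots['var']} in range({int(slots['limit'])}):")
        elif command == "print":
            self._add(f"print({slots['name']})")
        elif command == "delete_line":
            if self.lines:
                self.lines.pop()
        elif command == "delete_everything":
            self.lines.clear()
            self.indent = 0
        elif command == "add_indent":
            self.indent += 1
        elif command == "remove_indent":
            self.indent = max(0, self.indent - 1)
        else:
            raise ValueError(f"unknown command: {command}")

    def source(self):
        return "\n".join(self.lines)


buf = CodeBuffer()
buf.apply("create_integer", name="count", value="5")
buf.apply("for_loop", var="i", limit="3")
buf.apply("add_indent")
buf.apply("print", name="i")
buf.apply("remove_indent")
print(buf.source())
```

Keeping the indent level as explicit state mirrors the separate "add indents" and "remove indents" commands, so the speaker decides when a block starts and ends.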
There are two ways to test the service. Both need an Azure Speech to Text subscription key.
- Open this link. Wait for the SDK to load; the warning will disappear on its own. Enter your Azure Speech to Text subscription key and service region, then start giving commands.
- Manual testing - Clone the repository mentioned, create a .env file using the sample contents given in test.env, create a Python 3 virtual environment, and pip install the requirements file. Then run main.py and open index.html in your browser.
How I built it
As I found out about this challenge late, I had only a day or two to build an MVP. The tech stack includes -
- Basic testing UI to send voice commands to. This could later be integrated into a voice-controlled device such as an Echo or Google Home. The UI takes the speech and converts it to text using the Azure Speech to Text API.
- Backend - A simple Flask API that internally calls two services: a) LUIS (Language Understanding by Azure), which tells us which type of command was given to the service; b) the Azure Text Analytics API, which extracts important information such as variable names, integer or string values, list definitions, loop limits, etc.
- Azure Language Understanding - LUIS.
- Azure Text Analytics - Language Cognitive Services.
- Azure Speech to Text Service.
- Azure App Service (to deploy the code.)
- Azure Storage (for static website hosting).
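The backend flow described above (classify the command, extract the values, build the line) can be sketched roughly as follows. This is only an illustration under stated assumptions: the two rule-based functions are local stand-ins for the LUIS and Text Analytics calls and do not use any Azure SDK, and every function name, intent label, and spoken phrasing here is hypothetical.

```python
import re


def classify_intent(text):
    """Stand-in for the LUIS call: map an utterance to a command type."""
    lowered = text.lower()
    if lowered.startswith(("print", "display")):
        return "print"
    if "integer" in lowered:
        return "create_integer"
    if "string" in lowered:
        return "create_string"
    return "unknown"


def extract_entities(text, intent):
    """Stand-in for the Text Analytics call: pull out names and values."""
    if intent == "create_integer":
        m = re.search(r"(?:named|called) (\w+) (?:with value|equal to) (-?\d+)", text, re.I)
        return {"name": m.group(1), "value": int(m.group(2))} if m else {}
    if intent == "create_string":
        m = re.search(r"(?:named|called) (\w+) (?:with value|equal to) (.+)", text, re.I)
        return {"name": m.group(1), "value": m.group(2).strip()} if m else {}
    if intent == "print":
        m = re.search(r"(?:print|display) (\w+)", text, re.I)
        return {"name": m.group(1)} if m else {}
    return {}


def build_line(intent, entities):
    """Turn the recognised intent and entities into one line of Python."""
    if intent in ("create_integer", "create_string") and {"name", "value"} <= entities.keys():
        return f"{entities['name']} = {entities['value']!r}"
    if intent == "print" and "name" in entities:
        return f"print({entities['name']})"
    return None  # nothing could be built from this utterance


def handle_utterance(text, classify=classify_intent, extract=extract_entities):
    # The recognisers are parameters so real service clients could be swapped in.
    intent = classify(text)
    return build_line(intent, extract(text, intent))


print(handle_utterance("Create an integer called count with value 10"))
print(handle_utterance("Print count"))
```

Passing the recognisers in as arguments keeps the line-building logic testable without network access or a subscription key.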
Challenges I ran into
- The biggest challenge was the time frame: I found out about this just 2 days before the deadline, and with busy office hours it was difficult to add complex commands. Now that the project has started, more commands will be added and published continuously.
- Deploying the backend APIs on Azure App Service, as I had little experience doing so.
- Structuring the spoken coding language so that the system understands natural language and captures variable names and values accurately.
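One small part of the last challenge, turning a spoken variable name into something Python accepts, might look like the sketch below. It is an assumption about how this could be handled, not a description of the project's code, and the helper name `to_identifier` is hypothetical.

```python
import keyword
import re


def to_identifier(spoken):
    """Convert a spoken variable name into a valid Python identifier."""
    # Keep only letters and digits, joined in snake_case.
    words = re.findall(r"[a-z0-9]+", spoken.lower())
    name = "_".join(words)
    if not name:
        raise ValueError("no usable characters in variable name")
    if name[0].isdigit():
        name = "_" + name  # identifiers cannot start with a digit
    if keyword.iskeyword(name):
        name += "_"  # avoid clashing with reserved words such as "class"
    return name


print(to_identifier("Total Count"))  # total_count
```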
Accomplishments that I'm proud of
- I am proud to have created a sample app that actually demonstrates how speech can be used to write code, making a developer's life a little easier.
- I am also proud that I completed this sample in the time I had and deployed the whole system to Azure App Service, so there is a working tester for the world to try and give me feedback on.
What I learned
- I learned to use Azure's highly accurate AI APIs, mostly speech and language, which make it easy to use NLP in our system.
- I learned how to deploy an application to the Azure cloud using App Service and App Service plans.
- I even learned a bit of CSS, as I am not a big fan of UI development.
What's next for Speech To Code
- The next step, for sure, is to expand the set of commands the service supports, such as complex mathematical expressions, while loops, and condition checking (both variable to variable and across different data types).
- After that, the goal is to incorporate this service into a proper voice assistant tool that developers can use in their own simple ways, such as through their mobile phone or earphones.