NOTE: I have not used the Knurld API because of the issues I faced at each stage of the process (registration, enrollment, verification through mobile). The major issues are the word-based approach and the requirement that audio be supplied as a URL (hence Dropbox?) rather than as a multipart file. I have therefore submitted my project based on Microsoft Speaker Recognition. Microsoft's API is also in preview (beta) and has plenty of issues of its own. I am still working on integrating with Knurld as I post this submission. If I succeed in understanding your complex API, I hope to replace all the old references. Sorry to say this, but the Knurld team should work on creating readable documentation and samples without repetition across the references.
Inspiration
As part of my experiments with artificial intelligence and the evolving trend of mobile-based intelligent personal assistants, I came up with the idea of a personal assistant that helps us bank. It uses voice for everything it processes, including passwords.
What it does
A system that helps us bank smart, spend smart and invest smart. We named our system Barbara, after the French singer. The main goal of the app is to help bank customers manage their accounts as well as the parts of their everyday life tied to banking. The app is a speech-based personal assistant whose primary focus is banking. More details are in the documentation.
How I built it
The following technologies were used in building the prototype.
- Application server: Python 2.6 is used as the backend, along with the Flask library to build an MVC structure and the Jinja2 HTML templating system for the HTML views. It also exposes endpoints for the mobile application.
- Database: MySQL 5.5 is used to store the user data, route information and the travel history of the user (schema above).
- UI Framework: Bootstrap 3.2.1 CSS and JS for the HTML views.
- Cloud to deploy application: OpenShift cloud from Red Hat is used to host the application, and it is available with the load balancer. Public URL of the website: click here
- Android native: Android native with material design.
- Microsoft Cognitive solutions for Speaker verification: Microsoft's biometric experimentation for smarter ways of user interaction offers various features involving voice and speaker recognition.
Challenges I ran into
Voice is the major challenge that I faced. Accent, pronunciation, pauses and punctuation, Uff. Full of misinterpretations and error responses. However, I managed to work with the upcoming API provided by Microsoft. Android voice recording was another problem that hit me hard, because all these voice APIs need .wav files, while the native Android recorder only offers .aac files as a default format.
Accomplishments that I'm proud of
The chat view is the biggest accomplishment as of now. It came out pretty well with the voice response. The second toughest piece was the preferences screen, which uses simple wizard-style navigation.
What I learned
How complex a biometric system can be. Real-time inputs like voice and images are very hard to crack. Even though I have only used the API, I can see how hard the background process of authentication must be. The world is full of new and complex things; once you are open to seeing the opportunity and the learning in everything, the amount of knowledge you gain will be immense.
What's next for Barbara
You tell me. Technology always evolves. With the advances in technology we can always create miracles, old or new :) Love coding.
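A side note on the .wav requirement mentioned under the challenges: one possible workaround is to capture raw PCM samples and wrap them in a WAV container afterwards. The sketch below shows that wrapping step in Python with the standard library `wave` module. It is only an illustration, not the code used in Barbara: the function name `pcm_to_wav` is hypothetical, and the defaults (16 kHz, 16-bit, mono) are assumptions that should be checked against whatever the speaker recognition API actually expects.

```python
import io
import wave


def pcm_to_wav(pcm_bytes, sample_rate=16000, channels=1, sample_width=2):
    """Wrap raw PCM samples in a WAV (RIFF) container and return the bytes.

    The defaults are illustrative assumptions, not confirmed API requirements.
    """
    buffer = io.BytesIO()
    wav_file = wave.open(buffer, "wb")
    try:
        wav_file.setnchannels(channels)      # 1 = mono
        wav_file.setsampwidth(sample_width)  # bytes per sample, 2 = 16-bit
        wav_file.setframerate(sample_rate)   # samples per second
        wav_file.writeframes(pcm_bytes)      # header sizes are fixed up on close
    finally:
        wav_file.close()
    return buffer.getvalue()
```

For example, `pcm_to_wav(b"\x00\x00" * 16000)` produces one second of silence as WAV bytes, which could then be sent to the server as a multipart file upload.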
Built With
- android
- flask
- microsoft-oxford
- mysql
- python
- retrofit
- sqlalchemy