We were amused by the communication features and cognitive services provided by Twilio and Microsoft Azure, so we shoved them all together in an irreverent hack.
What it does
- SMS: The user sends a regular SMS to the app; the message is automatically analyzed, and the app replies according to how positive the text is.
- Call: The user calls the app and leaves a message; the speech is analyzed, and the app plays the same message back with a soundtrack in the background, chosen according to the content of the message.
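As a rough sketch of the SMS reply logic, assume the sentiment service returns a score between 0 (negative) and 1 (positive). The function name `pick_reply`, the thresholds, and the canned replies are all illustrative assumptions, not the values the hack actually used.

```python
def pick_reply(score: float) -> str:
    """Map a sentiment score in [0, 1] to a canned reply.

    The thresholds and reply texts are hypothetical placeholders.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= 0.66:
        return "Glad to hear it!"      # clearly positive message
    if score >= 0.33:
        return "Noted."                # roughly neutral message
    return "Sorry to hear that."       # clearly negative message
```

The same score could just as well drive the choice of soundtrack for calls; only the lookup table would change.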
How we built it
Twilio serves as the front-end interface to users, capturing both SMS and voice communication, which it passes to a Flask backend running on Amazon EC2. The backend server queries Microsoft Azure for speech-to-text and sentiment analysis, then either selects the appropriate text response or mixes the appropriate music with the user's recorded voice, according to the emotional valence of the message. Twilio is used again to serve the text or audio back to the user.
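To make the last step concrete, here is a minimal sketch of the TwiML documents a backend could return to Twilio: a `<Message>` for an SMS reply and a `<Play>` for the mixed audio. It builds the XML by hand with the standard library to stay dependency-free; the actual backend may well have used Twilio's helper library instead, and the function names and example URL are assumptions.

```python
import xml.etree.ElementTree as ET


def sms_twiml(reply_text: str) -> str:
    """Build a TwiML document that replies to an incoming SMS."""
    response = ET.Element("Response")
    ET.SubElement(response, "Message").text = reply_text
    return ET.tostring(response, encoding="unicode")


def voice_twiml(audio_url: str) -> str:
    """Build a TwiML document that plays an audio file to the caller."""
    response = ET.Element("Response")
    ET.SubElement(response, "Play").text = audio_url
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(sms_twiml("Glad to hear it!"))
    # Hypothetical URL where the backend would host the mixed audio.
    print(voice_twiml("https://example.com/mixed.mp3"))
```

In a Flask app, a route handling Twilio's webhook would return one of these strings with an XML content type.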
Challenges we ran into
Handling audio was the most challenging part, as the Twilio audio platform is asynchronous. Additionally, converting between mp3 and wav formats required some hacking, as many open-source libraries do not handle the conversion and different platforms (Azure, Twilio, librosa) require different file types.
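One way to handle the format conversion is to shell out to `ffmpeg`, which infers the target format from the output file extension. This is a sketch under the assumption that `ffmpeg` is installed and on the PATH; it is not necessarily what the hack did, and the helper names are made up for illustration.

```python
import subprocess


def ffmpeg_convert_cmd(src: str, dst: str) -> list[str]:
    """Return the ffmpeg argument list that converts src to dst.

    -y overwrites dst if it already exists; -i names the input file.
    """
    return ["ffmpeg", "-y", "-i", src, dst]


def convert_audio(src: str, dst: str) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails.

    Assumes ffmpeg is installed and available on the PATH.
    """
    subprocess.run(ffmpeg_convert_cmd(src, dst), check=True)
```

For example, `convert_audio("recording.mp3", "recording.wav")` would produce a wav file for a library that cannot read mp3, and the reverse call would convert the mixed result back.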