There are many issues that can afflict older family members, especially those who are geographically isolated. The elderly are frequent targets for scammers, and may also face loneliness and cognitive decline.
We wish to provide family members and caretakers with a toolkit to confront these issues. By analyzing phone conversations for both content and sentiment, we can identify potential scammers, flag possible changes in cognition, and alert family members to changes in emotional state. This allows for intervention to prevent family members from falling victim to fraud, and can also flag potential medical or social needs.
What it does
The goal of the project is real-time analysis of spoken conversation to detect signs of cognitive impairment. This demo runs analysis on a WAV file, converts the spoken words to text, and runs sentiment analysis on the text. This can be used to flag key words, such as "lonely" or "money," as well as to indicate overall emotional state.
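The keyword-flagging step can be sketched in a few lines of Python. The `KEYWORDS` watch list here is a hypothetical example, not the project's actual lexicon; a real deployment would use a curated list of scam- and wellbeing-related terms.

```python
# Minimal sketch of keyword flagging on a transcript.
# KEYWORDS is an illustrative watch list, not the project's real lexicon.
KEYWORDS = {"lonely", "money", "gift card", "wire transfer"}

def flag_keywords(transcript: str) -> list[str]:
    """Return the watch-list terms that appear in the transcript."""
    text = transcript.lower()
    return sorted(term for term in KEYWORDS if term in text)

print(flag_keywords("I feel lonely and he asked me to wire money"))
# -> ['lonely', 'money']
```

Simple substring matching like this is enough for a demo; a production version would want stemming or phrase matching to catch variants such as "moneys" or "wired some money".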
How we built it
1. Record audio of a phone conversation or in-person conversation; we used WAV files from the internet.
2. Convert the recorded speech to text; we used the Google Cloud Speech API.
3. Run sentiment analysis on the text; we used the Watson API.
4. Extract the sentiment results and present them to the user; this was done statically rather than dynamically, with the results placed in an HTML page and Google Slides.
Challenges we ran into
Initially, we attempted to use Docker to run our code on AWS, but ran into audio file issues with Docker. We then tried to use Sphinx for voice recognition, but hit prolonged configuration issues and switched to the Google Cloud Speech API. We used Watson for text analysis. Personnel availability was also an issue.
Accomplishments that we're proud of
We were able to implement multiple APIs to do analysis on spoken content.
What we learned
There are a number of open-source tools and APIs that let us make progress on this problem very quickly. Additionally, while there are academic studies pursuing machine learning to diagnose cognitive impairment via speech analysis, we are not aware of any commercial solutions. We are also not aware of any work applying speech analysis to n=1 cohorts, i.e. longitudinal studies of single individuals to detect changes in cognition over time.
What's next for FamProtect
We would like to further automate the workflow and move away from the shell script that queries the Watson API; it should be possible to write the whole flow in Python. We would also like to do real-time audio analysis via Watson or Google Cloud, but those are commercial resources that are not practical to use during a hackathon.
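Replacing the shell script with Python could look roughly like the sketch below, which calls Watson's `/v1/analyze` endpoint directly from the standard library. `SERVICE_URL` and `API_KEY` are placeholders for real service credentials, and the version date is an example value:

```python
# Hypothetical pure-Python replacement for the curl-based shell script:
# build and send a Watson Natural Language Understanding sentiment request.
import base64
import json
import urllib.request

SERVICE_URL = "https://example.watson.cloud.ibm.com"  # placeholder instance URL
API_KEY = "YOUR_API_KEY"                              # placeholder credential

def build_analyze_request(text: str) -> dict:
    """JSON body for /v1/analyze, requesting sentiment only."""
    return {"text": text, "features": {"sentiment": {}}}

def analyze(text: str) -> dict:
    """POST the transcript to Watson and return the parsed JSON response."""
    body = json.dumps(build_analyze_request(text)).encode()
    token = base64.b64encode(f"apikey:{API_KEY}".encode()).decode()
    req = urllib.request.Request(
        f"{SERVICE_URL}/v1/analyze?version=2021-08-01",
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Moving the request into Python like this would let the whole pipeline, from transcription to reporting, live in one codebase instead of being split across a shell script and static output files.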