We were inspired by ELIZA, the 1960s program that simulated a psychotherapy conversation. We wanted to do something similar with modern computing technologies: improve healthcare and help those who are suffering from mental illness.
What it does
Impressed by the power and ease of use of Google Cloud Platform's ML APIs, we envisioned an analytics product that gives therapists data from interview sessions with patients. Using natural language processing, facial recognition, and speech intelligence, it provides metrics on speech sentiment, facial sentiment, and word frequency to aid the therapist's analysis of the patient.
How we built it
We built it on a Node.js backend with a React.js frontend. The Google Cloud Platform APIs we utilized are the Vision API for facial sentiment analysis, the Video Intelligence API for speech-to-text, and the Natural Language API for metrics on the patient's speech.
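One piece of this pipeline can be sketched without any API calls: choosing which moments of the video to snapshot for face analysis. The helper below is a minimal illustration (the function name and fixed-interval strategy are our own simplification, not the exact implementation):

```javascript
// Compute evenly spaced snapshot timestamps (in seconds) for a video.
// Each timestamp marks a frame we would extract and send to the Vision
// API for facial sentiment analysis.
function snapshotTimes(durationSec, intervalSec) {
  const times = [];
  for (let t = 0; t < durationSec; t += intervalSec) {
    times.push(t);
  }
  return times;
}
```

For a 10-second clip sampled every 3 seconds, this yields frames at 0, 3, 6, and 9 seconds; the interval trades Vision API cost against how finely mood changes are tracked.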
Challenges we ran into
We ran into many challenges in this project. Calling Google's APIs with our data meant coordinating several asynchronous components to minimize load time, and correctly parsing the inputs and outputs. On the input side, we had to sample snapshots of the video at intervals for facial recognition, and the APIs did not respond well to videos with poor visual or audio quality. On the output side, we had to arrange the data chronologically for our timeline tool: breaking transcriptions into individual sentences, fetching each sentence's sentiment value from the ML engine, and associating it with a specific timestamp to pass to the frontend.
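The timeline step above can be sketched in isolation. Assuming word-level timestamps like those returned by the Video Intelligence API's speech transcription (here simplified to `{ word, startSec }` objects), a minimal grouping pass might look like:

```javascript
// Group a stream of timestamped words into sentences, each tagged with
// the timestamp of its first word. The resulting array is what we would
// feed to sentiment analysis and then to the frontend timeline.
function toTimeline(words) {
  const sentences = [];
  let current = null;
  for (const { word, startSec } of words) {
    if (current === null) {
      current = { text: word, startSec };
    } else {
      current.text += " " + word;
    }
    // Treat terminal punctuation as a sentence boundary.
    if (/[.?!]$/.test(word)) {
      sentences.push(current);
      current = null;
    }
  }
  if (current !== null) sentences.push(current); // trailing fragment
  return sentences;
}
```

Each sentence's text can then be sent to the Natural Language API for a sentiment score, and the `startSec` field keeps the results in chronological order on the timeline.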
What we learned
What's next for TherapyTracker
In the future we could expand the tool by training on real, domain-specific data instead of relying on the pretrained cloud ML APIs, which we found less powerful. A specialized ML engine would let us extract metrics better suited to the tool's objective: more detailed sentiment analysis could capture finer shades of mood and expression, and entity detection on speech content could surface trends in what the patient talks about. We also mocked up features like patient management and session management to show long-term trends for multiple patients over multiple interview sessions.