Inspiration
We were inspired by the Beasely Neighborhood Association's Best Community Building Hack challenge, which asked us to build a tool that helps community service providers, outreach workers, and volunteers coordinate responses to community safety incidents while avoiding escalation to emergency services. Our group set out to build a community safety system that prioritizes accessibility, safety, and health for members of the community in distress, while balancing privacy, consent, and ethical and legal responsibilities. Fundamentally, this project is about making sure that people in distress are recognized and supported by their own community.
What it does
The Community Assistive Monitor Model (CAMM) is a community assistant and safety alert system designed to detect members of the community in distress and alert community responders through our App interface with the location of the person in distress and a set of instructions. These instructions are generated through a retrieval-augmented generation (RAG) pipeline grounded in recognized safety and health guidelines from public health authorities. Additionally, CAMM's facial recognition capability recognizes users who have previously registered and consented to share their medical information. A recognized user is tied to an anonymized user ID, which allows the system to include relevant, consented medical context in the alert. A responder might learn, for example, that the person has dementia, a mobility limitation, or a medical condition that affects how they should be approached, giving community responders the context they need to assist members of the community with specialized needs.
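A rough sketch of the alert described above might look like the following. The `AlertPayload` structure and field names are our illustration, not the exact production schema; the key idea is that medical context is only ever attached when a recognized, anonymized ID confirms consent:

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical sketch of the alert sent to community responders.
# Field names are illustrative; the real schema may differ.
@dataclass
class AlertPayload:
    location: str                       # where the person in distress was detected
    instructions: list                  # RAG-generated guidance for responders
    anon_user_id: Optional[str] = None  # set only if facial recognition matched
    medical_context: list = field(default_factory=list)  # consented info only

def build_alert(location, instructions, anon_user_id=None, medical_context=None):
    """Attach medical context only when the person was recognized (and thus consented)."""
    if anon_user_id is None:
        medical_context = []  # no recognized ID -> never attach medical details
    return AlertPayload(location, instructions, anon_user_id, medical_context or [])
```

The consent gate lives in the payload builder itself, so an unrecognized person can never have medical details attached by accident.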
How we built it
Our RAG pipeline was built using Moorcheh AI's software development kit. The model is fed relevant, LLM-summarized documents containing guidelines for specific situations (mental health, disability, dementia, etc.). If our facial recognition system, trained using OpenCV, recognizes the user, it obtains their unique user ID and sends it to the model, which retrieves the user's medical information as additional context when generating its response.

The App interface, built with Node.js and React.js and deployed with Vercel and Railway, has a registration page where users can give informed consent to enter any medical information they believe might be relevant for community responders and register their face with the facial recognition model. A unique identifier is then generated and stored in our database, tied to that user's medical information. The medical information and user ID are ingested into a second "medical records" namespace for future retrieval by Moorcheh.

In addition to OpenCV, we used dlib and MediaPipe to build our facial recognition, body recognition, and motion detection systems. A rules-based system determines whether a person has collapsed; upon detecting a collapse, ElevenLabs voice generation prompts the person to move their hand if they are okay, identifying itself as CAMM in order to reassure the distressed individual. If no motion is detected, the individual is flagged as unresponsive, and their location and, if possible, their unique user ID are sent to Moorcheh to generate the response delivered through the App interface.
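A minimal version of the rules-based collapse check could compare the aspect ratio of the detected person's bounding box over a window of recent frames (the thresholds and function name here are illustrative, not our exact tuning):

```python
def is_collapsed(boxes, ratio_threshold=0.8, frames_required=15):
    """Rules-based collapse check over a sliding window of bounding boxes.

    boxes: list of (width, height) tuples for the detected person in recent frames.
    A standing person's box is taller than wide; a collapsed person's box is
    wider than tall. Requiring the low aspect ratio to persist for
    `frames_required` consecutive frames helps filter out jitter-induced
    false alarms from the pose model.
    """
    if len(boxes) < frames_required:
        return False
    recent = boxes[-frames_required:]
    return all(h / w < ratio_threshold for w, h in recent)

standing = [(50, 150)] * 15  # height/width = 3.0 -> upright
fallen = [(150, 50)] * 15    # height/width ~ 0.33 -> lying down
```

In practice this check would feed the voice-prompt step: only after the collapse flag persists does CAMM ask the person to move their hand.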
Challenges we ran into
Communication: Communication between our main Python script and the deployed App interface across devices was a challenge. Our project was somewhat ambitious given the time allotted, and there were things we could have simplified in the MVP to ensure we completed everything in time.
Camera: Working with a camera was, at times, incredibly difficult. The body and hand models would jitter, making readings inaccurate, and tuning was one of the biggest hassles: finding the balance between an over-sensitive and an under-sensitive model was a major challenge, especially under the time constraints of the hackathon.
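One common way to damp this kind of jitter, which we sketch here as an illustration rather than our exact tuning, is an exponential moving average over the model's keypoints:

```python
def smooth_landmarks(prev, current, alpha=0.3):
    """Exponential moving average to damp frame-to-frame landmark jitter.

    prev, current: lists of (x, y) keypoints from the pose/hand model.
    Lower alpha trusts history more (smoother, but laggier); higher alpha
    tracks fast motion but lets jitter through -- exactly the
    over/under-sensitivity trade-off described above.
    """
    if prev is None:
        return current  # first frame: nothing to smooth against
    return [
        (alpha * cx + (1 - alpha) * px, alpha * cy + (1 - alpha) * py)
        for (px, py), (cx, cy) in zip(prev, current)
    ]
```

The single `alpha` parameter makes the sensitivity trade-off explicit and easy to tune under time pressure.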
Microphone: Because CAMM is an interactive model, we needed a way for it to interact with users. Like other home devices, which have seen great success as accessibility tools, we opted for voice recognition. However, even some of the best available models struggle to interpret short phrases: they tend to cut off the beginning or end of the dialogue, which makes it incredibly difficult to get a clean wake word, especially since the transcription itself was often inaccurate. To handle this, we supplied numerous alternative interpretations of the wake word, while longer dialogue was left to Moorcheh's interpretation.
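The variant-matching approach can be sketched roughly as follows; the variant list and threshold are illustrative, and we use the standard library's `difflib` as a stand-in for whatever fuzzy matcher an implementation chooses:

```python
import difflib

# Alternative transcriptions the speech model tends to produce for the
# wake word (list is illustrative, not our exact set).
WAKE_VARIANTS = ["hey camm", "hey cam", "hey cham", "a cam", "hey come"]

def heard_wake_word(transcript, threshold=0.8):
    """Return True if any short window of the transcript fuzzily matches a variant.

    Short utterances are often clipped or mis-transcribed, so we accept close
    matches (similarity ratio >= threshold) rather than exact ones, and we
    slide a 1-to-3 word window across the transcript.
    """
    words = transcript.lower().split()
    for i in range(len(words)):
        for j in range(i + 1, min(i + 3, len(words)) + 1):
            window = " ".join(words[i:j])
            for variant in WAKE_VARIANTS:
                if difflib.SequenceMatcher(None, window, variant).ratio() >= threshold:
                    return True
    return False
```

Listing variants up front trades a little precision for much better recall on clipped audio, which matters more for a safety system.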
Accomplishments that we're proud of
Despite some bugs, we are proud of the MVP we achieved: we found it an excellent proof of concept for implementing this system at a larger scale. We truly believe that CAMM would be an excellent tool for improving community health and safety. We are also proud of implementing and integrating multiple AI-based systems, particularly integrating Moorcheh AI's RAG pipeline with the computer vision-based facial recognition and distress-detection systems.
What we learned
We learned quite a bit about implementing RAG, inter-device communication, and computer vision. This was an all-hands-on-deck project: everyone had something new to work on and learn. Whether it was setting up the SDK and the prompts, detecting falls, developing an interactive web app, or 3D modelling, our team did it all. We could not be happier with the outcome and what we learned together along the way.
What's next for CAMM?
One thing we hope to work on is improving accessibility. Currently, the system is not entirely friendly to hearing-impaired individuals. Working with accessibility experts to ensure the system takes into account differently abled individuals would be important for a full implementation of the system.
Additionally, current rules-based detection is limited to identifying the collapse of an individual. Due to limitations in computer vision and ethical obligations to avoid diagnosis with AI, we could not implement a system for identifying mental health crises, confused elderly individuals, or overdoses. We would be interested in implementing a more complex emotion-detection system for identifying extreme distress; this would be a significant challenge, but one we're interested in taking on.
For the MVP, our App was deployed as a web application; due to the time limitations of the hackathon, we did not have sufficient time to develop a proper mobile application. We would ideally like to do so, ensuring encryption and strong inter-device communication capabilities.
Finally, the next major step for CAMM is an upgrade in hardware. Currently, for our MVP, we are using a laptop webcam as our makeshift CAMM – for a full implementation, we would want to deploy a camera with automatic swivel, high fidelity, and internet connection; as well as a microphone and speaker for audio detection and voice-generation.