Inspiration

My inspiration came from my furry companion, Amadeus Maximus, a 4-year-old Frenchie with a huge personality and separation anxiety. After talking to other pet owners, I realized this is a common problem, especially since the pandemic. Even owners whose pets don't have severe separation anxiety still worry, and none of us can watch a camera feed the whole time we're out. This led me to create ZenDoggo, a tool that analyzes sounds and helps pet owners understand their dog's behavior while they're away.

What it does

ZenDoggo is a prototype app that analyzes audio recordings to help pet owners understand their dog's behavior when home alone. Here's how it works:

  1. Users upload an audio file from their device.
  2. The app identifies distinct sound segments within the recording.
  3. ZenDoggo uses the YAMNet machine learning model to classify sounds (e.g., barking, whining, human speech).
  4. The categorized sound data is sent to the Gemini Generative AI API, which looks for patterns in it and provides tailored suggestions to address potential separation anxiety.
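
Step 2 can be illustrated with a small, dependency-free sketch. ZenDoggo itself uses librosa for this, so the function below is not the project's code: the name find_sound_segments, its parameters, and the default values are all assumptions chosen for illustration. It marks a frame as "sound" when its RMS energy reaches a threshold and bridges short quiet gaps so one bark is not split into several segments.

```python
import math


def find_sound_segments(samples, sample_rate, frame_ms=50,
                        threshold=0.05, max_gap_ms=200):
    """Return (start_sec, end_sec) spans where frame RMS >= threshold.

    samples: sequence of floats in [-1.0, 1.0] (mono audio).
    All default values are illustrative, not tuned for real recordings.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    max_gap_frames = max_gap_ms // frame_ms  # quiet frames we tolerate
    n_frames = math.ceil(len(samples) / frame_len)

    segments = []
    start = None     # index of the first loud frame of the open segment
    quiet_run = 0    # consecutive quiet frames seen inside that segment

    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms >= threshold:
            if start is None:
                start = i
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run > max_gap_frames:
                # Close the segment at the first frame of the quiet run.
                end = i - quiet_run + 1
                segments.append((start * frame_len / sample_rate,
                                 end * frame_len / sample_rate))
                start = None
                quiet_run = 0

    if start is not None:
        # Recording ended while a segment was still open.
        end_sample = min((n_frames - quiet_run) * frame_len, len(samples))
        segments.append((start * frame_len / sample_rate,
                         end_sample / sample_rate))
    return segments
```

Raising threshold drops more background noise but risks missing quiet whines; lowering it does the opposite.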

How I built it

  • Sound Analysis: Used librosa to extract sound segments from the audio file, with custom thresholds to isolate relevant sounds and filter out background noise.
  • ML-Powered Categorization: Integrated Google's YAMNet model to classify sound segments into specific categories (e.g., barking, whining, human speech).
  • Simplified Results: Created intuitive, higher-level categories to streamline YAMNet's detailed output, making results easier to understand.
  • Insight Generation: Leveraged the Gemini Generative AI API to analyze the categorized sound data, providing tailored insights and suggestions for potential separation anxiety.
  • User-Friendly Display: Developed a frontend interface for easy audio uploads and clear visualization of the analysis results.
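
To make the categorization and insight steps more concrete, here is a minimal sketch of how detailed labels might be grouped into higher-level categories and then summarized as text for a generative model. Everything in it is an assumption for illustration: the mapping, the category names, the function names, and the prompt wording are hypothetical, and the label strings are assumed to match YAMNet's class names. The actual Gemini API call is deliberately left out.

```python
from collections import defaultdict

# Hypothetical grouping of detailed labels into broader categories.
# The label strings are assumed to match YAMNet's class names.
HIGH_LEVEL_CATEGORIES = {
    "Bark": "barking",
    "Bow-wow": "barking",
    "Yip": "barking",
    "Howl": "howling",
    "Whimper (dog)": "whining",
    "Speech": "human speech",
    "Conversation": "human speech",
}


def simplify_label(label):
    """Map a detailed label to a broad category, defaulting to 'other'."""
    return HIGH_LEVEL_CATEGORIES.get(label, "other")


def summarize_segments(segments):
    """Count segments and total duration per high-level category.

    segments: iterable of (start_sec, end_sec, detailed_label) tuples.
    """
    summary = defaultdict(lambda: {"count": 0, "total_seconds": 0.0})
    for start, end, label in segments:
        entry = summary[simplify_label(label)]
        entry["count"] += 1
        entry["total_seconds"] += end - start
    return dict(summary)


def build_prompt(summary):
    """Turn the summary into plain text for a generative model."""
    lines = ["A dog was recorded while home alone. Detected sounds:"]
    for category in sorted(summary):
        stats = summary[category]
        lines.append(
            f"- {category}: {stats['count']} segment(s), "
            f"{stats['total_seconds']:.1f} s total"
        )
    lines.append(
        "Describe any patterns that may suggest separation anxiety "
        "and suggest next steps for the owner."
    )
    return "\n".join(lines)
```

Sending a compact summary like this, rather than every raw label, keeps the prompt short and makes the results easier to show to users.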

Challenges I ran into

General: As a beginner coder, I faced a steep learning curve in building a complex application. Thankfully, resources like Gemini provided valuable guidance and mentorship, which sped up my coding progress considerably.

App-Specific:

  • Sound Granularity: Isolating meaningful sounds within the audio required careful pre-processing and threshold adjustments to ensure accurate analysis.
  • Categorization: To improve user experience, I simplified YAMNet's detailed output by creating higher-level categories.
  • Formatting: Stripped markdown formatting from Gemini's API responses so they integrated cleanly with the app.
  • Threshold Optimization: Fine-tuning thresholds was crucial to balance sensitivity (detecting dog whines) with filtering out background noise.
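
As an illustration of the formatting challenge, the sketch below removes common markdown syntax from a text response using regular expressions. It is not ZenDoggo's actual implementation: strip_markdown is a hypothetical helper, and it only covers headings, bold, italics, inline code, code fences, and bullet markers.

```python
import re


def strip_markdown(text):
    """Remove common markdown syntax, keeping the readable text."""
    # Drop code fence lines (three backticks, optionally with a language).
    text = re.sub(r"^`{3}[^\n]*$", "", text, flags=re.MULTILINE)
    # Remove heading markers such as "## ".
    text = re.sub(r"^#{1,6}[ \t]+", "", text, flags=re.MULTILINE)
    # Unwrap bold (**text** or __text__).
    text = re.sub(r"(\*\*|__)(.+?)\1", r"\2", text)
    # Unwrap italics (*text* or _text_) without touching bullets
    # or underscores inside words.
    text = re.sub(r"(?<!\w)([*_])(?!\s)(.+?)(?<!\s)\1(?!\w)", r"\2", text)
    # Unwrap inline code.
    text = re.sub(r"`([^`]*)`", r"\1", text)
    # Normalize bullet markers (*, +, -) to a plain "- ".
    text = re.sub(r"^[ \t]*[*+-][ \t]+", "- ", text, flags=re.MULTILINE)
    return text.strip()
```

A dedicated markdown parser would be more robust for nested or unusual formatting; a few regular expressions are usually enough for short, predictable responses.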

Accomplishments that I'm proud of

  • I successfully built this app within a limited timeframe, exceeding my own expectations.
  • I confidently integrated machine learning models and the Gemini API into my project, despite being new to both. This showcases my adaptability and eagerness to explore cutting-edge technologies.
  • Collaboration with AI: I effectively leveraged Gemini and ChatGPT as both coding resources and brainstorming partners, pushing the boundaries of creative AI collaboration.

What I learned

I think a better question is what I did NOT learn haha.

This project transformed my understanding of several key areas. I significantly improved my Python skills and familiarized myself with ML models, at least sound analysis models. I also have a better understanding of the Gemini API and its capabilities. I can't wait to do some model training!

What's next for ZenDoggo

Preparing for Production:

  • Scalability: Implement audio streaming for efficient handling of longer recordings, ensuring the app's readiness for real-world usage.
  • Model Refinement: Train the Gemini model with separation anxiety-specific datasets to enhance the accuracy and relevance of its insights and suggestions.

Long-Term Vision: I'm committed to continuously improving ZenDoggo to better support dog owners. My ultimate goal is to integrate the app with Google Home for seamless, convenient use!!
