Inspiration
The spark for PawSay came from a simple moment with my son. He looked at our cat Rosco and asked, "What do you think they're actually trying to say?"
That question stuck with me. I wanted to build something that could answer it, not with random canned responses, but by actually analyzing Rosco's unique sounds and body language.
However, I faced a massive obstacle: I have ZERO coding experience.
I decided that technical knowledge shouldn't be a barrier to creation. I turned to Google Gemini and AI agents to act as my hands, while I provided the vision. This project isn't just an app; it's proof that with the right AI tools, anyone can bring an idea to life.
What it does
PawSay is an AI-powered pet translator and health assistant.
- Audio Translation: It records pet vocalizations and uses Gemini's multimodal capabilities to analyze pitch, tone, and duration to generate a "translated" message that reflects the pet's personality (e.g., a sassy cat or a dramatic dog).
- Visual Analysis: Users can take a photo of their pet, and the AI interprets body language cues (tail position, ear posture) to determine the pet's emotional state.
- Health & Training: Beyond fun translations, the app provides constructive veterinary and training advice based on the analysis.
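To make the three features above concrete, here is a minimal sketch of what a single analysis result could look like once it reaches the app. The `PetAnalysis` shape, the `Mood` values, and the `parsePetAnalysis` helper are all hypothetical names chosen for illustration, not PawSay's actual schema. It assumes the model is asked to reply with JSON.

```typescript
// Hypothetical result shape; field names are illustrative, not PawSay's real schema.
type Mood = "happy" | "anxious" | "playful" | "annoyed" | "unknown";

interface PetAnalysis {
  translation: string; // the playful "translated" message
  mood: Mood;          // emotional state inferred from sound or posture
  advice: string[];    // health or training suggestions, possibly empty
}

const MOODS: readonly Mood[] = ["happy", "anxious", "playful", "annoyed", "unknown"];

// Parse the model's raw text reply, falling back to safe defaults
// when the reply is not valid JSON or a field is missing or mistyped.
function parsePetAnalysis(raw: string): PetAnalysis {
  const fallback: PetAnalysis = { translation: "", mood: "unknown", advice: [] };
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return fallback;
  }
  if (typeof data !== "object" || data === null) return fallback;
  const obj = data as Record<string, unknown>;
  const mood = MOODS.find((m) => m === obj.mood) ?? "unknown";
  return {
    translation: typeof obj.translation === "string" ? obj.translation : "",
    mood,
    advice: Array.isArray(obj.advice)
      ? obj.advice.filter((a): a is string => typeof a === "string")
      : [],
  };
}
```

Validating the reply this way means a malformed answer from the model degrades to an "unknown" mood instead of crashing the screen that displays it.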
How we built it
This application was built entirely through a collaboration between a human creator (me) and AI agents.
- The Brain: We used Gemini models via the API for their speed and multimodal capabilities. They handle everything from interpreting a dog's bark to analyzing a cat's posture in a photo.
- The Body: The app is built with React and Vite for a responsive frontend, wrapped in Capacitor to function as a native mobile app on Android.
- The Nervous System: We utilized Firebase Cloud Functions as a secure backend to protect our API keys and orchestrate the AI analysis, so the keys never ship inside the app.
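The backend pattern described above can be sketched roughly as follows. This is not PawSay's actual Cloud Function: the model name, the prompt, and the `analyzePet`, `buildRequestBody`, and `HttpPost` names are assumptions made for illustration. The request shape follows the Gemini REST API as I understand it, so check it against the current reference before relying on it. The key point is that the API key is only ever read on the server and passed in as an argument.

```typescript
// Minimal fetch-like signature so the sketch does not depend on a specific runtime.
type HttpPost = (
  url: string,
  init: { method: "POST"; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface MediaInput {
  mimeType: string;   // e.g. "audio/webm" or "image/jpeg"
  base64Data: string; // media captured on the device, base64-encoded
}

// Assumed model name; the write-up does not say which Gemini model is used.
const MODEL = "gemini-1.5-flash";
const ENDPOINT =
  `https://generativelanguage.googleapis.com/v1beta/models/${MODEL}:generateContent`;

// Build the JSON body: one text part (the prompt) plus one inline media part.
function buildRequestBody(prompt: string, media: MediaInput): string {
  return JSON.stringify({
    contents: [
      {
        parts: [
          { text: prompt },
          { inline_data: { mime_type: media.mimeType, data: media.base64Data } },
        ],
      },
    ],
  });
}

// Intended to run on the server only: the key comes from server-side
// configuration and is sent to the model API, never to the client.
async function analyzePet(apiKey: string, media: MediaInput, post: HttpPost): Promise<string> {
  const res = await post(ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
    body: buildRequestBody("Describe what this pet is expressing.", media), // assumed prompt
  });
  if (!res.ok) throw new Error(`Model request failed with status ${res.status}`);
  const data = (await res.json()) as {
    candidates?: { content?: { parts?: { text?: string }[] } }[];
  };
  // Return the first text part of the first candidate, or an empty string.
  return data.candidates?.[0]?.content?.parts?.[0]?.text ?? "";
}
```

Inside a Cloud Function, `post` would typically be the runtime's `fetch`, and the React frontend would call the function's URL with only the recorded media, never the key.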
Challenges we ran into
Building an app with no coding background was a rollercoaster.
- Environment Hell: The biggest hurdle wasn't the logic; it was the environment. Setting up Android Studio, configuring the correct Java Development Kit (JDK), and managing environment variables like JAVA_HOME were daunting tasks that nearly derailed the project multiple times.
- Security: Understanding how to secure API keys (moving them from the frontend to a secure backend) was a steep learning curve, but essential for a real-world app.
- The "Black Box": Debugging is hard when you don't speak the language. I had to learn how to effectively prompt the AI to find errors in logs I couldn't read myself.
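For the JAVA_HOME trouble mentioned above, a small check like the following could have caught the most common mistakes early. This is a hypothetical helper, not something from the project: `diagnoseJavaHome` and its messages are illustrative. It only inspects the variable's value and assumes the usual convention that JAVA_HOME points at the JDK's root folder rather than its `bin` folder.

```typescript
type Env = Record<string, string | undefined>;

// Returns a list of human-readable problems; an empty list means these basic checks passed.
// It does not verify that a JDK is actually installed at the path.
function diagnoseJavaHome(env: Env): string[] {
  const problems: string[] = [];
  const javaHome = env["JAVA_HOME"]?.trim();
  if (!javaHome) {
    problems.push("JAVA_HOME is not set.");
    return problems;
  }
  // A common mistake: pointing at ".../bin" instead of the JDK root.
  if (/[\\\/]bin[\\\/]?$/.test(javaHome)) {
    problems.push("JAVA_HOME should point at the JDK root, not its bin folder.");
  }
  // Another common mistake: literal quotes stored as part of the value.
  if (/^".*"$/.test(javaHome)) {
    problems.push("JAVA_HOME should not be wrapped in quotes.");
  }
  return problems;
}
```

In a Node script, this could be called as `diagnoseJavaHome(process.env)` before starting an Android build, printing each returned message.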
Accomplishments that we're proud of
- Actually Shipping: Going from "zero code" to a deployed web app and a generated Android APK is my proudest achievement.
- Multimodal Integration: Successfully chaining audio recording and image capture into the Gemini API and getting coherent, funny, and useful responses back.
- Democratizing Creation: I proved to myself and my son that you don't need a computer science degree to build the future. You just need an idea and the resilience to iterate.
What we learned
I learned that prompt engineering is the new programming. My role shifted from "writer" to "architect," directing the AI on what to build rather than how to write the syntax. I gained a profound respect for the complexity of software development—deployment, state management, security—but also a thrill from realizing that these tools have lowered the barrier to entry significantly.
What's next for PawSay
We plan to expand the "Community" feature, allowing owners to share their pet's "translated" thoughts in a social feed. We also want to adopt future Gemini models for deeper, more nuanced behavioral analysis of complex training questions.
Built With
- android-studio
- capacitor
- firebase-cloud-functions
- firebase-firestore
- firebase-hosting
- google-gemini-api
- react
- tailwind-css
- typescript
- vite