Inspiration
How many times have you said this: "I talked about it yesterday, but I can't remember exactly what I said". So much important information flies around every day, information that is not recorded online. Hence, we created Parrot.AI. An end-to-end mobile application for iOS and Android devices built to remember your everyday conversations and act as your mobile secretary throughout your busy life.
What it does
Parrot.ai is a personal secretary that discreetly listens to everything captured by your mobile device's microphone 24/7. All this data is stored securely on our AWS infrastructure, where it is transcribed, undergoes personal information redaction, and prepared for AI insights. It has 3 key features:
- 24/7 audio recording on iOS and Android devices and data ingestion to AWS backend.
- Audio and text-based chat feature to retrieve any conversation, from anyone and anytime.
- Secure data processing with automated personal information redaction and privacy protection.
Additional Features
- Implements Retrieval Augmented Generation (RAG) through vector embeddings to provide accurate AI-powered insights and summaries about your past conversation.
- Secure storage of transcribed conversations on AWS S3 & AWS Relational Database Service (RDS)
- Leverages AWS Transcribe for accurate speech-to-text conversion and PII redaction.
- Google OAuth and Apple ID authentication for quick and seamless authentication process.
- Designed to boost productivity and aid in conversational learning in various contexts (personal, educational, professional).
How we built it
Parrot.AI is built on AWS cloud provider, making it scalable and robust with 4 key components:
Frontend: An UI is built on React Native with the help of Expo for fast prototyping for both iOS and Android devices. Its UI is flexible, working on any screen size and device type, with React Native stylesheets and Expo microphone capabilities. It integrates with pre-existing authentication providers like Google OAuth and Apple ID.
Data Ingestion Pipeline: Connected with a Flask API, audio data is sent from the frontend mobile interface to an AWS S3 bucket. In this bucket it is queued to AWS Transcribe for speech-to-text conversion, making it highly efficient.
Data Processing Pipeline: Transcribed text data is then converted to high-dimension vector embeddings with Amazon Titan V2 on AWS Bedrock. This embedding is stored on AWS RDS and available for similarity search for increased accuracy and text-based context.
Data Interaction Pipeline: The user can now ask parrot questions. Input questions are embedded using the same embedding as before. The similarity search between the question and the RDS database is run to find the most relevant text. The question and the text are then fed into a AWS bedrock nova AI model to provide an answer to the user.
Challenges we ran into
A key challenge we had to overcome was having to store the audio data in a way that could be easily reference. We overcame this challenge by converting the audio file into text and embedding that text into a number representation.
Another challenge we went into was to allow the continuous streaming of data from the app to the AWS backend. We initially tried Kinesis but realized we could use a simpler approach by uploading files directly to amazon S3 buckets.
Lastly, we had to figure out how to provide the AI with context relevant to the questions asked. We have large amounts of data that we would need to sieve through to provide the context. We acheived this by using an extension on RDS database that enabled similarity search with the embedded data.
Accomplishments that we're proud of
Parrot.ai is revolutionary in 2 ways:
- Unparalleled Accessibility and Versatility: Parrot.ai is designed for everyone, seamlessly integrating into the fabric of daily life. Its applications are vast, serving as an indispensable tool in personal, academic, and professional contexts alike.
- Optimized Performance Powered by AWS: By strategically leveraging a suite of advanced AWS tools, Parrot.ai achieves remarkable efficiency. This robust architecture ensures minimal latency and lag, providing users with a consistently fast and reliable experience as they capture, process, and retrieve their conversational data.
What we learned
Our core technology is built upon the effective utilization of AWS Transcribe for converting speech to text, EC2 instances for handling our workload, and implementing advanced vector embedding techniques with Titan and RDS to power our conversational search capabilities.
What's next for Parrot.AI
Parrot.AI was developed with speed of development and feature richness in mind. Currently, the biggest obstacle is the cost of Parrot.ai. in the future, by utilizing AWS lambda and more optimized RDS database, we can drastically reduce the cost of Parrot.ai, allowing it to be a profitable idea.
Built With
- amazon-rds-relational-database-service
- amazon-transcribe
- bedrock
- ec2
- expo.io
- flask
- node.js
- pgvector
- python
- react-native
- s3
- tailwind
Log in or sign up for Devpost to join the conversation.