Onboarding, Login, Signup
Explore Characters and Group chats
Your chats and group chats
Talk with a characters or a group of characters
Chat with a characters or multiple characters
Surpriseeeee!
Create your own character with Dall-e 3-generated Avatar
Your profile here
Edit your characters or see characters you liked
Report - Feedback

Talk To Listen - AI-powered Talk App and Texting Platform

Discover a new world of interaction with 'Talk To Listen' on your mobile phone – where your voice brings characters to life! Engage in seamless conversations with a diverse universe of characters, each boasting their own unique personality and voice.

Check out website for download links and previews.

About Talk To Listen

What is Talk To Listen?

Talk To Listen is an AI-powered talk and texting multimodal mobile app that lets you engage in conversations with a diverse universe of characters. Each character has its own unique personality and voice, making every interaction a unique experience. You can talk to characters, listen to their stories, and text them to continue the conversation.

Features

Voice-Activated Conversations: Interact with characters using your voice, you don't have to type anything, or touch the screen, just talk and listen.
Diverse Characters: Engage with a wide range of characters, each with their own unique personality and voice.
Real-Time Interaction: Experience real-time responses to your voice commands.
Group Conversations: Talk with multiple characters at once.
Immersive Storytelling: Dive into a world of storytelling and adventure.
Customizable Characters: Personalize your characters with unique traits and characteristics.
Cross-Platform Compatibility: Access 'Talk To Listen' on any device, anytime, anywhere.
Multi-Language Support: Communicate with characters in multiple languages.
Safe and Secure: Enjoy a safe and secure environment for conversations, generating content that is safe for all ages.

Inspiration

One late night, far from home, family, and friends, I felt a profound sense of loneliness. I yearned for someone to talk to, someone who would listen and understand. That's when the idea struck me—why not create a space where anyone can feel heard? This thought led to the birth of "Talk To Listen," an AI-powered platform where you can talk and connect with characters who listen. Imagine an app that doesn't even require you to touch your phone; it's like having a friend, a group of friends, or your family right beside you, ready to hear your story. This concept is how "Talk To Listen" came to life, designed for those moments when you just need to talk.

How it can help people and the community

Talk To Listen can have a significant impact across different age groups and life situations.

Enhancing Social Connectivity: In a recent paper from Frontiers in Public Health, it's highlighted that the sense of loneliness significantly affects the quality of life among older adults. With projections from the US Census Bureau indicating a doubling of the population of Americans aged 65 and above by 2060, addressing and alleviating feelings of loneliness and depression becomes imperative. Our app, 'Talk to Listen', aims to combat this issue by providing a platform for friendly interactions and anthropomorphic greetings. Companionship serves as a crucial factor in mitigating loneliness across various age groups and life circumstances. Technological solutions, including voice assistants, companion robots, and conversational AI systems, show promise in tackling loneliness, especially among the elderly. By providing diverse forms of companionship and support, 'Talk to Listen' has the potential to enhance users' sense of connection and well-being. Through our app, Talk To Listen endeavor to make such interventions widely accessible, offering companionship to all, regardless of geographical location, financial status, or age.

Comprehensive Language/Culture Learning: According to a paper which was published in the Journal of Artificial Intelligence Research, discusses the use of unsupervised learning approaches for word acquisition in a virtual world setting. By incorporating user language behavior, domain knowledge, and conversation context, these methods aim to improve language learning outcomes. Additionally, a study from MDPI explores the integration of conversational AI in language learning among future educators, highlighting positive perceptions towards such technology. Almost 70% of language learners perceive chatbots as useful and easy for learning language purposes. Building upon these insights, 'Talk to Listen' harnesses the power of unsupervised learning and conversational AI to revolutionize personalized language learning. "Talk to Listen" also serves as a comprehensive platform for exploring diverse cultures, offering immersive experiences that delve into the rich tapestry of global traditions, customs, and societal norms. Through interactive lessons, cultural insights, and real-world scenarios, users gain a deeper understanding of the interconnectedness between language and culture, fostering appreciation and empathy for different ways of life.

Fostering Communication Skills: Utilizing insights from the paper in E3S Sciences, 'Talk to Listen' embodies the essence of creating an interactive chatbot for interview preparation. With a primary objective of improving users' interview skills through repeated practice and feedback, the app offers a dynamic platform for mock interviews. By allowing users to rehearse responses in a low-pressure environment, 'Talk to Listen' builds confidence while providing immediate feedback on answers or other aspects crucial for interview success. Moreover, the app's accessibility anytime, anywhere, eliminates the need for physical coaches or scheduling constraints, widening its reach and ensuring flexibility for users. Additionally, 'Talk to Listen' presents a cost-effective solution compared to traditional methods like hiring trainers or attending workshops, making it an invaluable tool for improving interview skills, especially at scale.

Storytelling and Creativity: Drawing inspiration from research findings that chatbots can effectively support character creation, as highlighted by the Association for Computing Machinery, our app, 'Talk to Listen,' endeavors to provide users with a dynamic and engaging platform for crafting fictional personas. Just as the research underscores the value of progressive manifestation in sparking imagination and allowing characters to emerge organically, 'Talk to Listen' adopts a similar approach, gradually revealing more about the characters over the course of conversational interactions. Through back-and-forth dialogues with virtual characters, users experience a level of engagement and enjoyment that surpasses static character creation tools. The interactive nature of the app not only aids users in developing more nuanced and well-rounded characters but also fosters a sense of connection and creativity. By leveraging conversational AI technology, 'Talk to Listen' aims to empower users in their creative endeavors, offering a compelling medium for character development that transcends traditional methods.

What is the most innovative part of the project?

The most innovative part of the project is the integration of the voice activity detection (VAD) feature, which allows users to interact with characters using their voice. This feature enhances the user experience by enabling real-time conversations with characters, making the interactions more engaging and immersive. The VAD feature also provides a hands-free experience, allowing users to talk and listen without having to type or touch the screen. This innovative feature sets 'Talk To Listen' apart from other text-based chat applications, making it a unique and interactive platform for engaging with characters. It is more immersive and engaging than traditional chat applications, you can also talk in a group chat with multiple characters at once.

Microsoft and Microsoft Azure

Talk To Listen is a multimodal mobile app that leverages Microsoft Azure services to deliver a smooth and captivating user interaction.

Azure OpenAI Service - GPT (text): GPT models powers the unique and diverse characters in Talk To Listen. It creates distinct characters with unique traits, voices, and backgrounds. With just a single prompt and some prompt optimization, it can embody a variety of characters and bring them to life. The characters are designed to be engaging, relatable, and interactive, providing users with a rich and immersive experience. In addition to creating characters, Azure OpenAI GPT models also helps users revise their character definitions, generate greetings and greeting messages for the characters, and provides prompts for image generation.
Azure Speech Services (voice): Azure Speech Services provides the text to speech functionality in Talk To Listen. It converts the text generated by the characters into speech, allowing users to listen to the characters' responses. Characters have unique voices that match their personalities, thanks to the availability of multiple voice options in Azure Speech Services. This feature enhances the user experience by making the interactions more engaging and immersive.
Azure OpenAI Service - Dall-e-3 (image): Dall-e-3 is used to generate images for the characters in Talk To Listen. Users can visualize the characters they are interacting with. The images are generated based on the character descriptions and traits provided by the users. This feature adds a visual element to the interactions, making them more engaging and interactive. The images follow Responsible AI practices, ensuring that they are safe and appropriate for all users. The image generation blocks any harmful content like hate speech, harassment, or dangerous information.

GitHub Copilot: GitHub Copilot is an AI-powered code completion tool that helps developers write code faster and more efficiently. This project might be impossible without the help of GitHub Copilot. It not only helped me complete the project but also made the code high-quality, clean, and optimized. It saved me a lot of time and effort, allowing me to focus on other aspects of the project.
Microsoft for Startups Founders Hub: Microsoft for Startups Founders Hub provides resources and many Azure credits to develop and deploy Talk To Listen. It helped me to get started with Azure services and provided the necessary support to build the project. Azure services are very easy to use and all use cases are well documented by Azure.
Azure Services are used in Talk To Listen: Azure Virtual Machine, Azure Application Gateway, Azure Load Balancer, Azure Virtual Network, Azure Network Security Group, Azure Database for PostgreSQL, Azure Blob Storage, Azure CDN (Content Delivery Network), Azure Text To Speech, Azure OpenAI Service.

Technical Architecture

The application's architecture is distributed, with several components interacting to provide the overall functionality. The front-end is built with Expo React Native, Redux, Firebase, and Axios, while the back-end uses FastAPI, SQLAlchemy, Azure, Docker, and other technologies. The data is stored in a PostgreSQL database, and the application uses GitHub Actions, Docker, and Azure services for continuous integration and deployment. It also integrates with third-party APIs for features like voice live streaming and text to speech. The content of the app is powered by Azure OpenAI GPT models, which creates diverse characters with unique traits, voices, and backgrounds.

Tech Stack

Architecture

1. Infrastructure

Tech: Azure Virtual Machine, Azure Application Gateway, Azure Load Balancer, Azure Virtual Network, Azure Network Security Group.

Azure Application Gateway: A web traffic load balancer that manage traffic to servers. It provides SSL termination, which offloads the encryption and decryption of SSL traffic from web servers, and health probes, which automatically remove unhealthy instances from the rotation. (This service is expensive, so it is likely to be removed in the future.)
Azure Load Balancer: Distributes incoming network traffic across multiple virtual machines to ensure high availability and fault tolerance.
Azure Virtual Network: Connects virtual machines to each other and to other Azure services securely. The virtual machined are only accessible through this internal load balancer.
Azure Network Security Group: Provides network security by filtering inbound and outbound traffic to the virtual machines.

2. Front-end

Tech: Expo React Native (JavaScript), Redux, Firebase, Axios, Expo Update.
GitHub

Expo React Native (JavaScript): A cross-platform framework for building mobile applications using JavaScript and React. It allows developers to write code once and deploy it on both iOS and Android platforms.
Expo Update: Service that allows over-the-air updates for Talk To Listen. The app can be updated immediately without going through the app store. Any bugs or issues can be fixed quickly and efficiently.
Redux: A state management library that helps manage the application's state in a predictable way.
Firebase: Provide secure authentication for users and store data in real-time.
Axios: A promise-based HTTP client that makes it easy to send asynchronous HTTP requests to the backend server.

3. Back-end

Tech: FastAPI(Python), SQLAlchemy, Firebase, Docker, Nginx, Gunicorn, Alembic, Pydantic, Pytest, RESTful APIs, Azure Virtual Machine.
GitHub
API Documentation

FastAPI( Python): Modern, fast (high-performance) Python framework for building APIs.
RESTful APIs: The backend services expose RESTful APIs that the frontend can consume to interact with the application.
Azure Virtual Machines: Multiple virtual machines are used to host the backend services. The virtual machines are duplicated to ensure high availability and fault tolerance. Talk To Listen uses Azure Virtual Machines to host the backend services, and always has more than one instance running to ensure that the application is always available.
SQLAlchemy: A Python SQL toolkit and Object-Relational Mapping (ORM) library that provides a set of high-level APIs for working with databases.
Firebase: Provides secure authentication with frontend and backend services. Only allowed users can access the application.
Docker: The backend services are containerized using Docker to ensure consistency and portability across different environments.
PyTest: All backend services are tested using PyTest to ensure that they work as expected.

4. Database

Tech: PostgreSQL, Azure Database for PostgreSQL, Azure Blob Storage, Azure CDN (Content Delivery Network).

PostgreSQL: Talk To Listen uses PostgreSQL as the primary database to store user data, character information, and other application data.
Azure Database for PostgreSQL: A fully managed database service that provides high availability, scalability, and security for PostgreSQL databases.
Azure Blob Storage: Used to store large amounts of unstructured data, such as images, audio files, and other media files.
Azure CDN (Content Delivery Network): The Azure CDN is used to cache static content, such as images and media files, to improve performance and reduce latency for users.

Design

The database schema is designed to store user data, character information, and other application data in a structured and efficient manner.
Entity-Relationship Diagram: The database schema is designed using an Entity-Relationship Diagram (ERD) to visualize the relationships between different entities and attributes.

UML Diagram: The database schema is designed using a Unified Modeling Language (UML) diagram to visualize the classes, attributes, and relationships between different entities.

5. Continuous Integration/Continuous Deployment

Tech: Git/GitHub, GitHub Actions, Docker, Azure Virtual Machine

Git/GitHub: The source code is stored in GitHub repositories for version control and collaboration.
GitHub Actions: Used for continuous integration and continuous deployment (CI/CD) to automate the build, test, and deployment processes for the backend services.
Docker: The backend services are containerized using Docker to ensure consistency and portability across different environments.
Azure Virtual Machine: The backend services are deployed on Azure Virtual Machines using Docker containers.

6. Security

User's data and privacy are of utmost importance. The application uses various security measures to ensure that user data is protected and secure.

SSL/TLS: The backend services use SSL/TLS to encrypt data in transit and ensure secure communication between the frontend and backend.
Firebase Authentication: Provides secure authentication for users and ensures that only authorized users can access the application.
Azure Network Security Group: Filters inbound and outbound traffic to the virtual machines to provide network security.
Delete User Data: Users can delete their account at any time, and all data is deleted from the database and storage.

7. Third-party APIs

Voice Live Streaming: Deepgram
Text To Speech: EleventLabs

8. Responsible AI

Talk To Listen made sure to incorporate Responsible AI practices in our project for the Microsoft Generative AI Hackathon. The hackathon rules and judging criteria focused a lot on developing AI systems in an ethical and responsible way. This meant I designed my project to make sure the outputs from GPT, and Dall-e 3 were safe and responsible. I used filters to block any harmful content like hate speech, harassment, or dangerous information. The image generation also had these safeguards in place. In the app, user can report any inappropriate content, users, or characters. This feedback is used to improve the filters and make the app safer for everyone. Overall, the project platform was safe and transparent, clearly informing users that it was using AI to generate content. Talk To Listen even made it suitable for use by kids and people of all ages. The goal was to create an AI-powered application that followed best practices for Responsible AI.

9. Future Enhancements

Voice cloning Support: I'm testing an open-source voice cloning model to allow users to clone their voice and use it in the app. This will make the conversations more personal and engaging.
Lock Screen Support: I'm working on adding lock screen support, which will allow users to interact with the app even when their screen is locked. This feature will enhance the app's usability on the go, save battery life, and provide convenience for users who use earphones.
Tuning GPT: I'm working on tuning the GPT model to let the characters talk more naturally and life-like.

Built With

azure
azure-ai
dall-e-3
expo.io
firebase
javascript
python
react-native
restful-api

Updates

Hieu Nguyen started this project — May 03, 2024 01:57 PM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.