Inspiration

In a world increasingly driven by automation and convenience, we wanted to build a tool that makes desktop interaction hands-free, secure, and intelligent. EchoDesk combines facial authentication with voice control to simplify everyday computer tasks and make computing more accessible and efficient for everyone.

What it does

EchoDesk is a web-based voice-controlled desktop assistant that authenticates users through facial recognition. Once authenticated, users can:

  • Launch apps or websites via voice commands
  • Search across platforms like Google, YouTube, and Bing
  • Simulate message sending through an interactive voice workflow
  • Review all interactions in a real-time activity feed
  • Stay authenticated across sessions using local storage
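
As a rough illustration of the search feature, a recognized platform name and query could be turned into a URL along these lines. This is a minimal sketch, not the project's actual code: the `SEARCH_ENDPOINTS` map and `buildSearchUrl` helper are hypothetical names, and the URL prefixes are the public search URLs of each site.

```typescript
// Hypothetical lookup from a spoken platform name to its search URL prefix.
const SEARCH_ENDPOINTS = new Map<string, string>([
  ["google", "https://www.google.com/search?q="],
  ["youtube", "https://www.youtube.com/results?search_query="],
  ["bing", "https://www.bing.com/search?q="],
]);

// Returns the search URL for a platform and query, or null when the
// platform is not one we know how to search.
function buildSearchUrl(platform: string, query: string): string | null {
  const prefix = SEARCH_ENDPOINTS.get(platform.trim().toLowerCase());
  if (prefix === undefined) {
    return null;
  }
  // encodeURIComponent keeps spaces and punctuation in the spoken query
  // from breaking the URL.
  return prefix + encodeURIComponent(query.trim());
}
```

In the browser, the returned URL could then be opened with `window.open(url, "_blank")`, while a `null` result could prompt the user to name a supported platform.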

How we built it

We used React.js as the core frontend framework. For face authentication, we integrated face-api.js, which performs face detection and matching on-device. Voice control uses annyang, a lightweight JavaScript speech recognition library built on the browser's Web Speech API. We also used react-webcam for live camera access and localStorage for session persistence. The entire app runs in the browser with no backend server of our own, which keeps it simple and privacy-friendly.
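
To give a sense of how face matching can work, here is a minimal sketch of comparing two face descriptors. face-api.js represents a face as a numeric descriptor vector, and two faces are usually treated as the same person when the Euclidean distance between their descriptors falls below a threshold. The helper names and the 0.6 threshold (a commonly used default) are assumptions for illustration, not necessarily what EchoDesk uses.

```typescript
// Euclidean distance between two descriptors of equal length.
function euclideanDistance(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) {
    throw new Error("Descriptors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}

// Hypothetical check: treat the candidate as the enrolled user when the
// descriptors are closer than the threshold (0.6 is an assumed value).
function isSameFace(
  enrolled: ArrayLike<number>,
  candidate: ArrayLike<number>,
  threshold: number = 0.6,
): boolean {
  return euclideanDistance(enrolled, candidate) < threshold;
}
```

A lower threshold makes matching stricter (fewer false accepts, more false rejects), which is one of the knobs that matters when lighting conditions vary.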

Challenges we ran into

  • Fine-tuning facial recognition to be responsive and consistent across lighting conditions
  • Handling natural language variations and unexpected voice inputs
  • Managing authentication state persistence across sessions while keeping it secure
  • Building a smooth user flow for interactive voice commands (like sending messages or choosing platforms)
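
One way to approach the session persistence challenge is to store only an expiry timestamp rather than any face data, and to discard anything that is expired or malformed. The sketch below assumes a 30-minute lifetime and a hypothetical storage key; the `KeyValueStore` interface mirrors the subset of the Web Storage API that `localStorage` provides, so the logic can be exercised outside the browser. A client-side flag like this is a convenience, not a strong security boundary.

```typescript
// Subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const SESSION_KEY = "echodesk.session"; // hypothetical key name
const SESSION_TTL_MS = 30 * 60 * 1000; // assumed 30-minute lifetime

// Record a session that expires SESSION_TTL_MS after `now` (ms since epoch).
function saveSession(store: KeyValueStore, now: number): void {
  store.setItem(SESSION_KEY, JSON.stringify({ expiresAt: now + SESSION_TTL_MS }));
}

// True only if a well-formed, unexpired session is stored; otherwise the
// stale or malformed entry is removed so the user must re-authenticate.
function hasValidSession(store: KeyValueStore, now: number): boolean {
  const raw = store.getItem(SESSION_KEY);
  if (raw === null) {
    return false;
  }
  try {
    const parsed = JSON.parse(raw) as { expiresAt?: unknown } | null;
    const expiresAt = parsed?.expiresAt;
    if (typeof expiresAt === "number" && now < expiresAt) {
      return true;
    }
  } catch {
    // Malformed JSON falls through to cleanup below.
  }
  store.removeItem(SESSION_KEY);
  return false;
}
```

In the app this could be called as `hasValidSession(window.localStorage, Date.now())` on load, falling back to face authentication when it returns false.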

Accomplishments that we're proud of

  • Built a fully functional desktop assistant with face authentication and real-time voice control, all in the browser
  • Created an engaging and accessible user experience with minimal setup
  • Achieved local-only processing to protect user data and enhance privacy

What we learned

  • The power of combining multiple web APIs like speech recognition, facial detection, and webcam access in one seamless app
  • The challenges of working with real-time user input (voice & face) and making the UI responsive to it
  • Best practices for creating accessible, privacy-focused web apps without relying on a backend

What's next for EchoDesk

  • Integrate with messaging APIs (e.g., WhatsApp, Telegram) for actual message delivery
  • Add liveness detection for stronger face authentication security
  • Package the app with Electron.js to create a native desktop experience
  • Add support for background listening and multitasking
  • Implement multi-language voice recognition and user profiles for personalized experiences
