SpeakTeX: From Voice to LaTeX
Inspiration
Writing mathematical equations for assignments, research papers, and documentation is a constant struggle for computer science students who have to work with LaTeX syntax. Typing complex expressions was a persistent source of frustration for us: it feels unnatural when simply saying the math aloud would be faster. Voice-to-text technology has become quite advanced, yet no tool existed specifically for converting spoken math into LaTeX code. Our goal was to make writing mathematical notation as natural as speaking, closing the gap between thinking about math and writing it down.
What it does
SpeakTeX is a web application that converts spoken math expressions into LaTeX code and displays the rendered result as soon as processing finishes. Users speak naturally into their microphone, and the system handles every step from audio recording to typesetting. It records high-quality audio, uploads it to AWS S3 for storage, uses AWS Transcribe to convert the speech into text, and then has Google Gemini 2.5 Flash generate the LaTeX code. The output is rendered with MathJax, and users can copy the code or save it to their history. SpeakTeX lets anyone produce precise mathematical typesetting without first learning LaTeX's complex syntax.
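The flow described above can be sketched as a chain of three stages. This is an illustrative sketch only, not the project's actual code: `run_pipeline`, `PipelineResult`, and the `fake_*` stand-ins are hypothetical names, and the real stages would call S3, Transcribe, and Gemini instead.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    transcript: str  # text produced by the speech-to-text stage
    latex: str       # LaTeX produced by the generation stage


def run_pipeline(
    audio: bytes,
    upload: Callable[[bytes], str],
    transcribe: Callable[[str], str],
    to_latex: Callable[[str], str],
) -> PipelineResult:
    """Run the stages in order: store the audio, transcribe it, generate LaTeX."""
    if not audio:
        raise ValueError("no audio recorded")
    audio_uri = upload(audio)           # stands in for the S3 upload
    transcript = transcribe(audio_uri)  # stands in for the AWS Transcribe job
    latex = to_latex(transcript)        # stands in for the Gemini request
    return PipelineResult(transcript=transcript, latex=latex)


# Hypothetical stand-in stages so the sketch runs without cloud credentials.
def fake_upload(audio: bytes) -> str:
    return f"s3://example-bucket/audio-{len(audio)}.webm"


def fake_transcribe(uri: str) -> str:
    return "x squared plus one"


def fake_to_latex(text: str) -> str:
    return "x^{2} + 1"
```

Calling `run_pipeline(b"...", fake_upload, fake_transcribe, fake_to_latex)` returns a `PipelineResult` whose `latex` field is what the frontend would hand to MathJax. Passing the stages in as functions keeps each service replaceable and easy to test in isolation.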
How we built it
We built SpeakTeX as a modern serverless system. The frontend uses React 19.1.1 and Vite to provide a responsive interface with real-time audio recording. The backend runs on AWS Lambda functions, with AWS S3 for audio file storage and DynamoDB for user history. AWS Transcribe handles speech-to-text, and the Google Gemini 2.5 Flash API generates the LaTeX code. MathJax renders the mathematical notation, and the browser's MediaRecorder API records WebM audio with the Opus codec for quality and compatibility.
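As one example of how these services connect, a Lambda function could hand a WebM recording stored in S3 to Transcribe roughly as follows. This is a sketch under assumptions rather than the project's code: the function name `build_transcription_request`, the `speaktex-` job-name prefix, the bucket and key, and the `en-US` default are all hypothetical. The dictionary keys are the parameters of Transcribe's StartTranscriptionJob operation.

```python
import uuid


def build_transcription_request(
    bucket: str, key: str, language_code: str = "en-US"
) -> dict:
    """Build keyword arguments for Transcribe's StartTranscriptionJob call."""
    if not key.endswith(".webm"):
        raise ValueError("this sketch only handles WebM recordings")
    return {
        # Job names must be unique per account, so a random suffix is used.
        "TranscriptionJobName": f"speaktex-{uuid.uuid4().hex}",
        "LanguageCode": language_code,
        "MediaFormat": "webm",  # matches the WebM/Opus files from MediaRecorder
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
    }


# With boto3 installed and AWS credentials configured, the request could be
# sent like this (left commented out so the sketch runs offline):
#   import boto3
#   request = build_transcription_request("example-audio-bucket", "clip.webm")
#   boto3.client("transcribe").start_transcription_job(**request)
```

Keeping the request construction separate from the network call makes it possible to check the parameters without touching AWS.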
Challenges we ran into
The main difficulty was turning ambiguous spoken math into exact LaTeX code. A phrase like "x squared" can be interpreted differently depending on context, so getting consistent output took extensive prompt engineering with Gemini. We also had to optimize the processing pipeline, because each request took 30 to 60 seconds to complete. We sped it up by running operations in parallel, caching results, and updating the user interface instantly. Finally, we needed careful error handling and retry logic to cope with edge cases in mathematical notation, keep the AWS services coordinated, and make audio playback work across different web browsers.
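Two of the techniques mentioned above, retrying failed requests and reusing stored results, can be sketched as small wrappers around the conversion step. The code below is illustrative and makes assumptions the write-up does not state: exponential backoff as the retry policy, an in-memory dictionary as the cache, and the hypothetical names `with_retry`, `with_cache`, and `flaky_to_latex`.

```python
import time
from typing import Callable, Dict


def with_retry(
    fn: Callable[[str], str],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Callable[[str], str]:
    """Wrap fn so that failures are retried with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    def wrapper(text: str) -> str:
        for attempt in range(attempts):
            try:
                return fn(text)
            except Exception:
                if attempt == attempts - 1:
                    raise  # out of attempts: surface the last error
                sleep(base_delay * 2 ** attempt)  # wait 0.5s, then 1s, ...
        raise RuntimeError("unreachable")

    return wrapper


def with_cache(fn: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap fn so that repeated phrases skip the slow conversion step."""
    cache: Dict[str, str] = {}

    def wrapper(text: str) -> str:
        key = " ".join(text.lower().split())  # normalize case and whitespace
        if key not in cache:
            cache[key] = fn(text)
        return cache[key]

    return wrapper


# Hypothetical stand-in for the LaTeX generation call: fails once, then works.
calls = {"count": 0}


def flaky_to_latex(text: str) -> str:
    calls["count"] += 1
    if calls["count"] == 1:
        raise TimeoutError("simulated transient failure")
    return "x^{2}"


# The no-op sleep keeps the demonstration fast; omit it to really wait.
convert = with_cache(with_retry(flaky_to_latex, sleep=lambda _seconds: None))
```

With these wrappers, the first `convert("x squared")` survives the simulated failure by retrying, and a later `convert("  X  squared ")` is answered from the cache without calling the underlying function again. A shared store would be needed for caching across separate Lambda invocations, since an in-memory dictionary only lives as long as one process.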
Accomplishments that we're proud of
SpeakTeX delivers a complete pipeline from spoken words to mathematical notation that accurately handles complex expressions, including integrals, matrices, and multi-line equations. It addresses a fundamental pain point for students, researchers, and academics by making LaTeX usable regardless of typing ability or knowledge of the syntax. The serverless architecture scales to handle simultaneous users, and we tuned our use of AWS services for cost efficiency and reliability. The interface is easy to use, gives immediate feedback, and includes accessibility features so that all users can work with the tool effectively.
What we learned
The project taught us about speech processing, audio recording, and how transcription systems work, including handling different audio formats and browser compatibility problems. We built up skills in AI prompt engineering, learning to design instructions and handle special cases so that the output is reliable and high quality. The project gave us practical experience with modern cloud infrastructure, AWS service integration, and serverless design. On the frontend, we used modern React techniques, including hooks, state management, and responsive design, to build an accessible user interface. We also developed effective error-handling strategies and learned performance optimization techniques for real-time systems.
What's next for SpeakTeX
We plan to add support for more advanced content, including chemical formulas and statistical notation, as well as multi-language input so that users can dictate math in various languages. We also want to add voice commands for editing LaTeX code and real-time collaborative editing for multiple users working on one document. Native mobile apps for iOS and Android would enable offline use and a better mobile experience. Integration with Overleaf and other academic writing platforms would streamline users' workflows. For institutional use, we plan enterprise features such as team management and shared libraries, along with more advanced AI capabilities that offer context-sensitive suggestions and automated equation verification.
