"TrackSpeak: Empowering Rail Madad with Eyes and Ears - A Revolutionary Audio-Visual Complaint Reporting System for Indian Railways powered by Gemini Pro"

🚀 Before you dive in, check out our hands-on video demo on YouTube to see how easy and effective the app is.

Struggles of Arjun: The Inspiration Behind Our Application

On a typical hot day in Mumbai, Arjun, a young software engineer, was all set for his journey back to Pune. As he settled into his seat on the train, he quickly realized the air conditioning was malfunctioning, making the compartment stifling. Eager to get this fixed, Arjun pulled out his phone and opened the RailMadad app, hoping for a swift resolution.

Navigating the app, he tried to submit his complaint but found the process unexpectedly cumbersome. The interface didn't support Marathi (the regional language of Mumbai), his preferred language, forcing him to use English, in which he was less fluent. As he struggled with the language and searched through complex menus to categorize his complaint correctly, Arjun grew increasingly frustrated. Each step seemed to require more details, and the app kept asking for information that Arjun thought was unnecessary just to report a faulty air conditioner.

Despite his best efforts, by the time he finally submitted the complaint, nearly half an hour had passed, and Arjun was left feeling hotter and more bothered than when he started. This experience left him questioning the effectiveness of a system meant to aid passengers when they needed it most.

Introduction

India boasts the fourth-largest railway network, operating over 22,593 trains daily and carrying 23 million passengers, so customer feedback is essential for continuous improvement and effective grievance resolution. Recognizing this, we have developed an innovative solution for the Centre for Railway Information Systems (CRIS) to revamp how railway complaints are processed. This article describes a project that handles customer complaints submitted via video and voice, improving the efficiency and effectiveness of the complaint management system across India's diverse linguistic landscape. The solution uses Google Gemini Pro 1.5, an advanced artificial intelligence (AI) model, to transform voice- and video-based complaints into actionable insights and categorize them for swift resolution.

The challenges with the existing 'RailMadad' application

Filing complaints through the RailMadad web application can be a daunting task for many passengers. The process, often criticized for being cumbersome and non-intuitive, can deter even the most tech-savvy users. This difficulty is compounded by the platform's inability to cater effectively to the linguistic diversity of India, potentially leading to miscommunication and delays in processing grievances. Moreover, the need to navigate multiple steps and enter extensive details makes the process time-consuming, discouraging passengers from reporting issues. With the railway system serving millions daily, addressing complaints quickly and accurately is increasingly critical. A solution needs to be robust, handle multiple languages, and integrate seamlessly with existing systems to offer a streamlined user experience.

The current CRIS system faces three significant challenges. First, the portal requires users to type their complaints, a substantial barrier for those who are inexperienced or uncomfortable with technology. Second, the portal primarily supports English, which can alienate speakers of other Indian languages and make it difficult for them to file complaints effectively. Third, users must manually select the appropriate category and subcategory for each complaint, which can be confusing and time-consuming, particularly for those unfamiliar with digital interfaces.

The challenges with the RailMadad web application extend beyond the customer experience to the administrative side of operations. For the staff managing the complaint system, the process of addressing each complaint individually is highly time-consuming. Administrators are required to manually review, verify, and categorize each submission before routing it to the appropriate team for resolution. This not only slows down the response time but also increases the likelihood of errors, making it a cumbersome task to ensure that each complaint is handled efficiently and accurately.

Our Innovative Solution: A Unified Voice and Video Complaint Portal

In our solution, we prioritize a seamless and intuitive user experience. Upon accessing our website, users begin by scanning their tickets. Leveraging Google Gemini Pro 1.5, our system automatically detects and fills in the Passenger Name Record (PNR) number from the ticket. This PNR contains crucial information such as passenger details, journey specifics, and seat number, streamlining the complaint process.

Voice-Based Complaint Submission [Usage of Gemini Pro 1.5]

Should users opt to submit their complaint via voice, they can simply click the microphone button and speak in their native language. For instance, in our demonstration, we used Hindi spoken in a typical railway station environment replete with background noise. Google Gemini Pro 1.5 processes this audio to extract a transcript. The system then analyses the transcript to categorize the complaint into predefined categories and sub-categories. Once categorized, the complaint form is auto-filled, allowing the passenger to review and submit it with ease. The submitted complaint is then routed to the backend administrative panel, where it is displayed with priority levels. The AI extracts relevant entities, such as specific location names from the complaint, linking directly to the railways database for swift administrative action, such as alerting the Railway Police Force in the relevant division.

Video-Based Complaint Submission [Usage of Gemini Pro 1.5]

For video complaints, the process mirrors that of voice submissions. Users record and submit their complaints as video files. Google Gemini Pro 1.5 performs an analysis at the video frame level, cross-referencing the visual content with the audio transcript to ensure consistency and accuracy. If the alignment between what is spoken and what is shown in the video exceeds the threshold of 70%, the system confirms the validity of the complaint and suggests appropriate actions for administrators to take, such as forwarding the issue to the correct department or authority for immediate action.

Both methods are designed to ensure that complaints are not only easy to submit but are also accurately categorized and swiftly acted upon, enhancing the efficiency of complaint resolution and overall passenger satisfaction.

Process Flow diagram - Highlighting the usage of GEMINI PRO 1.5

The process flow diagrams for both the passenger side and the admin side (complaint management system) have been posted in the Project Media section; please refer to Fig 5 and Fig 6.

Behind the Scenes: How we built it with Gemini Pro 1.5

Automatic Capture of PNR number from Image (Railway Ticket):

We used the latest version of Gemini Pro 1.5 to capture the PNR number from the railway ticket. The code snippet for this step is provided here:


The user can capture or upload a photo of the railway ticket. Gemini then extracts the PNR number from the image and fills it in automatically, keeping the process simple and seamless.
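As a rough illustration of this step (not the project's actual code), the sketch below covers only the post-processing: pulling a 10-digit PNR out of whatever text the model returns for the ticket image. The call to Gemini itself is omitted, and the prompt wording, `PNR_PROMPT`, and `extract_pnr` are hypothetical names introduced here.

```python
import re
from typing import Optional

# Indian Railways PNR numbers are 10 digits long.
PNR_PATTERN = re.compile(r"(?<!\d)\d{10}(?!\d)")

# Hypothetical prompt; the wording is an assumption, not the project's prompt.
PNR_PROMPT = "Read this railway ticket image and reply with only the 10-digit PNR number."


def extract_pnr(model_reply: str) -> Optional[str]:
    """Return the first standalone 10-digit number in the reply, or None."""
    match = PNR_PATTERN.search(model_reply)
    if match is None:
        # Tickets sometimes print the PNR as "245-1234567" or "245 1234567",
        # so retry after dropping single separators that sit between digits.
        cleaned = re.sub(r"(?<=\d)[\s-](?=\d)", "", model_reply)
        match = PNR_PATTERN.search(cleaned)
    return match.group(0) if match else None
```

Returning `None` when no valid PNR is found lets the form fall back to manual entry instead of auto-filling a wrong number, which is relevant for the noisy or blurry tickets mentioned under the challenges.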

Transcribing the audio in Indian Native Language and Intent / Entity recognition:

We utilized the latest version of Gemini Pro 1.5 to transcribe a passenger's speech from Hindi, an Indian native language. After obtaining the transcript, we assigned category and sub-category classes to the text. We then used Gemini to classify the content according to these classes. Additionally, we extracted important entities such as names and station names from the complaint. The code snippet for this process is provided below:
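To make the classification step concrete, here is a minimal sketch (an illustration only, not the project's code) of how a transcript could be turned into a prompt and how the model's reply could be validated before the form is auto-filled. The taxonomy in `CATEGORIES`, the JSON reply format, and the names `build_prompt`, `parse_reply`, and `Complaint` are all assumptions; the real category list and prompt are not given in this article, and the Gemini call itself is left out.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List

# Placeholder taxonomy for illustration; the real class list is an unknown.
CATEGORIES: Dict[str, List[str]] = {
    "Electrical Equipment": ["Air Conditioner", "Lights", "Fans"],
    "Coach Cleanliness": ["Toilet", "Washbasin", "Coach Interior"],
    "Security": ["Theft", "Harassment"],
}


@dataclass
class Complaint:
    category: str
    sub_category: str
    entities: List[str] = field(default_factory=list)


def build_prompt(transcript: str) -> str:
    """Ask the model for a JSON object restricted to the known classes."""
    return (
        "Classify the railway complaint below. Reply with JSON only, using the "
        'keys "category", "sub_category" and "entities" (names, station names).\n'
        f"Allowed classes: {json.dumps(CATEGORIES)}\n"
        f"Complaint: {transcript}"
    )


def parse_reply(reply: str) -> Complaint:
    """Check the model's JSON reply against the taxonomy before using it."""
    # The reply may carry extra text around the JSON; keep the outermost {...}.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object in reply")
    data = json.loads(reply[start:end + 1])
    category, sub = data.get("category"), data.get("sub_category")
    if sub not in CATEGORIES.get(category, []):
        raise ValueError(f"unknown class: {category!r} / {sub!r}")
    return Complaint(category, sub, list(data.get("entities", [])))
```

Rejecting replies that fall outside the allowed classes keeps a misclassified complaint from being routed silently; in that case the passenger could be asked to pick the category by hand.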



Multimodal analysis (Video + Text):

For video submissions, we employ Gemini Pro 1.5 to ensure that the video content aligns with the accompanying transcription. This process verifies the consistency between what the passenger has spoken and the visual content of the video. By doing so, we streamline the review process, enabling administrators to quickly and accurately understand the nature of the complaint. The code snippet for this process is provided below:
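The decision that follows the consistency check can be sketched as below. This is an illustration under stated assumptions, not the project's code: it assumes the model's analysis has already been reduced to a single alignment score between 0 and 1, and it applies the 70% threshold described in the video submission section. The names `check_alignment` and `Verdict`, and the choice to send low-scoring videos to manual review, are hypothetical.

```python
from dataclasses import dataclass

# 70% threshold from the video-based submission flow described earlier.
ALIGNMENT_THRESHOLD = 0.70


@dataclass
class Verdict:
    valid: bool
    note: str


def check_alignment(score: float) -> Verdict:
    """Decide how to treat a video complaint given its alignment score.

    score is a fraction in [0, 1] describing how well the visual content
    matches the spoken transcript (how it is computed is not shown here).
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score > ALIGNMENT_THRESHOLD:
        return Verdict(True, "Video matches the spoken complaint; suggest actions to the admin.")
    # Assumption: complaints at or below the threshold are not discarded.
    return Verdict(False, "Video and speech do not align; flag for manual review.")
```

For example, `check_alignment(0.85)` yields a valid verdict, while a score of exactly 0.70 does not, because the text says the alignment must exceed the threshold.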

Impact of Our Application on Complaint Management

Deploying our voice-driven complaint management system will reduce the manual labor needed to transcribe and categorize complaints. The automated, AI-driven system will shorten response times, with urgent matters escalated immediately. In addition, by delivering analytics on complaint trends and customer feedback, the system helps the railways continuously improve service quality and passenger satisfaction.

Features of the Application:

The innovative application we've developed offers significant benefits for both passengers and administrative staff, streamlining the complaint management process with its advanced features.

For Passengers

The application simplifies the process of submitting complaints through voice or video. With its multilingual capabilities, it caters to a broad audience in India, allowing users to submit grievances in their preferred language. This inclusivity not only enhances accessibility but also ensures that complaints are expressed clearly and directly. Moreover, the application automatically categorizes each complaint into appropriate buckets. This automation reduces the effort required by passengers to navigate complex categorization systems, making it easier and more efficient for them to report issues.

On the administrative side

The application equips backend staff with powerful AI tools that analyze the intent and key details (entities) within each complaint. This feature enables admins to quickly grasp the core issues, facilitating a faster and more accurate response. Additionally, the AI system suggests relevant actions based on the nature of the complaint, providing administrative staff with actionable insights that can guide their decision-making process. This not only speeds up the resolution of complaints but also enhances the overall effectiveness of the complaint management system, making it a valuable tool for both users and administrators alike.

Enriching Lives: The Community Impact of Our Application

The application promotes inclusivity by allowing complaints to be expressed in multiple Indian languages, thereby removing linguistic barriers and making the complaint process accessible to a wider segment of the population. It accommodates individuals who are inexperienced or have low literacy levels, enabling them to lodge complaints without the need to write, easing the process. This app empowers passengers by providing them with a straightforward and accessible platform to voice their grievances, potentially leading to improved service and accountability from the railways. By streamlining the complaint process, the application reduces bureaucratic overhead and the potential frustrations associated with lodging complaints, contributing to a more positive public transport experience.

Challenges Faced: Enhancing Opportunities for Google Gemini Pro

We would like to share some challenges we encountered during the testing of our application, specifically in the functionality involving Google Gemini Pro. One significant hurdle was extracting the PNR number from railway tickets. We experimented with various ticket images, and noted that Gemini struggled with images that were noisy or blurry, failing to accurately capture the PNR number. Enhancing Gemini's ability to denoise images for clearer information extraction could significantly improve its performance.

Additionally, India's linguistic diversity presents another layer of complexity. With 22 major languages spoken across different regions, Gemini's current capabilities fall short in accurately recognizing and transcribing several prominent languages like Kannada, Bengali, Tamil, and Telugu. While Gemini excels with Indian-accented English and Hindi, particularly in noisy environments, expanding its training to include a broader range of Indian languages would greatly benefit the Indian community and developers looking to integrate this technology into various applications.

Looking Ahead: The Future of Gemini AI in Public Service

As we continue to refine our technology, our aim is not just to simplify the complaint lodging process but to enhance the overall experience of railway passengers in India. This project stands as a testament to the potential of using Gemini Pro 1.5 to transform public service delivery, making it more accessible, efficient, and responsive to the needs of a diverse populace. By harnessing AI (Gemini Pro) for good, we demonstrate how advanced technology can be used to improve public services and positively impact the daily lives of millions. This initiative underscores our commitment to leveraging AI in ways that are ethical and beneficial, ensuring that technological advancements contribute to the greater good of society.
