translate4good
agents

Inspiration

The United Nations Convention to Combat Desertification (UNCCD) faces ongoing challenges in translating technical documents across six official UN languages. Standard translation models struggle with domain-specific terminology and nuanced concepts. UNCCD's extensive archive of professionally translated documents provides a unique opportunity to address this challenge. We envisioned an AI-powered solution that would leverage this data to assist translators in producing accurate, contextually relevant translations, easing their workload and promoting consistency across UN documents. Our goal was to build a tool adaptable to various UN agencies and capable of learning and improving over time.

What It Does

Our AI-powered translation assistant aids UN translators by offering accurate translations tailored to UNCCD’s needs. It fine-tunes translations based on feedback, recognizes domain-specific terminology, and provides an interactive workspace where translators can edit, review, and enhance translations. Key features include:

Real-Time Translation Preview: Translators can preview translations in real time.
Edit and Improve: Users can adjust translations and apply AI suggestions to improve accuracy, coherence, and terminology usage.
Glossary Integration: UN glossary terms are highlighted in the text to keep translations consistent with domain-specific terminology.
Fetch.ai Quality Agents: Agents check tone and consistency against professional UN standards.
Export Functionality: Translators can download the translated document.

This tool is designed to streamline the translation workflow, reduce redundancy, and improve accuracy by learning from translators' corrections over time.
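As a rough illustration of how translator corrections could be turned into retraining data, here is a minimal sketch. The `Correction` record and `build_training_pairs` helper are hypothetical names introduced for this example, not taken from the project's code; the sketch assumes each logged edit stores the source text, the model output, and the translator's final version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    """One logged edit (hypothetical schema): source, model output, final text."""
    source: str
    machine: str
    edited: str


def build_training_pairs(corrections):
    """Return unique (source, edited) pairs where the translator changed the output."""
    seen = set()
    pairs = []
    for c in corrections:
        edited = c.edited.strip()
        # Skip segments the translator accepted unchanged or left empty.
        if not edited or edited == c.machine.strip():
            continue
        pair = (c.source.strip(), edited)
        # Drop exact duplicates so repeated edits are not over-weighted.
        if pair not in seen:
            seen.add(pair)
            pairs.append(pair)
    return pairs


log = [
    Correction("Land degradation", "Degradación de tierra", "Degradación de las tierras"),
    Correction("Drought", "Sequía", "Sequía"),  # accepted as-is, so skipped
    Correction("Land degradation", "Degradación de tierra", "Degradación de las tierras"),
]
pairs = build_training_pairs(log)
print(pairs)
```

Only segments the translator actually changed are kept, on the assumption that accepted output carries no new signal for retraining.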

How We Built It

We used the MarianMT model, fine-tuned on the UN Parallel Corpus, for English-Spanish translation, achieving a mean semantic similarity of 92% on test data. Translators’ edits feed back into ongoing retraining, so the model keeps adapting to UN-specific terminology. We integrated Fetch.ai agents for quality control; they check tone, accuracy, and alignment with UN standards.
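The write-up does not say how semantic similarity was computed. One common approach, assumed here, is to embed each model translation and its reference translation as vectors and average their cosine similarities. The sketch below uses toy 3-dimensional vectors in place of real sentence embeddings; the function names are illustrative only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as having no similarity
    return dot / (norm_u * norm_v)


def mean_semantic_similarity(candidate_vecs, reference_vecs):
    """Average cosine similarity over paired candidate/reference embeddings."""
    if not candidate_vecs or len(candidate_vecs) != len(reference_vecs):
        raise ValueError("need two non-empty lists of equal length")
    scores = [cosine_similarity(c, r) for c, r in zip(candidate_vecs, reference_vecs)]
    return sum(scores) / len(scores)


# Toy vectors standing in for real sentence embeddings (assumption).
candidates = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
references = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
score = mean_semantic_similarity(candidates, references)
print(score)  # 0.5: one identical pair, one orthogonal pair
```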

The front-end UI is interactive, providing an intuitive experience for translators to edit, view AI suggestions, and highlight key terminology. Currently, we are working on embedding a UN glossary directly into the interface for easy reference.

UN Translation System Agents

A specialized system of agents that work in sequence to process UN document translations accurately:

1. Document Classifier Agent

Purpose: First-pass analysis of documents to determine type (technical, legal, policy)
Input: Raw text in source language
Output: Document classification with metadata

2. Terminology Extractor Agent

Purpose: Identifies and extracts UN-specific terminology and acronyms
Input: Classified document
Output: Technical terms, phrases, and acronyms for preservation

3. Translation Agent

Purpose: Performs intelligent translation while maintaining UN standards
Input: Text with marked terminology
Output: Translated content with preserved terms

4. Quality Validator Agent

Purpose: Ensures translation meets UN quality standards
Input: Translated document
Output: Quality assessment and improvement suggestions
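
The four-stage sequence above can be sketched as plain Python functions. This is a simplified illustration, not the project's Fetch.ai agent code: it does not use the Fetch.ai agent framework, the keyword rules and acronym regex are stand-ins for real models, and `toy_translate` is a hypothetical placeholder for the fine-tuned translation model.

```python
import re


def classify(text):
    """Document Classifier: crude keyword rules standing in for a real model."""
    lowered = text.lower()
    if "article" in lowered or "pursuant" in lowered:
        doc_type = "legal"
    elif "hectare" in lowered or "soil" in lowered:
        doc_type = "technical"
    else:
        doc_type = "policy"
    return {"text": text, "type": doc_type}


def extract_terms(doc):
    """Terminology Extractor: collect all-caps acronyms, longest first."""
    found = set(re.findall(r"\b[A-Z]{2,}\b", doc["text"]))
    # Longest first so a short acronym never splits a longer one when masking.
    return {**doc, "terms": sorted(found, key=lambda t: (-len(t), t))}


def translate(doc, translate_fn):
    """Translation Agent: mask terms, translate, then restore them."""
    text = doc["text"]
    placeholders = {}
    for i, term in enumerate(doc["terms"]):
        token = f"__TERM{i}__"
        placeholders[token] = term
        text = text.replace(term, token)
    translated = translate_fn(text)
    for token, term in placeholders.items():
        translated = translated.replace(token, term)
    return {**doc, "translation": translated}


def validate(doc):
    """Quality Validator: check that every extracted term survived translation."""
    missing = [t for t in doc["terms"] if t not in doc["translation"]]
    return {**doc, "passed": not missing, "missing_terms": missing}


def run_pipeline(text, translate_fn):
    """Run the four stages in order, passing one dict between them."""
    return validate(translate(extract_terms(classify(text)), translate_fn))


def toy_translate(text):
    """Hypothetical stand-in for the model: a fixed phrase substitution."""
    return text.replace("The ", "La ").replace("adopted the report", "aprobó el informe")


result = run_pipeline("The COP adopted the report.", toy_translate)
print(result["type"], result["terms"], result["translation"], result["passed"])
```

Each stage returns a new dict rather than mutating its input, so the output of any stage can be inspected or logged on its own.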

Challenges We Ran Into

Throughout development, we faced significant runtime, compute resource, and memory management challenges, especially with large translation samples. Managing adaptive retraining while maintaining low latency proved complex, requiring optimization at various stages of the model and UI development.

Accomplishments That We Are Proud Of

The project successfully met all quality benchmarks, and in blind tests with native Spanish speakers and ChatGPT, our model outperformed general translation solutions in terms of clarity and coherence. We are also proud of the adaptive learning functionality that enables the system to evolve from translator feedback, ensuring an increasingly accurate and efficient workflow.

What We Learned

Developing this solution deepened our understanding of adaptive machine translation in specialized domains and of how user feedback loops can improve model performance. We gained experience optimizing resource usage for large-scale translation tasks and applying evaluation metrics such as Translation Edit Rate (TER), an edit-distance measure, to assess translation quality in real-world scenarios. The project also taught us the importance of user-focused design in building practical AI tools that assist professionals effectively.
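
For readers unfamiliar with TER, the sketch below computes a simplified version: the word-level edit distance between a model translation and a reference, divided by the reference length. Full TER also counts block shifts of word sequences as single edits; this sketch omits shifts, so treat it as an approximation rather than the metric the team actually ran. The function names are illustrative.

```python
def word_edit_distance(hypothesis, reference):
    """Minimum word insertions, deletions, and substitutions between two strings."""
    hyp, ref = hypothesis.split(), reference.split()
    prev = list(range(len(ref) + 1))  # distances from an empty hypothesis
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # drop a hypothesis word
                            curr[j - 1] + 1,      # add a missing reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def simplified_ter(hypothesis, reference):
    """Edits per reference word (lower is better); no shift operations."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(hypothesis, reference) / ref_len


score = simplified_ter("la sequía afecta a regiones",
                       "la sequía afecta a muchas regiones")
print(round(score, 3))  # 0.167: one missing word over six reference words
```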

GitHub Repository: UN Translation System

Built With

fastapi
huggingface
javascript
python
react
transformers
typescript

Submitted to

Hack for Social Impact 2024
- Winner Fetch.ai Technology Prizes: Environmental Change Prize

Created by

Rajashekar Vennavelli
DS & NLP @ UC Berkeley, CS @ Santa Clara University
Worked on the frontend, backend, and Fetch.ai agents.

Alma Gashi
Worked on ML training.

Alice Zhu

Updates

Rajashekar Vennavelli started this project — Nov 10, 2024 07:09 PM EST
