Inspiration
The median deaf high school graduate in America reads at a 4th grade level. One in five reads below 2nd grade. In developing countries, deaf illiteracy exceeds 75%.
This isn't about intelligence. Sign language is a complete, complex language—but it has no written form. For 72 million deaf individuals worldwide, written text is essentially a foreign language they've never heard spoken.
Yet the entire accessibility industry assumes deaf people can read captions.
We built SignBridge because captions fail the people who need accessibility most. News, politics, healthcare, education—the content deaf users say they MOST need—is filled with jargon and complexity that breaks both auto-captions AND reading comprehension. A 4th-grade reading level cannot parse Supreme Court decisions, pandemic health guidance, or breaking news about natural disasters.
The problem isn't that deaf people can't read. It's that we're forcing them to.
What it does
SignBridge is a text-to-sign-language platform that converts any text into realistic sign language videos using AI-powered 3D avatars.
Core Capabilities
Text → Sign Language Translation
- Input any text (news scripts, captions, transcripts)
- Output professional sign language video
- Supports Indian Sign Language (ISL) with architecture for ASL, BSL, and 300+ sign languages
Real-Time Avatar Rendering
- SMPL-X body model with anatomically accurate hand articulation
- Physics-based motion (Hermite splines, anticipatory movement)
- Natural signing flow—not robotic interpolation
Production-Ready Video Generation
- Broadcast-quality output (720p+, 30fps)
- TikTok-style synchronized captions
- Automated pipeline: text in → stacked video out
Web Interface
- Live demo mode with simulated news broadcast
- Text input mode for any content
- Adjustable signing speed
Demo Features
- 4,000+ motion sequences from WLASL sign language dataset
- 150+ word vocabulary with automatic fingerspelling fallback
- 3 motion engines: Natural, Professional, and Anticipatory
- End-to-end pipeline: Text → NLP → Gloss mapping → Motion loading → GPU rendering → Video export
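The stage chain above can be sketched as a sequence of function calls; the bodies below are illustrative stubs, not the actual SignBridge implementation (the real stages call spaCy, the SMPL-X motion loader, and PyRender).

```python
# Stubbed shape of the Text → NLP → Gloss → Motion → Render → Export pipeline.
# Each stage is a placeholder for the real component named in the comment.
def nlp(text):      return text.lower().split()       # spaCy tokenization
def gloss(tokens):  return [t.upper() for t in tokens]  # gloss mapping
def motions(gl):    return [f"{g}.pkl" for g in gl]   # SMPL-X clip per gloss
def render(clips):  return len(clips) * 30            # ~30 frames per clip
def export(frames): return f"video ({frames} frames @ 30fps)"  # FFmpeg encode

print(export(render(motions(gloss(nlp("Hello world"))))))
# video (60 frames @ 30fps)
```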
How we built it
Architecture Overview
┌─────────────────────────────────────────────────────────────────────┐
│ TEXT INPUT LAYER │
│ English, Hindi, Spanish (extensible to any language) │
└─────────────────────────────┬───────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ NLP PROCESSING LAYER │
│ ┌──────────────┐ ┌──────────────┐ ┌─────────────────────────┐ │
│ │ Tokenizer │→ │ Gloss Mapper │→ │ Semantic Matcher │ │
│ │ (spaCy) │ │ (Dictionary) │ │ (Fallback/Synonyms) │ │
│ └──────────────┘ └──────────────┘ └─────────────────────────┘ │
└─────────────────────────────┬───────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ MOTION GENERATION LAYER │
│ ┌──────────────┐ ┌──────────────┐ ┌─────────────────────────┐ │
│ │Motion Loader │→ │ SLERP Interp │→ │ Physics Engine │ │
│ │ (SMPL-X) │ │ (Quaternions)│ │ (Splines/Momentum) │ │
│ └──────────────┘ └──────────────┘ └─────────────────────────┘ │
└─────────────────────────────┬───────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ RENDERING LAYER │
│ ┌──────────────┐ ┌──────────────┐ ┌─────────────────────────┐ │
│ │SMPLX Renderer│→ │Caption Gen │→ │ Video Compositor │ │
│ │ (PyRender) │ │ (Pillow) │ │ (FFmpeg/MoviePy) │ │
│ └──────────────┘ └──────────────┘ └─────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
Technical Stack
| Layer | Technology | Purpose |
|---|---|---|
| Backend API | Flask 3.0, Flask-CORS | REST endpoints for translation |
| NLP | spaCy, custom tokenizer | Text processing & lemmatization |
| Motion Data | SMPL-X, WLASL dataset | 4,000+ sign motion sequences |
| Rendering | PyRender, PyTorch, trimesh | GPU-accelerated 3D rendering |
| Motion Quality | SciPy (SLERP), NumPy | Quaternion interpolation |
| Video | FFmpeg, MoviePy, Pillow | Encoding & composition |
| Frontend | React 18, Vite, CWASA | Web interface & avatar |
Key Technical Decisions
1. Modular Architecture
Each layer is independent. Adding a new sign language requires only:
- New gloss mappings (dictionary.json)
- New motion data (SMPL-X pickle files)
- Zero code changes to core pipeline
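As a sketch of what that data-only extension might look like, here is a hypothetical shape for a per-language dictionary.json entry; the actual SignBridge schema may differ.

```python
import json, os, tempfile

# Illustrative gloss dictionary for one language: each word maps to a
# gloss label and an SMPL-X motion file. Field names are assumptions.
isl_dictionary = {
    "language": "ISL",
    "glosses": {
        "water": {"gloss": "WATER", "motion_file": "water.pkl"},
        "help":  {"gloss": "HELP",  "motion_file": "help.pkl"},
    },
}

path = os.path.join(tempfile.mkdtemp(), "dictionary.json")
with open(path, "w") as f:
    json.dump(isl_dictionary, f, indent=2)

# The core pipeline would only need to point at a new dictionary plus a
# matching motion-data directory to support another sign language.
with open(path) as f:
    loaded = json.load(f)
print(loaded["glosses"]["water"]["motion_file"])
```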
2. Physics-Based Motion
We implemented three motion engines to achieve natural signing:
- Natural Motion: Easing functions + coarticulation
- Professional Motion: Cubic Hermite splines for C1-continuous paths
- Anticipatory Motion: Look-ahead blending (signers prepare for next sign during current sign)
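A minimal sketch of the cubic Hermite blending behind the Professional Motion engine. The pose values are made up for illustration; the real engine operates on SMPL-X pose parameters.

```python
import numpy as np

# Cubic Hermite interpolation: blends p0 → p1 with tangents m0, m1 over
# t ∈ [0, 1], giving C1-continuous position *and* velocity across keyframes.
def hermite(p0, p1, m0, m1, t):
    h00 = 2*t**3 - 3*t**2 + 1
    h10 = t**3 - 2*t**2 + t
    h01 = -2*t**3 + 3*t**2
    h11 = t**3 - t**2
    return h00*p0 + h10*m0 + h01*p1 + h11*m1

# Blend a joint angle from 0.0 to 1.0 with zero start/end velocity:
# the motion eases in and out instead of moving linearly ("robotically").
ts = np.linspace(0, 1, 5)
print([round(float(hermite(0.0, 1.0, 0.0, 0.0, t)), 3) for t in ts])
# [0.0, 0.156, 0.5, 0.844, 1.0]
```

With zero tangents this reduces to the classic smoothstep easing curve; nonzero tangents carry momentum from the previous sign into the next.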
3. SMPL-X Body Model
- 182 pose parameters per frame
- 21 body joints + 15 joints per hand
- Anatomically accurate finger articulation, which is critical for sign language
4. Dual Rendering Paths
- GPU Path: PyRender for high-quality offline video
- Web Path: CWASA/Three.js for real-time browser playback
Code Architecture
SignBridge/
├── backend/
│ ├── app.py # Flask REST API
│ ├── nlp/
│ │ ├── tokenizer.py # spaCy/regex tokenization
│ │ └── gloss_mapper.py # Word → sign gloss mapping
│ ├── sigml/
│ │ ├── generator.py # SIGML XML generation
│ │ └── combiner.py # Multi-sign concatenation
│ ├── motion_loader.py # SMPL-X motion data
│ ├── smplx_renderer.py # GPU rendering pipeline
│ ├── natural_motion.py # Easing & coarticulation
│ ├── professional_motion.py # Hermite splines
│ ├── anticipatory_motion.py # Look-ahead motion
│ └── gloss_matcher.py # Semantic fallback matching
├── frontend/
│ ├── src/App.jsx # React main component
│ └── src/components/ # UI components
├── video_generator.py # End-to-end pipeline
├── caption_stacker.py # Caption overlay
└── sync_and_stack.py # Video composition
Challenges we ran into
1. Motion Quality
Problem: Naive interpolation between sign poses looks robotic.
Solution: We layered three motion-quality techniques:
- SLERP interpolation for rotation parameters
- Cubic Hermite splines for smooth velocity
- Anticipatory motion that mimics how real signers prepare for the next sign
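For the rotation parameters, SLERP can be done directly with SciPy's `Slerp`, consistent with the SciPy dependency in the stack; the example rotations here are arbitrary.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

# Two keyframe rotations for a joint: identity and 90° about z.
key_rots = Rotation.from_euler("z", [0, 90], degrees=True)

# Slerp interpolates on the unit quaternion sphere, so the joint sweeps
# through valid rotations at constant angular velocity (no gimbal issues).
slerp = Slerp([0.0, 1.0], key_rots)
interp = slerp([0.0, 0.5, 1.0])
print(interp.as_euler("zyx", degrees=True)[:, 0])  # z-angles: 0, 45, 90
```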
2. Hand Articulation
Problem: Sign language depends on precise finger positions. Generic avatars lack hand detail.
Solution: We use the SMPL-X model with 30 hand joints (15 per hand), loading motion data from the WLASL sign language dataset, which captures real signer movements.
3. Vocabulary Coverage
Problem: No dictionary covers all words.
Solution: Multi-level fallback system:
- Exact match in gloss dictionary
- Synonym/semantic matching
- Prefix/stem matching (WATCHING → WATCH)
- Automatic fingerspelling for unknown words
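A toy sketch of that fallback chain; the dictionary, synonym table, and suffix-stripping rule are placeholder stand-ins for the real gloss_matcher logic.

```python
# Tiny illustrative lexicon — the real gloss dictionary has 150+ entries.
GLOSSES = {"watch": "WATCH", "help": "HELP"}
SYNONYMS = {"assist": "help"}

def resolve_gloss(word: str) -> str:
    w = word.lower()
    if w in GLOSSES:                        # 1. exact dictionary match
        return GLOSSES[w]
    if w in SYNONYMS:                       # 2. synonym/semantic match
        return GLOSSES[SYNONYMS[w]]
    for suffix in ("ing", "ed", "s"):       # 3. crude prefix/stem match
        stem = w[: -len(suffix)]
        if w.endswith(suffix) and stem in GLOSSES:
            return GLOSSES[stem]
    return "-".join(w.upper())              # 4. fingerspell unknown words

print(resolve_gloss("watching"), resolve_gloss("assist"), resolve_gloss("zip"))
# WATCH HELP Z-I-P
```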
4. Video Synchronization
Problem: Avatar video and caption video had different durations.
Solution: Built sync_and_stack.py that:
- Extracts duration from both videos
- Time-stretches both to the mean of the two durations
- Stacks vertically with ffmpeg vstack filter
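A sketch of how that composition command might be assembled; the file names and exact filter string are illustrative, not copied from sync_and_stack.py.

```python
# Build an ffmpeg command that time-stretches two clips to their mean
# duration (setpts) and stacks them vertically (vstack). Nothing is
# executed here — the command is only constructed.
def build_stack_cmd(avatar: str, captions: str, out: str,
                    avatar_dur: float, caption_dur: float) -> list[str]:
    target = (avatar_dur + caption_dur) / 2          # mean duration
    filters = (
        f"[0:v]setpts=PTS*{target / avatar_dur:.4f}[a];"
        f"[1:v]setpts=PTS*{target / caption_dur:.4f}[c];"
        "[a][c]vstack=inputs=2[v]"
    )
    return ["ffmpeg", "-i", avatar, "-i", captions,
            "-filter_complex", filters, "-map", "[v]", out]

cmd = build_stack_cmd("avatar.mp4", "captions.mp4", "final.mp4", 10.0, 12.0)
print(" ".join(cmd))
```

Note that vstack requires both inputs to share the same width, so a scale filter would precede it in practice.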
5. Real-Time vs. Quality Trade-off
Problem: High-quality GPU rendering is slow; web rendering lacks quality.
Solution: Dual rendering paths:
- CWASA for real-time web demos
- PyRender for production video export
- Same gloss/motion data feeds both
Accomplishments that we're proud of
Technical Accomplishments
End-to-End Working Pipeline
- Text input → Sign language video output in single command
- 70+ demo videos generated during development
- Production-ready quality
Physics-Based Motion Engine
- Anticipatory motion: Avatar prepares for next sign during current sign
- Natural-looking signing that doesn't look robotic
- 3 motion engines with different quality/speed trade-offs
Scalable Architecture
- Adding new sign language = new data files, not new code
- Modular layers: NLP, motion, rendering are independent
- Same codebase can serve ISL, ASL, BSL with config changes
Real Dataset Integration
- 4,000+ motion sequences from WLASL
- SMPL-X body model for anatomical accuracy
- Real sign language data, not synthesized animations
Business Accomplishments
Clear Market Entry Strategy
- First customer identified: Living India News (Punjabi channel)
- Regulatory tailwind: RPWD Act 2016 enforcement accelerating
- 155 organizations fined for accessibility violations (Feb 2025)
Data Moat Strategy
- Every customer expands vocabulary database
- Regional dialects no competitor will have
- First-mover builds the corpus
What we learned
Technical Learnings
Sign language is NOT "animated captions"
- Different grammar, different word order
- Facial expressions carry grammatical information
- Regional dialects vary significantly
Motion quality matters more than vocabulary size
- 50 natural-looking signs > 500 robotic signs
- Users can tolerate fingerspelling unknown words
- Users cannot tolerate unnatural movement
SMPL-X is essential for sign language
- Generic avatars lack hand articulation
- 15 joints per hand capture precise finger positions
- Body model + motion data = realistic signing
Business Learnings
Compliance is the entry point, not the product
- Regulatory pressure creates urgency
- But the real value is serving the users whom captions fail
- Data accumulated from compliance becomes the moat
The literacy gap is underappreciated
- Most people assume deaf = can read
- 4th grade reading level changes everything
- Complex content (news, health, legal) is inaccessible
What's next for SignBridge
Immediate (Post-Hackathon)
| Priority | Action | Timeline |
|---|---|---|
| 1 | Living India News pilot outreach | Week 1 |
| 2 | Expand vocabulary to 500+ words | Month 1 |
| 3 | Add facial expressions (grammatical markers) | Month 2 |
| 4 | ISLRTC partnership for vocabulary validation | Month 2 |
Phase 2: Indian Market (6-24 months)
- 50+ Indian news network contracts
- Government partnerships (Doordarshan, state broadcasters)
- Regional vocabulary expansion (Tamil, Telugu, Bengali ISL variants)
- Target: Rs 5-10 Cr ARR
Phase 3: Global Expansion (Year 2-4)
- ASL, BSL, Auslan support
- International news networks (BBC, Al Jazeera)
- Streaming platforms (Netflix, Disney+)
- Target: $5-10M ARR
Phase 4: Creator Economy (Year 4+)
- YouTube/Twitch API integrations
- Creator tools ($29-99/month)
- Community vocabulary contributions
- Target: $50M+ ARR
The Vision
"We're not building a compliance tool. We're building the Google Translate for sign language. Every customer adds to our vocabulary database. By year 3, we'll have the world's largest corpus of regional sign language variations—a data asset that transforms us from a compliance vendor into the infrastructure layer for global sign language accessibility."