Light Stick — AI-Powered Short Video Recommendation Platform

Inspiration

We believe user data is safest when it never leaves the device. Most recommendation systems require centralized data collection, which raises serious privacy concerns.

We wanted to explore federated learning in a real-world scenario: short-video recommendation.

Our approach lets us train on real user data without ever seeing the raw data itself.

Our goal:

  • Provide personalized video recommendations.
  • Keep raw user interactions local on the device.
  • Maintain an engaging, TikTok-style feed.

Our revolutionary idea:
Combine on-device machine learning, federated aggregation, differential privacy techniques, and a smooth mobile experience in a single AI-first platform. We designed the infrastructure for privacy-first AI, ensuring personalized recommendations without ever exposing raw user data.


What We Built

Frontend

Built with React Native + Expo, the app delivers a polished mobile-first experience:

  • Vertical paged video feed with snap scrolling and auto-play, continuously streaming videos via our recommendation API.
  • Gesture-based interactions: double-tap to like, swipe navigation.
  • Comments system using a bottom sheet interface.
  • Visualization of user vs. video embeddings and of the trust graph using React Native SVG, demonstrating federated learning in action.
  • Lightweight state management via Zustand and animations with React Native Reanimated v3.

Backend

Powered by FastAPI, the backend enables:

  • Federated learning aggregation with trust-weighted updates.
  • Real-time recommendation endpoints and analytics.
  • Video preprocessing: compression, frame extraction, CLIP-based embeddings.
  • Supabase integration for storage, metadata, and real-time updates.
  • Async storage for user embeddings and interaction data (raw data never leaves the device!)

Machine Learning Pipeline

We implemented federated learning with trust and privacy at the core:

  1. Local Model Training (on-device)

    • Each device trains a small binary MLP head on its own user's interactions (written here with a single hidden layer): \[ \text{MLP: } f(x) = \sigma(W_2 \cdot \text{ReLU}(W_1 x + b_1) + b_2) \]
      • Input: concatenated user embedding + video embedding
      • Two Dense hidden layers with ReLU
      • Output layer: sigmoid predicting engagement probability
      • Each Dense layer has weights (matrix) + bias (vector) trained locally
    • User embeddings dynamically evolve as the user interacts, continuously updating both local and global models in real time.
    • (In this prototype, /train executes via a server-side /local route to simulate on-device updates. The architecture is designed so that training could run directly on each device in a full implementation.)
  2. Trust-Weighted Aggregation

    • Server maintains a trust graph (inspired by a recent Google Research paper [1]):
      • Nodes = devices with trust levels \(t \in [0,1]\)
      • Edges = relational trust, controlling influence of updates
    • Aggregation formula: \[ W_\text{global} = \frac{\sum_i t_i \cdot W_i}{\sum_i t_i} \]
    • This ensures trusted contributions influence the global model more than noisy or potentially malicious updates.
  3. Differential Privacy

    • Weight clipping: \[ \|w\|_2 \le 1.5 \]
    • Gaussian noise addition: \[ w \leftarrow w + \mathcal{N}(0, \sigma^2), \quad \sigma = 0.1 \]
    • Random subsetting: \[ S \subseteq W, \quad |S| = 0.9 \, |W| \] (only 90% of weights are transmitted per round)
    • Together, these steps bound how much any single user's interactions can shape a transmitted update, which makes global model updates hard to trace back to individual users.
  4. Personalized Recommendations

    • Small Binary MLP head performs on-device inference and training.
    • Main server handles heavier processing (e.g., video → embedding conversion).
    • Devices communicate updates in rounds; each update is clipped and noised for differential privacy before the server aggregates it.
    • The trust graph weights malicious or noisy nodes down over time, while trusted nodes retain higher influence. Contributions also decay gradually to reflect reliability and to defend against model inference attacks.
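
Steps 2 and 3 above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the project's actual code: the function names are hypothetical, each device's weights are assumed to be flattened into one vector, and the write-up does not say how the server merges the 90% subsets, so the sketch averages each coordinate over the devices that actually sent it. With full masks this reduces to the aggregation formula in step 2.

```python
import numpy as np

CLIP_NORM = 1.5      # L2 bound from the write-up
NOISE_STD = 0.1      # Gaussian sigma from the write-up
KEEP_FRACTION = 0.9  # share of weights transmitted per round


def clip_l2(weights, max_norm=CLIP_NORM):
    """Scale the vector down so that its L2 norm is at most max_norm."""
    w = np.asarray(weights, dtype=float)
    norm = np.linalg.norm(w)
    return w * (max_norm / norm) if norm > max_norm else w


def privatize_update(weights, rng):
    """Clip, add Gaussian noise, and pick a random 90% subset to transmit.

    Returns the noisy vector and a boolean mask of transmitted entries.
    """
    w = clip_l2(weights)
    w = w + rng.normal(0.0, NOISE_STD, size=w.shape)
    n_keep = int(round(KEEP_FRACTION * w.size))
    mask = np.zeros(w.size, dtype=bool)
    mask[rng.choice(w.size, size=n_keep, replace=False)] = True
    return w, mask


def trust_weighted_average(updates, masks, trust):
    """Per-coordinate trust-weighted mean over devices that sent that coordinate.

    Assumption: coordinates that no device transmitted default to 0.
    """
    updates = np.asarray(updates, dtype=float)   # shape (n_devices, n_weights)
    masks = np.asarray(masks, dtype=float)       # same shape, 1.0 where sent
    t = np.asarray(trust, dtype=float)[:, None]  # trust levels in [0, 1]
    numer = (t * masks * updates).sum(axis=0)
    denom = (t * masks).sum(axis=0)
    return np.divide(numer, denom, out=np.zeros_like(numer), where=denom > 0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    local_weights = [rng.normal(size=50) for _ in range(3)]  # toy device updates
    sent = [privatize_update(w, rng) for w in local_weights]
    global_w = trust_weighted_average(
        [w for w, _ in sent], [m for _, m in sent], trust=[0.9, 0.5, 0.1]
    )
    print(global_w.shape)
```

A device with trust 0.1 still contributes, but it moves the global weights far less than one with trust 0.9, which is the behaviour step 2 describes.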

Key Innovation: personalized recommendations are generated entirely on-device, with trust-weighted aggregation + differential privacy providing robust, secure global learning.
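
As a rough sketch of what on-device scoring could look like, the NumPy snippet below runs the MLP head described in step 1 over a batch of candidate videos and ranks them by predicted engagement. It is illustrative only: the app itself uses TensorFlow.js on the device, and `VIDEO_DIM`, the hidden-layer sizes, and all function names are assumptions that the write-up does not specify. Only the 16-dimensional user vector, the two ReLU hidden layers, and the sigmoid output come from the text.

```python
import numpy as np

USER_DIM = 16      # user vector size stated in the write-up
VIDEO_DIM = 32     # hypothetical; the write-up does not give this value
HIDDEN = (32, 16)  # hypothetical sizes for the two hidden layers


def init_params(rng):
    """Create (weights, bias) pairs for two hidden layers plus the output layer."""
    sizes = [USER_DIM + VIDEO_DIM, *HIDDEN, 1]
    return [
        (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def predict_engagement(params, user_vec, video_vecs):
    """Return one engagement probability per candidate video."""
    video_vecs = np.atleast_2d(video_vecs)
    user = np.broadcast_to(user_vec, (video_vecs.shape[0], USER_DIM))
    h = np.concatenate([user, video_vecs], axis=1)  # user + video embedding
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)              # Dense layer with ReLU
    W, b = params[-1]
    logits = (h @ W + b).ravel()
    return 1.0 / (1.0 + np.exp(-logits))            # sigmoid output


def rank_videos(params, user_vec, video_vecs):
    """Return candidate indices sorted from highest to lowest score, plus the scores."""
    scores = predict_engagement(params, user_vec, video_vecs)
    return np.argsort(-scores), scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(rng)
    user = rng.normal(size=USER_DIM)
    candidates = rng.normal(size=(5, VIDEO_DIM))
    order, scores = rank_videos(params, user, candidates)
    print(order, scores[order])
```

Because only the small head runs on the device, the heavier video-to-embedding step can stay on the server, as described in step 4.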


Why It's Revolutionary

  • Privacy-First AI: raw data never leaves the device.
  • Federated Learning in Action: real-time personalization without central data collection.
  • Trust Graph & Differential Privacy: resilient against model inversion, poisoning, or malicious updates.
  • TikTok-Level Engagement: smooth, swipeable video feed with high user interaction.

Challenges We Overcame

  • Ensuring data consistency across frontend (JS), backend (Python/NumPy), and database (PostgreSQL).
  • Handling cold-start users without generating poor recommendations.
  • Managing asynchronous flows: interactions → rebuild user embedding → fetch recommendations.
  • Maintaining embedding dimensionality across the pipeline (\(16\)-dimensional user vectors vs video embeddings).
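
One way to catch dimensionality drift early is to validate every embedding at the boundary where it enters the Python backend. The helper below is a hypothetical sketch, not code from the project: `as_embedding` and its arguments are illustrative names, and it assumes embeddings arrive from the JS client or the database as plain lists of numbers.

```python
import numpy as np

USER_DIM = 16  # user vector size stated in the write-up


def as_embedding(values, expected_dim, name="embedding"):
    """Convert a JSON-style list to a float32 vector, failing loudly on a mismatch."""
    vec = np.asarray(values, dtype=np.float32)
    if vec.ndim != 1 or vec.shape[0] != expected_dim:
        raise ValueError(
            f"{name}: expected shape ({expected_dim},), got {vec.shape}"
        )
    if not np.all(np.isfinite(vec)):
        raise ValueError(f"{name}: contains NaN or infinite values")
    return vec


if __name__ == "__main__":
    user_vec = as_embedding([0.0] * USER_DIM, USER_DIM, name="user embedding")
    print(user_vec.shape, user_vec.dtype)
```

Rejecting a malformed vector at the boundary is cheaper to debug than a shape error that surfaces later, inside the model's matrix multiply.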

Lessons Learned

  • Most practical ML engineering challenges are about data consistency, not model complexity.
  • On-device training with TensorFlow.js is feasible but requires careful memory management.
  • Real-time testing and logging across mobile ↔ backend pipelines are critical.
  • Federated learning with trust-weighted aggregation + differential privacy can produce robust, private personalization.
  • Privacy-first federated AI has huge potential for secure, real-world applications.

Demo Impact

During the hackathon, Light Stick:

  • Ran a fully working TikTok-style feed with live personalization.
  • Showed each user's upcoming video feed updating in real time.
  • Demonstrated that privacy-first AI is feasible, performant, and highly engaging.
  • Supported multiple users simultaneously, with each device's updates syncing to the trust graph in real time, visually reflecting contributions and trust decay across the network.

Future Vision

  • Advanced on-device intelligence for predictive recommendations.
  • Collaborative federated learning across multiple apps while preserving privacy.
  • Integration with vector databases for semantic video understanding.
  • Scaling globally while keeping user data completely private.

Light Stick demonstrates that privacy-first, federated AI can power real-time, personalized experiences, making centralized data collection obsolete.

References

[1] Google Research. Differential privacy on trust graphs, 2025. https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2025.53
