Ensoul 万物生灵 — Project Description

Inspiration

In an era of disposable consumerism and digital isolation, we realized we are losing touch with the physical world. We asked ourselves: What if the ancient philosophy of "Animism" (万物有灵) met modern AI?

We didn't want to build another efficiency tool. We wanted to create a "Digital Medium"—a bridge that allows us to converse with the dormant souls of our keepsakes, turning silent history into a resonant dialogue.


What it does

Ensoul is a spiritual sanctuary that transforms physical objects into digital "Spirits".

  1. The Binding Ritual (Photo + Narrative)
    Users take a photo of an object to create a digital vessel. Crucially, they complete a "Resonance Questionnaire" to input their memories and emotional bond. This combination acts as the "Soul Injection." We use Gemini image editing to remove the background and place the object on a unified backdrop, so the spirit has a clean visual vessel.

  2. Spirit Awakening (Archetypes)
    The system analyzes the user's questionnaire inputs and classifies the object into one of 5 Elemental Spirit Archetypes (Metal 金, Wood 木, Water 水, Fire 火, Earth 土) or a hidden sixth archetype, the Galaxy (星空) Spirit. Objects with similar "soul frequencies" may share the same Spirit Avatar. Each archetype has a distinct personality (e.g., Metal: rational, loyal; Wood: innocent, healing; Earth: steady, ancient).

  3. The Spirit Cabinet (Storage)
    A digital exhibition space where these awakened objects are displayed, preserving the connection forever.

  4. Oracle Dialogue (Interaction)
    We designed a "Light Questionnaire + Answer Book" mechanism. Users pick the area of life they feel uncertain about (e.g., career, love), and the object's Spirit Archetype answers with a single, profound piece of wisdom shaped by its personality.
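The four steps above imply a small data model: an archetype, a bound spirit, and a cabinet that stores spirits. The following is a minimal sketch of one way to represent it, not the project's actual code. The names `Archetype`, `Spirit`, `SpiritCabinet`, and `TRAITS` are hypothetical, and the sample objects are invented for illustration. Only the traits quoted in the description are filled in; the rest are left empty rather than guessed.

```python
from dataclasses import dataclass, field
from enum import Enum


class Archetype(Enum):
    METAL = "Metal"
    WOOD = "Wood"
    WATER = "Water"
    FIRE = "Fire"
    EARTH = "Earth"
    GALAXY = "Galaxy"  # the hidden archetype


# Traits quoted in the description. Water, Fire, and Galaxy are not
# described there, so they are deliberately absent from this mapping.
TRAITS = {
    Archetype.METAL: ("rational", "loyal"),
    Archetype.WOOD: ("innocent", "healing"),
    Archetype.EARTH: ("steady", "ancient"),
}


@dataclass(frozen=True)
class Spirit:
    """One awakened object (hypothetical shape)."""
    object_name: str
    archetype: Archetype
    origin_story: str

    @property
    def traits(self) -> tuple:
        # Empty tuple when the description gives no traits for the archetype.
        return TRAITS.get(self.archetype, ())


@dataclass
class SpiritCabinet:
    """In-memory stand-in for the Spirit Cabinet storage."""
    spirits: list = field(default_factory=list)

    def bind(self, spirit: Spirit) -> None:
        self.spirits.append(spirit)

    def by_archetype(self, archetype: Archetype) -> list:
        return [s for s in self.spirits if s.archetype is archetype]


cabinet = SpiritCabinet()
cabinet.bind(Spirit("pocket watch", Archetype.METAL, "A gift from a grandfather."))
cabinet.bind(Spirit("pressed maple leaf", Archetype.WOOD, "Kept from an autumn walk."))
print([s.object_name for s in cabinet.by_archetype(Archetype.METAL)])
```

Because objects with similar "soul frequencies" may share an avatar, keying the avatar on the archetype rather than on the individual object would fit this model.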


How we built it

We built a "User-Defined Animism" architecture powered by Google Gemini:

  • Visual Entry (Gemini Vision)
    We use Gemini 2.5 Flash with multimodal input (image + text) to recognize the object type in the photo, serving as the anchor for the digital entity.

  • Image Editing (Gemini Image Generation)
    We use Gemini 3 Pro (image preview) to remove the photo background and place the object on a unified background, giving each spirit a clean visual vessel.

  • Soul Synthesis (Gemini Text / Reasoning)
    This is the core logic. We feed the user's questionnaire answers and object type into Gemini 2.5 Flash.

    • The model analyzes the emotional bond (e.g., gift, heirloom).
    • It generates a unique backstory (origin story in Chinese).
    • It classifies the object into one of the 5 Elemental archetypes or the hidden Galaxy (星空) archetype.
    • It produces a visual prompt for the spirit avatar (e.g., chibi style, element-based details).
    • The classification can be summarized as: $$f(\text{User Story}, \text{Object Type}) \rightarrow \text{Spirit Archetype} \in \{\text{Metal}, \text{Wood}, \text{Water}, \text{Fire}, \text{Earth}, \text{Galaxy}_{\text{(hidden)}}\}$$

  • The Oracle Logic
    When a user asks for advice, Gemini 2.5 Flash generates a response that strictly adheres to the assigned Spirit Archetype's tone and worldview (Personality + Object Metaphor = Wisdom).

  • Frontend
    Built with Next.js (React) for the Spirit Cabinet display, upload flow, questionnaire, and Oracle interaction.
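The Soul Synthesis step can be sketched as a prompt builder plus a strict parser for the model's reply. This is an illustration under stated assumptions, not the project's implementation: the function names, the JSON key names, and the prompt wording are all hypothetical, and the model call is passed in as a plain function so the sketch runs without network access. Python is used only for illustration; the project's frontend is Next.js.

```python
import json
from typing import Callable

ARCHETYPES = ("Metal", "Wood", "Water", "Fire", "Earth", "Galaxy")
# Hypothetical key names for the three outputs named in the description.
REQUIRED_KEYS = ("archetype", "origin_story", "avatar_prompt")


def build_synthesis_prompt(object_type: str, answers: dict) -> str:
    """Combine the recognized object type and questionnaire answers."""
    parts = [
        "You are awakening the spirit of a keepsake.",
        f"Object type: {object_type}",
        "Questionnaire answers:",
    ]
    parts += [f"- {question}: {answer}" for question, answer in answers.items()]
    parts.append(
        "Reply with one JSON object with the keys "
        + ", ".join(REQUIRED_KEYS)
        + ". archetype must be one of: "
        + ", ".join(ARCHETYPES)
        + "."
    )
    return "\n".join(parts)


def parse_synthesis(raw: str) -> dict:
    """Reject replies that are malformed or name an unknown archetype."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if data["archetype"] not in ARCHETYPES:
        raise ValueError(f"unknown archetype: {data['archetype']!r}")
    return data


def synthesize_soul(object_type: str, answers: dict,
                    generate: Callable[[str], str]) -> dict:
    """`generate` is whatever sends a prompt to the model and returns text."""
    prompt = build_synthesis_prompt(object_type, answers)
    return parse_synthesis(generate(prompt))


def fake_generate(prompt: str) -> str:
    # Stand-in for the real model call so the sketch runs offline.
    return json.dumps({
        "archetype": "Earth",
        "origin_story": "A teapot that watched three generations share tea.",
        "avatar_prompt": "chibi style spirit with earth-based details",
    })


profile = synthesize_soul(
    "ceramic teapot",
    {"How did you get it?": "Inherited from my grandmother"},
    fake_generate,
)
print(profile["archetype"])
```

Validating the archetype against a fixed list keeps a stray model reply from creating a seventh spirit type, which matters when avatars are shared per archetype.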


Gemini tools & models used

| Use case | Model / capability | Description |
| --- | --- | --- |
| Object recognition | Gemini 2.5 Flash (multimodal) | Image + text → short object description |
| Background removal | Gemini 3 Pro (image preview) | Image editing: remove background, unified vessel image |
| Soul synthesis | Gemini 2.5 Flash (text) | Questionnaire + object → archetype, story, visual prompt |
| Book of Answers | Gemini 2.5 Flash (text) | User question + spirit context → one oracle wisdom |

Challenges we ran into

  • Defining the Soul: Initially, we tried to generate a unique avatar for every single object, but it felt chaotic and inconsistent. We realized that "Archetypes create stronger connections." We spent time distilling personalities down to 5 Elemental + 1 hidden Galaxy (星空) Spirit types, making the system more robust.

  • Balancing Input vs. Output: Designing the "Resonance Questionnaire" was tricky. It needed to be short enough not to bore the user, but deep enough for Gemini to extract a meaningful "Spirit Profile."

  • The "Oracle" Tone: Tuning Gemini to speak not like a chatbot, but like a mystical "Book of Answers," required extensive prompt engineering to ensure the response followed the formula: $$\text{Personality} + \text{Object Metaphor} = \text{Wisdom}$$
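The "Personality + Object Metaphor = Wisdom" formula can be read as a prompt template plus a guard that keeps only one line of the reply. The sketch below shows that reading. The prompt wording and the function names `build_oracle_prompt` and `single_wisdom` are assumptions made for illustration and are not taken from the project.

```python
def build_oracle_prompt(archetype: str, traits: tuple,
                        object_name: str, topic: str) -> str:
    """Assemble personality and object metaphor into one instruction."""
    trait_text = ", ".join(traits) if traits else "unspecified"
    return "\n".join([
        f"You are the {archetype} spirit living in a {object_name}.",
        f"Personality: {trait_text}.",
        f"The user is seeking guidance about: {topic}.",
        "Answer as a Book of Answers, not as a chatbot:",
        f"one short piece of wisdom that uses the {object_name} as a metaphor.",
        "No greetings, no follow-up questions, no lists.",
    ])


def single_wisdom(reply: str) -> str:
    """Keep the first non-empty line so the oracle speaks only once."""
    for line in reply.splitlines():
        stripped = line.strip()
        if stripped:
            return stripped
    return ""


prompt = build_oracle_prompt("Metal", ("rational", "loyal"), "pocket watch", "career")
print(prompt)
print(single_wisdom("\n  Time kept is time honored.  \nAn extra line to drop."))
```

Trimming the reply in code is a fallback for cases where the prompt alone does not hold the model to a single line.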


Accomplishments we're proud of

  • The hidden Galaxy (星空) Spirit: We implemented logic where specific, rare combinations of user inputs (e.g., a very old or heirloom object plus a "Destiny/Spiritual" bond) trigger the hidden Galaxy Spirit, adding gamification and mystery.

  • Emotional mapping: We used Gemini to translate abstract user memories (questionnaire text) into concrete Spirit personalities (element, tone, origin story).
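One possible reading of the hidden Galaxy trigger is a deterministic override applied after the model has proposed an archetype. The description does not say whether the rule lives in the prompt or in code, so the sketch below is an assumption. The field values (`"very old"`, `"heirloom"`, `"destiny"`, `"spiritual"`) and the function name `resolve_archetype` are hypothetical, modeled on the example combination given above.

```python
# Hypothetical questionnaire values, based on the example combination
# "very old/heirloom + Destiny/Spiritual bond" from the description.
HEIRLOOM_AGES = {"very old", "heirloom"}
DESTINY_BONDS = {"destiny", "spiritual"}


def resolve_archetype(model_choice: str, age: str, bond: str) -> str:
    """Return Galaxy for the rare combination, else keep the model's choice."""
    is_heirloom = age.strip().lower() in HEIRLOOM_AGES
    is_destiny = bond.strip().lower() in DESTINY_BONDS
    if is_heirloom and is_destiny:
        return "Galaxy"
    return model_choice


print(resolve_archetype("Earth", "Heirloom", "Destiny"))
print(resolve_archetype("Earth", "heirloom", "gift"))
```

Keeping the rare case in code rather than in the prompt would make the trigger rate predictable, at the cost of being less flexible than a model-driven classification.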


What we learned

  • Context is King: Visuals alone aren't enough. The user's narrative (via the questionnaire) is what truly gives an object its soul.

  • Less is More: Moving from "endless chat" to "one-time Oracle wisdom" made the interaction feel more precious and ritualistic.


What's next for Ensoul 万物生灵

  • Voice interaction: Integrating TTS to give each of the 5 archetypes a distinct voice.
  • AR manifestation: Using AR to view the spirit aura hovering over the real-world object.
