Teth.AR
Inspiration
Remote communication tools work well for conversation but perform poorly when used for physical instruction. During equipment repair, assembly work, or technical training, experts often guide workers through video calls. Instructions such as “move the connector slightly to the left” or “attach the bracket behind the panel” rely on verbal descriptions of spatial relationships. Without a shared spatial reference, the worker must interpret those instructions through trial and error.
Training for physical work depends on repeated action. Motor learning develops through direct interaction with tools and components. Demonstrations performed inside the workspace allow trainees to observe movement, orientation, and sequencing. Repetition of these actions forms procedural memory, often described as muscle memory, which supports long-term skill acquisition.
Teth.AR explores whether remote instruction can take place inside a shared spatial environment where a trainer demonstrates actions directly in the worker’s view. The project focuses on how mixed reality can support remote training across settings where people learn through physical interaction.
What It Does
Teth.AR is a mixed reality system for remote training and technical guidance.
A trainee works inside a passthrough mixed reality workspace. A remote trainer joins the session through a virtual environment that represents the same workspace. Both users interact with the same digital objects and assembly steps through a synchronized session.
The trainee sees the real environment through a passthrough video with digital guidance placed inside the workspace. The trainer manipulates components and demonstrates procedures from the virtual environment. The trainee observes those actions positioned directly within the work area.
The system supports training scenarios that require spatial instruction and repeated physical interaction. These include equipment maintenance, manufacturing assembly, warehouse operations, field service repair, and healthcare procedure training.
Capabilities
- Passthrough mixed reality workspace for the trainee
- Virtual reality interface for the trainer
- Synchronized object manipulation during assembly tasks
- Spatial anchors for stable object placement
- Avatar motion representing trainer gestures
- Session recording for replay and review
Interaction Model
The trainee observes actions demonstrated by the trainer within the same spatial location where the task occurs. The trainee then repeats those actions using the real tools and components. This cycle of demonstration and repetition allows skills to develop through direct practice inside the workspace.
How we built it
The system is built in the Unity game engine. We use the Meta XR SDK for user avatar visualization, mixed reality video passthrough, and interactions such as grab and move, and integrate it with the Photon Fusion networking solution to connect the trainer and the trainee. On top of this, we implement a custom framework for designing and constructing virtual replica prototypes of real systems, which lets the trainer demonstrate the appropriate motions to the trainee. Finally, Agora streams the trainee's real-world passthrough footage to the trainer, so the trainer can send spatialized instructions back to the trainee about their task.
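The split between state networking (Photon Fusion) and video streaming (Agora) can be pictured as a simple router that sends each kind of data down the transport with the better latency profile for it. This is an illustrative Python sketch of the design idea only; the names and packet shape are hypothetical, not the actual Unity/C# implementation.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    kind: str      # "pose", "object", or "video_frame" (illustrative kinds)
    payload: bytes


def route(packet: Packet) -> str:
    """Small, frequent state updates (avatar poses, object transforms) go
    over the tick-based state-sync pipeline; heavy passthrough video frames
    go over the dedicated streaming pipeline."""
    if packet.kind in ("pose", "object"):
        return "state_sync"     # Photon Fusion in the real system
    return "video_stream"       # Agora in the real system


print(route(Packet("pose", b"")))         # state_sync
print(route(Packet("video_frame", b"")))  # video_stream
```

Keeping the two pipelines separate means a dropped or late video frame never stalls object replication, and vice versa.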
Challenges we ran into
- A key challenge was the networking implementation. We needed two pipelines: one relaying the movements and motions of the trainee and the trainer to each other, and one streaming live video footage. The two had to be separated because different services provide lower latency for different types of data.
- A second difficulty was the custom virtual replica mock-up system, and building a good user experience around seamlessly snapping objects together. Headset gesture interactions tend to be finicky since the technology is still maturing, so the interactions must compensate for low-fidelity gestures by estimating the intended action in advance and giving helpful previews as feedback.
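The intent-estimation idea behind the snapping interaction can be sketched as a nearest-target search: while a part is held, find the closest compatible socket within a snap radius and show a preview there before the user releases. This is a minimal Python sketch under assumed names and thresholds, not the shipped Unity code.

```python
import math


def nearest_snap_target(held_pos, sockets, snap_radius=0.08):
    """Return the socket a low-fidelity grab gesture most likely intends,
    or None if nothing is close enough to preview.
    Positions are (x, y, z) tuples in meters; snap_radius is illustrative."""
    best, best_dist = None, snap_radius
    for name, pos in sockets.items():
        d = math.dist(held_pos, pos)
        if d < best_dist:
            best, best_dist = name, d
    return best


# Hypothetical sockets on a virtual replica part.
sockets = {"bracket_slot": (0.0, 0.0, 0.0), "panel_mount": (0.5, 0.0, 0.0)}

print(nearest_snap_target((0.03, 0.01, 0.0), sockets))  # bracket_slot
print(nearest_snap_target((0.30, 0.00, 0.0), sockets))  # None
```

Running the search every frame while a part is held gives the system a candidate to highlight as a preview, which compensates for imprecise gestures.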
Accomplishments We’re Proud Of
- The system runs a synchronized mixed reality training session with interactive assembly tasks.
- The prototype includes a passthrough workspace, a virtual trainer environment, and a scanned model used for assembly training. Both participants interact with the same digital components while observing the same sequence of steps.
- Trainer gestures appear in the trainee workspace and guide the movement of parts during the assembly procedure.
What We Learned
- Physical training depends on spatial demonstration and repeated interaction with equipment. Observation alone is not sufficient for procedural learning.
- When a trainee watches a movement performed within the workspace and repeats that movement through direct manipulation, the action sequence becomes easier to remember. Repetition of those actions forms procedural memory associated with the task.
- Shared spatial reference also affects whether remote instruction works effectively. Anchors and synchronized object states ensure both participants interpret the workspace in the same way.
- Mixed reality environments can support training across different domains where procedures involve physical interaction with tools or equipment.
What’s next for Teth.AR
One-to-many training sessions
The system architecture can extend to sessions where one trainer connects to several trainees. Each trainee runs a passthrough workspace while the trainer observes and demonstrates procedures from the VR environment. Object states and assembly progress remain synchronized across participants.
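The one-to-many extension amounts to fanning the trainer's object updates out to every trainee session so all copies of the workspace stay consistent. A minimal sketch, assuming a simple session/broadcast shape (the real system would do this through Photon Fusion replication, not this code):

```python
class TraineeSession:
    """One trainee's view of the shared workspace state."""

    def __init__(self):
        self.object_states = {}   # object id -> transform dict

    def apply(self, obj_id, transform):
        self.object_states[obj_id] = transform


def broadcast(trainees, obj_id, transform):
    """Push one trainer-side object update to every connected trainee."""
    for session in trainees:
        session.apply(obj_id, transform)


trainees = [TraineeSession() for _ in range(3)]
broadcast(trainees, "bolt_3", {"pos": (0.1, 0.2, 0.0), "rot": (0, 0, 0, 1)})
print(all("bolt_3" in t.object_states for t in trainees))  # True
```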
AI-indexed training library
Recorded sessions can be processed to extract interaction events, object manipulation, and assembly steps. These records can form a searchable library of training procedures organized by task, tool, or component.
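Indexing a recorded session can be as simple as grouping extracted interaction events by tool or component so procedures become searchable. The event schema below is hypothetical, shown only to illustrate the indexing idea:

```python
from collections import defaultdict

# Hypothetical event records extracted from a session recording; a real
# schema would be derived from the recorded networked state.
events = [
    {"step": 1, "action": "grab",   "component": "bracket", "tool": None},
    {"step": 2, "action": "attach", "component": "bracket", "tool": "driver"},
    {"step": 3, "action": "attach", "component": "panel",   "tool": "driver"},
]


def index_by(events, key):
    """Build a searchable index: key value -> list of step numbers."""
    index = defaultdict(list)
    for event in events:
        if event[key] is not None:
            index[event[key]].append(event["step"])
    return dict(index)


print(index_by(events, "component"))  # {'bracket': [1, 2], 'panel': [3]}
print(index_by(events, "tool"))       # {'driver': [2, 3]}
```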
Procedural task validation
Assembly components contain anchor constraints describing correct placement and orientation. The system can compare object transforms with these constraints to determine whether a trainee has completed a step correctly.
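Checking a placed part against its anchor constraint reduces to comparing transforms within tolerances: position within some distance, orientation within some angle. A minimal sketch with assumed tolerance values and quaternion convention (x, y, z, w), not the system's actual validation code:

```python
import math


def step_complete(obj, constraint, pos_tol=0.01, ang_tol_deg=5.0):
    """Check a placed part against its anchor constraint: position within
    pos_tol meters and orientation within ang_tol_deg degrees.
    Transforms carry a (x, y, z) position and a unit quaternion rotation."""
    if math.dist(obj["pos"], constraint["pos"]) > pos_tol:
        return False
    # Angle between two unit quaternions: 2 * acos(|q1 . q2|)
    dot = abs(sum(a * b for a, b in zip(obj["rot"], constraint["rot"])))
    angle_deg = 2 * math.degrees(math.acos(min(1.0, dot)))
    return angle_deg <= ang_tol_deg


constraint = {"pos": (0.0, 0.1, 0.0), "rot": (0, 0, 0, 1)}
placed = {"pos": (0.004, 0.1, 0.0), "rot": (0, 0, 0.02, 0.9998)}

print(step_complete(placed, constraint))  # True: within both tolerances
```

Using the absolute quaternion dot product handles the double-cover property (q and -q represent the same rotation), so a correctly placed part is never rejected for sign flips.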