Inspiration

Some people think better on their feet, literally! In a world of back-to-back online meetings for school, work, and research, many people find themselves sitting in the same place for hours on end, unable to even stand up and stretch for fear of missing part of a meeting. If you're like us, sometimes you just have to get up and pace back and forth to really get your creative juices flowing, so we created wandr, a device that lets you stay connected to your meetings while walking around!

What it does

Wandr operates as a virtual webcam that knows when you are sitting in front of your computer and automatically switches to its onboard webcam as you get up, allowing you to move during your meeting without missing a beat! Because it connects to your computer wirelessly, you can go wherever you want as long as you remain on the same network. This allows you to stretch, pace, get a coffee (or Red Bull) refill, and do just about anything else while remaining engaged in a meeting. When you return, it automatically switches you back to your original setup, allowing you to seamlessly transition back to your workstation. With a wrist-mounted control system, you can also mute, turn off your camera, and hang up (with indicator LEDs!), so that you remain in control while away from your workstation!

How we built it

Physical Interface

The physical interface is built on a Raspberry Pi, which takes the data from the wrist-mounted switches and sends it to the server via a custom REST endpoint, while also serving the stream from the physical webcam as an IP webcam.
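As a rough sketch of the wrist-switch side, the Pi could package each button press as a small JSON message and POST it to the server. The endpoint path `/api/button`, the payload shape, the button names, and the server address are all assumptions made for illustration, since the actual endpoint is not described here. Only the standard library is used, and reading the GPIO pins is left out.

```python
import json
import urllib.request

# Hypothetical button names, loosely based on the controls described above.
VALID_BUTTONS = {"mute", "camera", "hangup", "switch"}


def build_button_request(server_url, button):
    """Build (but do not send) a POST request describing one button press."""
    if button not in VALID_BUTTONS:
        raise ValueError(f"unknown button: {button!r}")
    body = json.dumps({"button": button}).encode("utf-8")
    return urllib.request.Request(
        url=f"{server_url.rstrip('/')}/api/button",  # assumed endpoint path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_button(server_url, button, timeout=2.0):
    """Send the press to the server and return the HTTP status code."""
    request = build_button_request(server_url, button)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping request construction separate from sending makes the message format easy to check without a server running. On the Pi, a GPIO callback would call something like `send_button("http://192.168.1.10:5000", "mute")`, where the address is a placeholder.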

Server

The server picks up a video stream from both the computer's integrated webcam and the IP webcam on the Raspberry Pi. When triggered by the Raspberry Pi, the server switches from the integrated webcam feed to the IP webcam feed. The server is implemented using Python/Flask.
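The switching logic on the server can be pictured as a tiny piece of shared state that a request handler flips. The sketch below is not the project's code: the `FeedSelector` class, the feed names, and both source values are hypothetical, and it deliberately avoids Flask so that it runs with the standard library alone.

```python
import threading

# Placeholder sources: device index 0 for the integrated webcam and an
# illustrative RTSP URL for the Pi. Neither value comes from the project.
FEEDS = {
    "integrated": 0,
    "pi": "rtsp://raspberrypi.local:8554/stream",
}


class FeedSelector:
    """Tracks which feed is live; safe to call from concurrent handlers."""

    def __init__(self, initial="integrated"):
        if initial not in FEEDS:
            raise ValueError(f"unknown feed: {initial!r}")
        self._active = initial
        self._lock = threading.Lock()

    @property
    def active(self):
        with self._lock:
            return self._active

    def toggle(self):
        """Swap to the other feed and return the name of the new active feed."""
        with self._lock:
            self._active = "pi" if self._active == "integrated" else "integrated"
            return self._active

    def source(self):
        """Return the capture source for the currently active feed."""
        return FEEDS[self.active]
```

In a Flask app, the route that receives the Pi's trigger would call `toggle()` on a single shared `FeedSelector`; the lock matters because a development server may handle requests on more than one thread.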

Challenges we ran into

It was difficult to switch between streams while also presenting the result as a virtual webcam, but once we figured that out, setting up the rest of the server-side functionality (the REST API) was fairly straightforward. It was also difficult to physically package everything "neatly" enough to be portable without falling apart. (A LOT of zip ties were involved!)

DirectShow

Our original idea for the server was to write a DirectShow filter. Unlike the RTSP feed from the Pi, DirectShow video capture devices can be used directly in apps like Zoom and Discord. Though the DirectShow API is deprecated, the apps that we targeted (Zoom and Discord) both use this older API instead of the Microsoft Media Foundation framework that was meant to replace it. However, documentation and sample code for DirectShow video capture filters were extremely sparse. After an entire day spent debugging our filter, we were eventually forced to give up. Luckily, there are existing DirectShow filters that accept RTSP feeds as input, as long as we were willing to provide our own separate method of switching between the inputs. Using Python's pynput library to send Zoom-specific keyboard commands from our Flask server worked out alright for our proof-of-concept.

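import time
from pynput.keyboard import Controller, Key

keyboard = Controller()
state = {'main': True}  # True while the main webcam is active (initial value assumed)

def key_sequence(keys, pause=0.1):
    # Press every key in order, hold the chord briefly, then release in reverse order.
    for key in keys:
        keyboard.press(key)
    time.sleep(pause)
    for key in reversed(keys):
        keyboard.release(key)

def cam_switch():
    # Send the camera-switch shortcut to Zoom and flip the tracked camera state.
    key_sequence([Key.alt_l, 'c'])
    state['main'] = not state['main']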
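The same approach could extend to the other wrist controls with a small lookup table. This is only a sketch under stated assumptions: the button names and the `handle_button` helper are hypothetical, and the chords shown (Alt+A for mute, Alt+V for video, Alt+Q to end the meeting) are Zoom's default Windows shortcuts as far as we know, not something confirmed from the project. Keys are plain strings here so the example runs without pynput installed.

```python
# Assumed mapping from wrist buttons to Zoom keyboard shortcuts.
SHORTCUTS = {
    "mute": ("alt", "a"),
    "camera": ("alt", "v"),
    "hangup": ("alt", "q"),
}


def handle_button(button, send_keys):
    """Look up the chord for a button and pass it to send_keys.

    send_keys is any callable that accepts a tuple of key names, for
    example a thin wrapper around a pynput-based key sender.
    Returns True if the button was recognized, False otherwise.
    """
    chord = SHORTCUTS.get(button)
    if chord is None:
        return False
    send_keys(chord)
    return True
```

Passing the key sender in as an argument keeps the mapping testable without a keyboard backend and makes it straightforward to swap in shortcuts for another app.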

Accomplishments that we're proud of

Since neither of us had ever worked with UI/UX, it was difficult to figure out the best way to make wandr intuitive to use, but we believe that a physical interface was the best way to achieve this.

We're really happy that despite challenges setting up a virtual camera device, we were still able to piece together a solution that (while a bit limited in features) accomplishes our goal while placing minimal burden on the user.

What we learned

Though we were not ultimately successful, we learned a lot about debugging DirectShow filters. We also gained a better understanding of how the DirectShow API works at a high level, which is a great jumping-off point for similar projects in the future.

We also learned more about working with audiovisual data from a wireless transmission standpoint. Since we mainly work with hardware and raw data, this was a particularly fun learning experience for both of us!

What's next for wandr

We would like to improve the security of the camera feed, reduce the size of the physical interface, and upgrade the camera to allow for a wider field of view at a closer distance. This would make the system less intrusive when participating in teleconferences in the traditional, seated manner.

Additionally, implementing the DirectShow filter mentioned above would let us support more apps, improve overall stability, and add image processing features such as background removal and stabilization, which Zoom might have out of the box but other apps like Discord do not.
