Sometimes we just want our family and friends to be with us, even if we are miles away from them. The closest we get to this is video chatting with them- but it still feels unreal. This very thought inspired me to create an immersive way of video chatting wherein it would seem that the person you are talking to is standing right in front of you and both of the peers are in the same environment using Virtual Reality.
What it does
Sair, in Hindi means to go on for a peaceful walk or a stroll with someone. This is what this platform does exactly. The conceptual version picks up two people who are video chatting and semantically segment them from their current environment, in real time and put the person in virtual space of each other. The segmented body in the VR space makes you feel that the other person and you are both in the same space and thus justifying the term immersive video chatting...
How I built it
The whole platform was built in three layers.
Layer - 2) As soon as the video chat platform was ready, it was time for semantic segmentation. Semantic segmentation essentially means to pick the person's body from the current environment and subtract the environment from the video stream such that the body is the only object being streamed. I initially started to write my own neural net with OpenCV to do that, but the rendering of one video of 8 seconds in non-real time took about 1 hour on my local machine. Hence I came across, Bodypix library by google scholar which I tweaked a bit to get what I wanted and in real time.
Challenges I ran into
- React 360 is very restrictive way to do this. It does not support the segmentation part on user's end- hence we have to stream the segmented video. It also just plays video which are in the local asset store...
- Currently I am just playing a video which I have stored in the local asset store- but the dropbox app server request does work and the downloading does take place. It is just - I wasn't sure about the location of the directory where the downloaded file was saved and ran out of time to figure it out.
- Initially the segmentation also took a lot of time and efforts. The problem with segmentation is, currently the video is streamed from one user to another user, from where it goes to google's open source GPUs and then is streamed back after segmentation. This requires high processing and compression rate which as lossless as possible.
- Currently the app is not able to live stream the video chat because of React 360's limitation- but the idea can surely be implemented with Unity and better tech stack
Accomplishments that I'm proud of
- The very idea of immersive platform for video chat is an accomplishment in itself. It is going to revolutionize the social networking idea in future. It has applications in corporate and personal lives as well.
- The real time segmentation is lossy but still works. To pull off a video chat segmentation was a great learning for me.
- I made this alone with my summer grad school which was really hectic. This helped me to believe in myself that I can manage time in a more efficient manner and can give good results with such arduous parameters.
What I learned
- This is my first project in React 360 or virtual reality as a whole
- I created a neural net by myself for segmentation purposes, i didn't use it though, but still a learning
- Going back to basic p2p connections made me revise all the concepts that I had forgotten
- Helped me think in a more creative and indigenous way
What's next for Sair
- Implementing this full fledged platform using Unity
- Making it available on the VR stores. It must be compatible with all the VR devices
- Adding features such as haptic sensor hugs, haptic sensor handshakes, and other such features, which would give the person a sense of touch and feel along with the stream.
- Github Repo has two sub-directories-(Sair360Web for Virtual Reality App & SairWebDev for Layer1 and Layer2)
- Supports only google chrome for now..