Inspiration
I always get frustrated with AirDrop and with 'sync' services like iCloud, Dropbox, or Notes (or e-mailing or texting myself) as ways of getting stuff between my phone and laptop. It feels like it would be way better to just point my phone at my laptop -- to use the fact that the phone and laptop exist in the same real-world space.
What it does
Still really rough and a work in progress (and probably not online as you read this), but: you point your phone camera at a page open on your laptop, the phone uses the marker on that page to figure out where it is relative to your laptop, and then it casts a blue 'flashlight beam' onto (roughly) the spot your phone is pointing at on the laptop screen. You can move the phone around to move its footprint on the laptop.
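The core of that interaction can be sketched as a ray-plane intersection: once you have a pose estimate for the phone in the screen's coordinate frame, cast the phone's viewing ray onto the screen plane. This is a minimal toy version, not the actual Cone code -- the convention of putting the screen at z = 0 is an assumption:

```javascript
// Given the phone camera's position and forward direction in the laptop
// screen's coordinate frame, intersect its viewing ray with the screen
// plane (assumed to be z = 0) to find where the "flashlight" lands.
function beamFootprint(origin, dir) {
  if (Math.abs(dir.z) < 1e-9) return null; // ray parallel to the screen
  const t = -origin.z / dir.z;             // solve origin.z + t*dir.z = 0
  if (t < 0) return null;                  // screen is behind the phone
  return { x: origin.x + t * dir.x, y: origin.y + t * dir.y };
}

// Example: phone 40cm in front of the screen, pointing straight at it.
const hit = beamFootprint({ x: 0.1, y: 0.2, z: 0.4 }, { x: 0, y: 0, z: -1 });
// hit ≈ { x: 0.1, y: 0.2 }
```

The nice property is that the math stays the same whether the phone is close or far; only the pose estimate changes.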
How we built it
Mostly JavaScript, some computer vision libraries, some Go.
Challenges we ran into
We tried just using a QR code at first, but QR codes don't actually detect reliably (so you don't want to do live tracking with them), and we couldn't do the pose estimation with them either (OpenCV.js doesn't include the PnP solver).
We needed camera intrinsics to do the pose estimation, and we didn't want to go through a full camera-calibration procedure, so we ended up making a little iOS/ARKit app that just spits out the camera intrinsics matrix the iOS API provides.
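For reference, the intrinsics matrix is the standard 3x3 pinhole-camera matrix K, which is what ARKit exposes per frame as `ARCamera.intrinsics`. A small sketch of its shape and how it maps a camera-space point to pixel coordinates -- the numeric values here are made-up placeholders, not our phone's actual intrinsics:

```javascript
// Pinhole intrinsics matrix: fx/fy are focal lengths in pixels,
// (cx, cy) is the principal point.
function makeK(fx, fy, cx, cy) {
  return [
    [fx, 0, cx],
    [0, fy, cy],
    [0, 0, 1],
  ];
}

// Project a camera-space point (x, y, z), z > 0, to pixel coordinates.
function project(K, [x, y, z]) {
  return { u: (K[0][0] * x) / z + K[0][2], v: (K[1][1] * y) / z + K[1][2] };
}

const K = makeK(1500, 1500, 960, 540); // hypothetical phone-ish values
const px = project(K, [0.1, 0.05, 1.0]);
// px ≈ { u: 1110, v: 615 }
```

A PnP solver does the inverse of this projection: given K and several known tag corners plus their detected pixel positions, it recovers the camera pose.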
Phone camera access requires HTTPS, so we had to do weird reverse-proxy stuff: ngrok at first, then we ran out of our ngrok quota and switched to localhost.run, but the constant domain churn meant I was refreshing all the time, so we finally got a VPS, ran our Go server behind a Caddy TLS reverse proxy on it, and pointed cone.omar.website at it.
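The final setup was roughly this shape -- a minimal Caddyfile sketch, where the upstream port is a hypothetical placeholder for wherever the Go server listens:

```
cone.omar.website {
    # Caddy obtains and renews the TLS certificate automatically,
    # then proxies all traffic to the Go server on this VPS.
    reverse_proxy localhost:8080
}
```

Caddy's automatic HTTPS is what made this the low-effort option: once DNS points at the VPS, there's no manual certificate step.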
We wanted to use the accelerometer to keep tracking smooth -- to interpolate / dead-reckon even when the phone can't see the tag (especially useful under the QR-code plan, since QR codes are unreliable to detect). This is doable but hard! It's a classic sensor-fusion and filtering problem...
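The idea, which we didn't end up building, looks roughly like this: dead-reckon from the accelerometer between tag detections, then blend back toward the camera-derived position whenever the tag is visible. A real version needs proper sensor fusion (e.g. a Kalman filter); this is only a toy one-dimensional complementary filter to show the shape of it:

```javascript
// Toy 1-D tracker: integrate acceleration when the tag is lost,
// pull toward the camera's position estimate when the tag is seen.
function makeTracker(alpha = 0.9) {
  let pos = 0, vel = 0;
  return {
    // Called every frame with acceleration (m/s^2) and elapsed time (s).
    predict(accel, dt) {
      vel += accel * dt;
      pos += vel * dt;
      return pos;
    },
    // Called when the tag is detected: mostly trust the camera.
    correct(cameraPos) {
      pos = alpha * cameraPos + (1 - alpha) * pos;
      vel *= 1 - alpha; // bleed off accumulated drift
      return pos;
    },
  };
}
```

The hard parts this glosses over are exactly the classic ones: accelerometer bias, double-integration drift, and aligning the phone's IMU frame with the screen's coordinate frame.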
It still drops a lot of frames and doesn't feel smooth, so the interaction isn't satisfying. Maybe do a native app instead? Maybe do P2P communication between laptop and phone instead of running everything through the cloud server?
Accomplishments that we're proud of
Getting everything connected and talking
Tag detector works! Runs fully in browser
Pose estimation and camera intrinsics seem to work -- the beam moves properly with my phone
What we learned
Working with sensor data would be hard. QR codes aren't that reliable. Browser permissions are very annoying. AprilTags have a lot of nice properties! It's nice that the detector is a self-contained library.
What's next for Cone
Actually do the sensor fusion? Try to fix the dropped frames? Build an actual application/demo where you can send info to the laptop and/or interact with it from the phone. We also want to be able to rotate the phone so it points, instead of having to look through the phone screen -- that Wii-Remote-type interaction feels like it would be better.
Multiple colors for multiple people's phones so it's a fun multiplayer activity. Could scale up to a projector or TV...