Inspiration
The original idea for the project came from a desire to enhance the way people experience youth sporting events. By taking advantage of the abundance of cameras in everyone's pockets, we believed we could provide an improved viewing experience for audience members both in the stands and at home. However, due to the difficulty of implementing an IP webcam, we decided on an analytics approach with many widespread uses! By creating a distributed real-time localization system, we can provide advanced tracking of individuals for complex data analysis.
What it does
SportView is capable of a variety of tasks. It tracks unique individuals across four cameras spread around a field. It doesn't simply track them in an image: the system provides a precise location for each person. Once a person is identified and localized, that information is sent to Amazon Web Services for more complex data analysis. From each player's identity and location history, we can intuitively graph key metrics useful for sports! These include distance traveled, calories burned, instantaneous speed, ball possession time, and position on the field.
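As a rough sketch of how metrics like these fall out of timestamped positions, here is a minimal example (the function name, the default weight, and the MET-based calorie formula are our illustrative assumptions, not the exact code used in SportView):

```python
import math

def compute_metrics(track, weight_kg=70.0, met=8.0):
    """Derive per-player analytics from a list of (timestamp_s, x_m, y_m)
    samples produced by the localization system."""
    distance = 0.0
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        step = math.hypot(x1 - x0, y1 - y0)  # straight-line distance between samples
        distance += step
        dt = t1 - t0
        speeds.append(step / dt if dt > 0 else 0.0)  # instantaneous speed, m/s
    elapsed_h = (track[-1][0] - track[0][0]) / 3600.0 if len(track) > 1 else 0.0
    # Rough calorie estimate: kcal ≈ MET * body weight (kg) * hours (assumed formula)
    calories = met * weight_kg * elapsed_h
    return {"distance_m": distance, "speeds_mps": speeds, "calories_kcal": calories}
```

Ball possession time could be computed the same way, by accumulating intervals where a player's position is within some threshold of the ball's.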
In addition to these analytics for sports, the system could be extended to use in a variety of fields. Some examples include the construction industry, factories, warehouses, and security. In each of these areas, there is value to be provided by keeping track of individuals. Whether it’s optimizing resource flow in a factory or ensuring workers don’t go anywhere dangerous, SportView can help.
How we built it
The distributed localization system was built exclusively with OpenCV. The process starts by undistorting every image captured by the cameras around the field, which requires calibrating each camera individually. We then use a neural network to detect people in each image, and HSV color thresholding to detect the ball. Detections of the same person across cameras are matched by the principal hue component of their shirt, which is what lets us both identify and localize individuals. Triangulation assumes a known location and orientation for each camera in the world: using the calibration parameters and the center of the detected person's bounding box in two cameras, we triangulate that person's position. Once the position is known, the person's ID and location are sent to Amazon Web Services.
The backend runs primarily on AWS, using Lambda, S3, and DynamoDB. It is structured as follows: each node, connected to two cameras, sends a Lambda the processed data on where people are in the shot, along with the coordinates of the ball if it appears. That Lambda converts the information into a NoSQL item and stores it in DynamoDB. Every time a new item is put into this table, a second Lambda is triggered to parse the new information and calculate the analytics available in our app; the results are stored back in DynamoDB in a separate table, ready for consumption by our web application. The web application is written in Angular 7 and works closely with the AWS backend: it fetches the calculated analytics through a Chalice app, which returns the data in a JSON-serializable format the Angular app can store and process. The web app then feeds this data into HighCharts, a JS library that renders it as graphs, and refreshes it every 2.5 s for dynamic updates on the front end.
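The first Lambda in that pipeline might be shaped roughly like this sketch. All field names, the table name, and the payload format are illustrative assumptions; in real use, `table` would be `boto3.resource("dynamodb").Table(...)`, injected here as a parameter so the handler stays testable:

```python
import json
import time

def build_item(node_id, detection):
    """Turn one node's processed detection into a DynamoDB-style item
    (field names here are illustrative assumptions)."""
    return {
        "person_id": detection["person_id"],
        "timestamp": detection.get("timestamp", time.time()),
        "x": detection["x"],
        "y": detection["y"],
        "node_id": node_id,
        "has_ball": detection.get("has_ball", False),
    }

def handler(event, context, table=None):
    """Lambda entry point: parse a node's payload and store one item per
    detected person; each put can trigger the downstream analytics Lambda."""
    payload = json.loads(event["body"]) if "body" in event else event
    items = [build_item(payload["node_id"], d) for d in payload["detections"]]
    if table is not None:
        for item in items:
            table.put_item(Item=item)
    return {"statusCode": 200, "body": json.dumps({"stored": len(items)})}
```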
Challenges we ran into
Kellen faced a lot of issues related to the math of triangulating the person's position. Trial and error was crucial for catching and correcting some of the late-night math mistakes! Beyond that, we learned how important it is to have a good estimate of the camera locations and orientations.
Accomplishments that we're proud of
We're proud to call SportView a distributed system. Being able to add as many nodes as needed helps our software serve a wider variety of clients, and the image processing that happens on each node relieves pressure on the backend. Localization is another aspect we're proud to have designed ourselves. We asked: how can we create cutting-edge technology out of devices everyone has on hand? By using the difference in an object's position between images, we are able to calculate that object's position in the world.
What we learned
We learned a lot about the math behind computer vision. Our project dealt with several key aspects of computer vision: camera calibration, neural network implementation, object detection with color thresholding, and 3D localization of an object.
What's next for SportView
Contrary to the name, we designed SportView to have many applications outside of sports. We believe this system could easily be used for security purposes: all we would need to add is a facial recognition system, deployed on each node, that communicates what it sees with the backend. Another application is construction. Given our ability to identify people, we could look closer to check whether they are wearing the correct PPE, or watch their movement to see whether they are working safely. There are many applications for our system, and these are just a couple of examples.