
At the time of submission, some issues kept me from finishing the description section for the project, so I am posting the rest as an update.

Accomplishments that we're proud of

I was able to create a dataset diverse enough for my model to annotate videos taken in different lighting conditions and containing different types of handwritten content, such as geometric shapes and formulas.

Reconstructing an image from the image crops in a temporal group proved non-trivial because the bounding boxes sit at slightly different positions. I solved it by first computing the coordinates of a reconstructed bounding box that encloses every bounding box in the group. Each crop was then added into a sum matrix (with the dimensions of the reconstructed bounding box), while a second matrix counted how many additions were made at each pixel index. Dividing the sum matrix by the count matrix element-wise gives the average, which is the reconstructed image.
Sometimes
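
The averaging step described above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the project's actual code: the name `reconstruct`, the `(x_min, y_min, x_max, y_max)` box layout with exclusive maximums, grayscale crops stored as nested lists, and the choice to leave uncovered pixels at 0 are all assumptions made for the example.

```python
from typing import List, Tuple

# Assumed layout: (x_min, y_min, x_max, y_max) with exclusive max coordinates.
Box = Tuple[int, int, int, int]
# Assumed image format: grayscale rows of pixel values.
Image = List[List[float]]


def reconstruct(crops: List[Image], boxes: List[Box]) -> Tuple[Image, Box]:
    """Average overlapping crops inside the box that encloses all of them."""
    # Reconstructed bounding box that encloses every box in the group.
    x0 = min(b[0] for b in boxes)
    y0 = min(b[1] for b in boxes)
    x1 = max(b[2] for b in boxes)
    y1 = max(b[3] for b in boxes)
    width, height = x1 - x0, y1 - y0

    total = [[0.0] * width for _ in range(height)]  # sum matrix
    count = [[0] * width for _ in range(height)]    # additions per pixel

    for crop, (bx0, by0, bx1, by1) in zip(crops, boxes):
        for r in range(by1 - by0):
            for c in range(bx1 - bx0):
                # Shift crop coordinates into the enclosing box's frame.
                total[by0 - y0 + r][bx0 - x0 + c] += crop[r][c]
                count[by0 - y0 + r][bx0 - x0 + c] += 1

    # Pixels that no crop covered stay at 0 (an assumption) to avoid
    # dividing by zero.
    averaged = [
        [t / n if n else 0.0 for t, n in zip(t_row, n_row)]
        for t_row, n_row in zip(total, count)
    ]
    return averaged, (x0, y0, x1, y1)


# Two 2x2 crops whose boxes are offset by one pixel horizontally.
image, box = reconstruct(
    [[[10, 10], [10, 10]], [[30, 30], [30, 30]]],
    [(0, 0, 2, 2), (1, 0, 3, 2)],
)
print(box)    # (0, 0, 3, 2)
print(image)  # [[10.0, 20.0, 30.0], [10.0, 20.0, 30.0]]
```

Only the middle column is covered by both crops, so it is the only one that gets averaged. With an array library the nested loops would collapse into slice additions, but the bookkeeping stays the same.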

What we learned

How to develop a Python-based GUI application.

How easily an object detection model can be trained with PyTorch.

What's next for Whiteboard content summarizer

Once a dataset of annotated video lectures with different colored boards becomes available, the project can be scaled to any kind of board by making minor tweaks to the binarization section of the pipeline.

Camera movement can also be accounted for by changing how spatial groups are defined.
