Check in #2
Introduction
We are planning to implement an existing paper, "Show and Tell: A Neural Image Caption Generator" (https://arxiv.org/pdf/1411.4555.pdf). The paper tackles a structured prediction and classification problem: using a neural network that processes an image to generate a caption for it. We chose this paper because it seemed like an interesting and practical way to use the skills we’ve learnt in this class. Who hasn’t taken a picture and wished that it was automatically given a really objective caption?
Challenges
So far, our work has focused on preprocessing our data into a usable format. The Flickr8k dataset we’re using is somewhat disorganised, so we have concentrated on arranging the data (images and captions) in a logical way that will make training and testing easier. We’ve also begun loading the data into Python, which has proven slightly more difficult than we expected because the images in this dataset are in a different format from what we’ve seen in class so far. We may yet have to change how this is organised, but we’re satisfied with the setup so far.
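To make the organising step concrete, here is a minimal sketch of grouping captions by image. It assumes the captions arrive as text lines of the form `<image>.jpg#<n>`, a tab, then the caption, which is how the Flickr8k token file is commonly distributed; the function name `group_captions` and the sample filenames are hypothetical and only for illustration.

```python
from collections import defaultdict


def group_captions(lines):
    """Map each image filename to the list of its captions.

    Assumes each line is "<image>.jpg#<n>", a tab, then the caption.
    Adjust the parsing if your copy of the dataset is laid out differently.
    """
    captions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        image_tag, _, caption = line.partition("\t")
        if not caption:
            continue  # skip lines with no tab-separated caption
        image_name = image_tag.split("#")[0]  # drop the "#<n>" caption index
        captions[image_name].append(caption.strip())
    return dict(captions)


# Made-up sample lines in the assumed format.
sample = [
    "dog_park.jpg#0\tA dog runs across the grass .",
    "dog_park.jpg#1\tA brown dog is playing outside .",
    "beach.jpg#0\tTwo children build a sandcastle .",
]
grouped = group_captions(sample)
```

Keying everything by image filename means the training and test splits can be built later by simply filtering the dictionary's keys.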
Insights
At the moment, we don’t have anything concrete to show in terms of a model. We do, however, have some of our preprocessing completed, which sets up the data in Python in a manner similar to that recommended in the paper.
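As an illustration of what caption preprocessing along these lines might look like, the sketch below builds a vocabulary and encodes a caption with start and stop markers. It is not our actual code. As we read the paper, it keeps words that appear at least five times in the training set, so `min_count` defaults to 5 here, but treat that as a tunable assumption. The token names (`<start>`, `<stop>`, `<unk>`, `<pad>`), the simple alphabetic tokenizer, and the fixed-length padding are our own illustrative choices.

```python
from collections import Counter

# Hypothetical special tokens; the names are illustrative choices.
START, STOP, UNK, PAD = "<start>", "<stop>", "<unk>", "<pad>"


def tokenize(caption):
    """Lowercase a caption and keep only purely alphabetic tokens."""
    return [w for w in caption.lower().split() if w.isalpha()]


def build_vocab(captions, min_count=5):
    """Assign an integer id to every word seen at least min_count times."""
    counts = Counter(w for c in captions for w in tokenize(c))
    words = sorted(w for w, n in counts.items() if n >= min_count)
    return {w: i for i, w in enumerate([PAD, START, STOP, UNK] + words)}


def encode(caption, vocab, max_len):
    """Wrap a caption in start/stop tokens, map to ids, and pad to max_len."""
    tokens = [START] + tokenize(caption)[: max_len - 2] + [STOP]
    ids = [vocab.get(t, vocab[UNK]) for t in tokens]  # rare words become UNK
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Tiny made-up corpus; min_count is lowered so some words survive.
captions = ["A dog runs .", "A dog sleeps .", "A cat runs ."]
vocab = build_vocab(captions, min_count=2)
encoded = encode("A dog jumps .", vocab, max_len=6)
```

Padding every caption to the same length keeps batches rectangular, and mapping rare words to a single unknown token keeps the embedding table small.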
Plan
We think we’re slightly behind on implementing the paper, but both of our schedules are freer this coming week than they have been, so we’re confident we will have the time to devote to it. We need to finish our preprocessing, then begin work on the components of our model (the CNNs and embeddings) and its overall architecture. We aren’t planning to change anything in particular at the moment. In the next few days, we plan to finish preprocessing; design, implement, and train our model; and then test it and make changes as necessary. After that, we will work on our ablation studies and qualitative evaluation.