The Juntos UI
Image generated on the "male" spectrum
Image generated by the GAN
Whiteboarding the data pipeline
Conversion code - interface between webapp, GAN, feature vectors

Inspiration

In recent news, many immigrant families have been torn apart by ICE as the government attempts to crack down on illegal immigration. This often results in long-term or even indefinite separation of parents and children, since the parents, many of whom are still fighting to stay in the US, have little to no information or time to track down their children at one of the many federal detention centers. Juntos' mission is to create an avenue by which these separated families can be reunited. Because young children typically have few ways to reach out to their families beyond recognizing them by sight, we have employed deep learning and computer vision to create a network of intelligence that will hopefully bring these families together.

What it does

Juntos, in a broad sense, serves as an image-to-individual matching system. Users, including both parents and children at detention facilities, submit a photo of themselves and, if they have one, a photo of the person they are looking for. If the user lacks such a photo, we use a generative adversarial network (GAN) to reconstruct an image of that person from basic descriptions of facial features such as face shape. Juntos then provides a common mobile platform on which users can identify one another, with features such as location maps of the images in the database and suggestions of other users' profile pictures that closely resemble the target description.

How we built it

The GAN pipeline behind Juntos builds on two pretrained models: one trained on the CelebA dataset, in which each image is labeled with facial attributes, and one trained on the Flickr-Faces-HQ dataset, an extremely diverse set of pictures of faces. We used these models to create a mapping from measurable facial features to the latent space (the input space of the GAN), which allows us to recreate faces from a set of descriptors. To make Juntos more user friendly, we experimented with two types of UI for selecting features: buttons and sliders. We tried buttons first, and they had one major drawback: selection was not always precise because some features were interconnected (e.g. increasing the probability of a goatee naturally made the entire face appear more masculine, even when 'female' was selected). In the end, we went with a slider system, which allows free-form configuration of features without confusing the user.
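
As a rough illustration of the feature-to-latent mapping described above, here is a minimal sketch that fits one latent direction per attribute with ordinary least squares. It is not our actual training code: the function names (`fit_attribute_directions`, `apply_sliders`), the linear model, and the synthetic data are all assumptions made for the example, and in a real system the attribute scores would come from the feature extractor.

```python
import numpy as np

LATENT_DIM = 512  # latent size mentioned in the project update


def fit_attribute_directions(latents, attributes):
    """Fit attributes ~ latents @ W + b by least squares.

    Returns one unit-length latent direction per attribute, shape (k, d).
    """
    n = latents.shape[0]
    design = np.hstack([latents, np.ones((n, 1))])  # extra column for the bias
    coef, *_ = np.linalg.lstsq(design, attributes, rcond=None)
    directions = coef[:-1].T  # drop the bias row; one row per attribute
    return directions / np.linalg.norm(directions, axis=1, keepdims=True)


def apply_sliders(z, directions, slider_values):
    """Shift latent z along each attribute direction by its slider value."""
    return z + np.asarray(slider_values) @ directions


# Synthetic stand-in data (hypothetical): random latents and attribute
# scores that depend linearly on them, plus a little noise.
rng = np.random.default_rng(0)
latents = rng.standard_normal((2000, LATENT_DIM))
true_directions = rng.standard_normal((3, LATENT_DIM))
attributes = latents @ true_directions.T + 0.1 * rng.standard_normal((2000, 3))

directions = fit_attribute_directions(latents, attributes)
z_start = rng.standard_normal(LATENT_DIM)
z_edited = apply_sliders(z_start, directions, [1.5, 0.0, -0.5])
```

Each slider then corresponds to a step along one row of `directions`, which is one way to get the free-form control that the slider UI exposes.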

Challenges we ran into

One challenge was that computing a good mapping from the feature space to the latent space required far more data than our server could handle at once. As a result, we had to pipeline the processing carefully, for example by computing the mapping in a series of iterated steps. Another challenge was integrating the project components. Because our project spanned a wide range of technologies, each of us initially worked on our own portion, so integrating everything effectively at the end was crucial to our success.
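
One generic way to fit a linear map without holding all of the data in memory is to accumulate the normal equations chunk by chunk. The sketch below shows that idea only; it is an assumption about what "iterated steps" could look like, not a record of the pipeline we ran. `fit_in_chunks`, the `ridge` parameter, and the array sizes are all hypothetical.

```python
import numpy as np


def fit_in_chunks(chunks, dim_in, dim_out, ridge=1e-6):
    """Least-squares fit of y ~ x @ W, seeing one chunk of rows at a time."""
    xtx = np.zeros((dim_in, dim_in))
    xty = np.zeros((dim_in, dim_out))
    for x, y in chunks:
        # Only these two small accumulators stay in memory between chunks.
        xtx += x.T @ x
        xty += x.T @ y
    # A small ridge term keeps the solve well conditioned.
    return np.linalg.solve(xtx + ridge * np.eye(dim_in), xty)


rng = np.random.default_rng(0)
features = rng.standard_normal((1000, 8))   # stand-in for facial feature scores
targets = rng.standard_normal((1000, 16))   # stand-in for latent vectors
chunks = zip(np.array_split(features, 10), np.array_split(targets, 10))

w_chunked = fit_in_chunks(chunks, dim_in=8, dim_out=16)
# Reference fit on the full data, possible here only because the demo is tiny.
w_full, *_ = np.linalg.lstsq(features, targets, rcond=None)
```

In this toy setup the chunked result agrees with the full-data fit up to the tiny ridge term, while memory use depends on the chunk size rather than the dataset size.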

Accomplishments that we're proud of

One accomplishment we're proud of is that we managed to understand the details behind the GAN algorithm by reading many papers and experiments online, even though none of us came in with any prior experience with GANs. We also put together a fully functional pipeline in a condensed amount of time. One of our group members even managed to connect on LinkedIn with one of the first authors of the GAN!

What we learned

Through building our project, we learned good design and version control practices that helped prevent major setbacks when things went wrong. We also learned about servers and various other technologies that together comprise a full stack project.

What's next for Juntos

With regard to the technical aspect of our project, we would like to improve the feature extractor we used. As noted above, it is currently trained on celebrity images, and so is biased toward features that celebrities tend to have. Training the feature extractor on a dataset more representative of our target population, with labels more closely corresponding to the features we want to tune, would make the generator easier to steer toward a target image.

This project can be generalized to not just reuniting migrant families, but also searching for people in general. Some possibilities are searching for lost children and missing persons, or allowing law enforcement to create more accurate pictures from eyewitness accounts. Swapping the backend out for GANs trained on other datasets would allow for searching for other things as well - in particular, a GAN trained to generate pictures of dogs, or pictures of cats, would allow people to use this app to search for runaway pets.

Built With

Submitted to

TreeHacks 2019
- Winner [Cerebras] Best Deep Learning Hack

Created by

I implemented and trained the mapping between feature space and the latent space (input to the GAN), which allowed for tweaking features from a starting image.

Ethan Ordentlich
I worked on the GAN network design and training, and also helped make sure the results of the GAN model made it through the backend and front end.

Erich Liang
I managed the overall strategy, and created the Node.js backend and the Python-to-Node message queue that connected the neural net to our backend. I also worked on some of the frontend in React, and set up the AWS server to host our web app.

Alexander Cui
Random idea generator with a good filter
Raphael Vigee

Updates

Alexander Cui posted an update — Jun 21, 2019 01:58 PM EDT

Update 1: More stable face generator!

We've been figuring out how to make Juntos a more expressive and also realistic face generator. Primarily, we've fixed two main problems:

- The generation of unrealistic, oversaturated faces
- Changing an attribute like nose size also changes age or gender

After consulting with the ML community at Caltech and digging deeper into recent papers, we've converged on the following solutions:

- Seeding the generator's latent vector with a random starting image
- Orthogonalizing the "age" and "gender" axes against all the other attributes

Here's a brief explanation of why it works:

- The problem with our previous demo was that the latent vector, which generates an image, was initialized as 512 zeros. As a result, any perturbation created huge relative differences between vector values (basically, dividing by zero). By seeding the vector with a draw from a 512-dimensional standard normal, we get much more stable generation.
- Naturally, our GLM finds that attributes like "hairline" are heavily correlated with "age". By orthogonalizing, we remove the "age" part of the "hairline" vector to create a purer "hairline" attribute that you can manipulate without changing the rest of the image.
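
Both fixes can be sketched in a few lines of NumPy. This is a simplified illustration, not our production code: the attribute vectors here are random stand-ins, and the names `seed_latent` and `remove_component` are hypothetical. The orthogonalization is a standard projection onto the complement of the "age" and "gender" axes, done via QR so that it still works if those two axes are not orthogonal to each other.

```python
import numpy as np

LATENT_DIM = 512


def seed_latent(rng):
    """Start from a standard normal draw instead of a vector of 512 zeros."""
    return rng.standard_normal(LATENT_DIM)


def remove_component(direction, nuisance_axes):
    """Remove from `direction` everything lying in span(nuisance_axes)."""
    # Orthonormal basis for the nuisance subspace, shape (LATENT_DIM, m).
    q, _ = np.linalg.qr(np.stack(nuisance_axes, axis=1))
    cleaned = direction - q @ (q.T @ direction)
    return cleaned / np.linalg.norm(cleaned)


rng = np.random.default_rng(0)
age = rng.standard_normal(LATENT_DIM)
gender = rng.standard_normal(LATENT_DIM)
# Stand-in "hairline" vector, deliberately correlated with "age".
hairline = rng.standard_normal(LATENT_DIM) + 0.8 * age

pure_hairline = remove_component(hairline, [age, gender])

z = seed_latent(rng)
z_edited = z + 2.0 * pure_hairline  # move along hairline only
```

After the projection, a step along `pure_hairline` leaves the latent's coordinates along `age` and `gender` unchanged, which is the behavior the sliders need.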

Next steps

We hope to pursue facial reconstruction next, so you can modify images that you upload! The greatest barrier to that is creating a cycle-consistent face encoder, which will require some sort of auto-encoder or bi-GAN. To be continued!


Ethan Ordentlich started this project — Feb 17, 2019 09:53 AM EST

Leave feedback in the comments!
