3D Material Synthesis
Estimate high-resolution surface normals from a single view using multiple light sources. This convolutional approach estimates the surface geometry for photo-realistic rendering. Upload photos from your mobile phone and create materials for Unity3D or Blender.
Generate Data with the Universal Render Pipeline in Unity3D
Four light sources are placed around a scene to mimic LEDs on a tripod-like structure.
The lights turn on one at a time and a screenshot is captured for each. A labeled data set is created from the simulated images and their corresponding normal map.
The normal map encodes how bumpy or curved the surface is so that light can interact with it in a realistic manner. More information about normal maps can be found here: https://docs.unity3d.com/Manual/StandardShaderMaterialParameterNormalMap.html
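As a minimal sketch of that encoding, assuming the common tangent-space convention in which each component of a unit normal in [-1, 1] is mapped linearly to an 8-bit color channel in [0, 255]. The function names are illustrative and are not part of this repository.

```python
import numpy as np

def encode_normals(normals):
    """Map unit normals with components in [-1, 1] to 8-bit RGB in [0, 255]."""
    rgb = (normals * 0.5 + 0.5) * 255.0
    return np.round(rgb).astype(np.uint8)

def decode_normals(rgb):
    """Map 8-bit RGB values back to unit-length normals."""
    normals = rgb.astype(np.float64) / 255.0 * 2.0 - 1.0
    length = np.linalg.norm(normals, axis=-1, keepdims=True)
    # Re-normalize because 8-bit quantization slightly changes the length.
    return normals / np.maximum(length, 1e-8)

# A flat surface points straight out of the texture: normal (0, 0, 1).
flat = np.array([[[0.0, 0.0, 1.0]]])
print(encode_normals(flat))  # the light blue typical of a flat normal map
```

Decoding is only approximate because of the 8-bit quantization, which is why the decoded vectors are re-normalized.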
Here is the difference between a surface with and without a normal map while being illuminated with a directional light at 45 degrees
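The comparison can be approximated numerically with Lambertian (diffuse) shading, where brightness is the clamped dot product of the surface normal and the light direction. This is a minimal sketch, not Unity's shader: it assumes a light tilted 45 degrees from the surface normal in the x-z plane, and the ripple shape and function name are made up for illustration.

```python
import numpy as np

def lambert_shade(normals, light_dir):
    """Diffuse brightness per pixel: max(0, n . l) for unit normals n."""
    light_dir = np.asarray(light_dir, dtype=np.float64)
    light_dir = light_dir / np.linalg.norm(light_dir)
    return np.clip(normals @ light_dir, 0.0, 1.0)

# Directional light at 45 degrees to the surface (assumed to lie in the x-z plane).
angle = np.radians(45.0)
light = np.array([np.sin(angle), 0.0, np.cos(angle)])

height, width = 4, 8

# Without a normal map every pixel uses the flat normal (0, 0, 1).
flat = np.zeros((height, width, 3))
flat[..., 2] = 1.0

# With a normal map the normals tilt along x; here an illustrative ripple.
slope = 0.5 * np.cos(np.linspace(0.0, 2.0 * np.pi, width))
bumpy = np.zeros((height, width, 3))
bumpy[..., 0] = -slope
bumpy[..., 2] = 1.0
bumpy /= np.linalg.norm(bumpy, axis=-1, keepdims=True)

flat_shade = lambert_shade(flat, light)    # uniform brightness
bumpy_shade = lambert_shade(bumpy, light)  # brightness varies with the ripple
print(flat_shade.min(), flat_shade.max())
print(bumpy_shade.min(), bumpy_shade.max())
```

The flat surface comes out uniformly lit, while the perturbed normals produce the light and dark bands that make the surface look textured.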
Finding Textures Online
The training data is composed of high-resolution textures with normals from sources like https://www.substance3d.com/
A web crawler finds training data on websites that provide free textures.
To run the crawler:
python webscrape.py --BASE_URL https://3dtextures.me/ --PATTERN https://drive
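The internals of webscrape.py are not shown here. As a rough sketch of the idea behind the --PATTERN argument, the example below assumes the pattern is matched as a prefix of each link's href. It uses only the standard library, works on an HTML string rather than a live site, and the class name, function name and sample page are hypothetical.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href values of <a> tags that start with a given pattern."""

    def __init__(self, pattern):
        super().__init__()
        self.pattern = pattern
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(self.pattern):
                self.links.append(value)

def extract_links(html, pattern):
    """Return all matching links found in one HTML document."""
    collector = LinkCollector(pattern)
    collector.feed(html)
    return collector.links

# Made-up page with one matching download link and one unrelated link.
page = (
    '<a href="https://drive.example/folders/abc">Download</a>'
    '<a href="/about">About</a>'
)
print(extract_links(page, "https://drive"))
```

A full crawler would also fetch each page and follow internal links starting from the base URL; that part is omitted here.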
To download folders from a list of Google Drive links use:
python gdrive_download.py --file download_links.txt --dir train
Format the data and then import the directory train/Textures/ into Unity.
Use the script ScreenCapture.cs within Unity to generate training samples for a CNN. The training data is augmented within Unity to account for different perspectives and small distortions (e.g. rotations, translations and cropping). Set the file path before running the "TrainingSamples" scene. Decline every prompt in which Unity offers to convert the texture type to a normal map. The normal map is assigned to the albedo/base map slot in order to render a ground-truth label, and that rendering comes out wrong if the texture type is set to "normal map", so keep it as "default".
Machine learning model
INPUT: 4 images, 480 x 320 px each, corresponding to light from 4 different angles
OUTPUT: 1 image 480 x 320 px corresponding to a normal map
The architecture of the neural network consists of a few convolutional layers; it looks more complex than it is because of the multiple inputs.
The network was trained by minimizing the mean squared error with Adam, using a batch size of 8 images. It was trained on 1500 samples for 20 epochs on a GTX 1070 (total training time ~30 minutes) and has a total of 12,035 trainable parameters.
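For a sense of scale, the figures above translate into the following number of optimizer steps. This is simple arithmetic under two assumptions that the text does not confirm: all 1500 samples are used in every epoch (no validation split), and the last, partially filled batch is kept.

```python
import math

samples = 1500
batch_size = 8
epochs = 20
training_minutes = 30  # approximate total reported above

# 1500 / 8 = 187.5, so the final batch of each epoch holds only 4 samples.
steps_per_epoch = math.ceil(samples / batch_size)
total_steps = steps_per_epoch * epochs
seconds_per_step = training_minutes * 60 / total_steps

print(steps_per_epoch)             # 188
print(total_steps)                 # 3760
print(round(seconds_per_step, 2))  # 0.48
```

Roughly half a second per step is an average over the whole run and includes any data loading time.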
An LED strip driven by an Arduino can reproduce the training environment from Unity. A pyramid structure is required to stabilize the camera and position the LED strip. A tutorial on how to construct a PVC frame is coming soon... Here is the first light with the Arduino:
A sequence of pictures from a cell phone can be uploaded for the model to perform inference on:
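import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
from model_train import build_cnn_encoder

if __name__ == "__main__":
    # One photo per light source. These file names are placeholders.
    paths = ["test_0.jpg", "test_1.jpg", "test_2.jpg", "test_3.jpg"]
    # Add a leading batch axis so each input has shape (1, height, width, 3).
    # Apply the same pixel scaling that was used during training, if any.
    X0, X1, X2, X3 = [
        np.asarray(Image.open(p).convert("RGB"))[np.newaxis] for p in paths
    ]
    height, width = X0.shape[1:3]

    encoder = build_cnn_encoder(
        input_dims=[(height, width, 3)] * 4,
        layer_sizes=[(8, 8, 8)] * 4,
        combined_layers=[8, 8, 8],
        output_dim=(height, width, 3),
    )
    encoder.load_weights("encoder_weights.h5")
    output = encoder.predict([X0, X1, X2, X3])

    # Show the predicted normal map; assumes values are in a displayable range.
    plt.imshow(output[0])
    plt.show()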
The normal maps can then be rendered in Unity by creating a new material and setting the Base Map and Normal Map fields
An Arduino Uno with a NeoPixel LED strip is used to capture images with a mobile phone. The LED strip is available here: https://www.adafruit.com/product/2562. More information about getting started with the NeoPixel strip can be found here: https://learn.adafruit.com/adafruit-neopixel-uberguide. The code for the strip is here: