Bespectacled Belligerents' Liquid Classifying Neural Network

Schematic for our Arduino and peripherals
Inside the liquid sensing zone (illuminated and isolated to remove varying ambient light)
Inside the liquid sensing containment, showing illumination, webcam, TDS probe, and pH probe
View of amplifier circuit w/ voltage divider
View of circuit (and TDS sensor module)
Higher-level view of the circuit: AD2 for power, Arduino for sensor data processing

Inspiration

We had an impromptu brainstorming session before HackED. The criteria were as follows:

Integrated systems project (hardware and software)
Challenging, but not impossible within the 24 hours
Incorporates machine learning (cliché, we know)
Unlikely that another group would have the same idea

With these in mind, the Liquid Classification Neural Network was born!

What it does

The Liquid Classification Neural Network identifies liquids based on three parameters:

pH (using a pH electrode)
TDS (total dissolved solids, determined using a conductivity sensor)
Colour (using a webcam)

In the early stages, we trained the device on ten basic beverages and household substances, allowing it to form a database. From there, the machine learning algorithm should be able to relate most given liquids to ones it knows based on the gathered properties.

How we built it

Hardware:

1. pH Sensor: The pH sensor itself consists of an electrode probe and a BNC connector. It works thanks to a special glass composition that responds to the hydrogen ions in the liquid. Using a generic male BNC connector, we provided the electrode with ground as its reference, then amplified its output using a JFET op-amp. With this setup, we were able to accurately test the pH powders that came with the sensor. For a pH range between 4.9 and 9.2, we saw a voltage spread between roughly -150 mV and 200 mV, which was discernible enough for our purposes.
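
One way to turn the amplified electrode voltage into a pH value is a two-point linear calibration. The sketch below is only an illustration under stated assumptions: the helper name `make_ph_calibration` is hypothetical, and the pairing of each end of the voltage spread with a particular pH is assumed for the example, since the direction depends on how the amplifier is wired.

```python
def make_ph_calibration(v_a, ph_a, v_b, ph_b):
    """Return a function mapping voltage (volts) to pH using a straight
    line through two calibration points (hypothetical helper)."""
    if v_a == v_b:
        raise ValueError("calibration voltages must differ")
    slope = (ph_b - ph_a) / (v_b - v_a)  # pH units per volt

    def voltage_to_ph(voltage):
        return ph_a + slope * (voltage - v_a)

    return voltage_to_ph


# ASSUMPTION: which voltage belongs to which pH is illustrative only.
to_ph = make_ph_calibration(v_a=-0.150, ph_a=9.2, v_b=0.200, ph_b=4.9)
```

With real hardware, the two calibration points would come from measuring two known buffer solutions rather than from the rounded figures used here.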

2. TDS Sensor: The TDS (Total Dissolved Solids) sensor works like a voltage divider. Depending on the conductivity of the fluid, the sensor returns a different analog voltage value. Once this value is received, we can calculate the parts per million of conducting solids.
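
The voltage-divider idea can be sketched as follows. This is a simplified model, not the sensor module's actual transfer function: it assumes a 10-bit ADC with a 5 V reference, a divider with the liquid between the supply and the output node, and a hypothetical fixed resistor `r_fixed` to ground. The final conversion from conductance to ppm depends on the sensor's own calibration and is not shown.

```python
ADC_MAX = 1023  # ASSUMPTION: 10-bit Arduino ADC
V_REF = 5.0     # ASSUMPTION: 5 V ADC reference and divider supply


def adc_to_voltage(counts, v_ref=V_REF, adc_max=ADC_MAX):
    """Convert a raw ADC reading to volts."""
    return counts * v_ref / adc_max


def liquid_conductance(v_out, v_in=V_REF, r_fixed=1000.0):
    """Conductance of the liquid in siemens, for a divider where
    v_out = v_in * r_fixed / (r_liquid + r_fixed).
    r_fixed is a hypothetical value chosen for illustration."""
    if v_out <= 0:
        return 0.0  # no current flows: liquid is effectively an open circuit
    if v_out >= v_in:
        raise ValueError("v_out must be below the supply voltage")
    r_liquid = r_fixed * (v_in - v_out) / v_out
    return 1.0 / r_liquid
```

In this model a more conductive liquid has a lower resistance, so the output voltage rises with the amount of dissolved solids.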

3. Colour Sensor: To sense colour for this project, we repurposed an old webcam, which plugs by USB into the computer running the code. It captures 60 frames, of which we use the last 30. We take the center 150 pixels, average them, and record the resulting RGB values.
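
A minimal sketch of this capture-and-average step is shown below. It assumes that "the center 150 pixels" means a centred 150 x 150 square and that the webcam is device 0. Both function names are hypothetical. OpenCV returns frames in BGR order, so the channels are reordered before returning.

```python
import numpy as np


def mean_center_rgb(frames, patch=150):
    """Average colour over a centred patch x patch square across frames.
    frames: list of H x W x 3 arrays in OpenCV's BGR channel order."""
    if not frames:
        raise ValueError("no frames to average")
    means = []
    for frame in frames:
        h, w = frame.shape[:2]
        top = max((h - patch) // 2, 0)
        left = max((w - patch) // 2, 0)
        roi = frame[top:top + patch, left:left + patch]
        means.append(roi.reshape(-1, 3).mean(axis=0))
    b, g, r = np.mean(means, axis=0)
    return float(r), float(g), float(b)


def capture_colour(camera_index=0, total_frames=60, keep_last=30):
    """Grab total_frames from the webcam and average the last keep_last."""
    import cv2  # imported here so the helper above works without a camera

    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError("could not open webcam")
    frames = []
    try:
        for _ in range(total_frames):
            ok, frame = cap.read()
            if ok:
                frames.append(frame)
    finally:
        cap.release()
    return mean_center_rgb(frames[-keep_last:])
```

Discarding the early frames gives the camera's auto-exposure time to settle before any colour is recorded.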

Software:

Before we got ahead of ourselves with machine learning, we had to make sure we could read data from the sensors. The pH and TDS sensors communicate directly with the Arduino, which sends their readings over serial (using the pyserial module) to Python, where most of the processing is done. The colour sensor communicates directly with the Python program through the OpenCV library. From there, we wrote the output data from all sensors to an Excel workbook using the pandas library. Once sufficient data was collected, a threshold we set at 600 values (60 per sample), we trained a three-layer neural network using Keras and TensorFlow. We trained it on 80% of the data and tested it on the remaining 20%. It achieved 93% accuracy on the training data and 88% on the testing data. In our main script, this model took the sensors' current outputs and used them to predict the provided liquid. Finally, we created a GUI using pygame to improve user-friendliness.
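
The 80/20 split and a small Keras classifier could look roughly like the sketch below. It is an illustration rather than our exact network: the five input features (pH, TDS, R, G, B), the ten output classes, the hidden-layer width, and the reading of "three-layer" as two hidden layers plus an output layer are all assumptions, and both function names are hypothetical.

```python
import random


def split_train_test(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples and split them into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def build_model(n_features=5, n_classes=10, hidden_units=16):
    """Small dense classifier; all sizes here are assumed values."""
    from tensorflow import keras  # imported lazily; needs TensorFlow installed

    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(hidden_units, activation="relu"),
        keras.layers.Dense(hidden_units, activation="relu"),
        keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # integer class labels
        metrics=["accuracy"],
    )
    return model
```

With 600 samples, `split_train_test` yields 480 training and 120 testing samples, and the model would then be trained with `model.fit` on the training portion only.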

Challenges we ran into

Op-Amp circuit: A problem we were constantly working around is that the Arduino cannot read negative voltages. This caused problems with the pH sensor especially, because a pH of 7 (neutral) outputs 0 V. Initially, we implemented an op-amp circuit we found online which provided the electrode with a 500 mV reference. In theory, this would allow lower pHs to dip the voltage by up to 500 mV before hitting zero. We eventually discarded this circuit, however, after we realized that JFET op-amps, or specifically the TL082, require symmetrical power to function properly. Because this opened a whole new can of worms, we elected to simplify the circuit, using ground as the electrode reference and then amplifying the output. To make the voltages swing positive only, we used a 10k pull-up resistor and an 820 resistor between that and the amplifier output. These resistor values pulled the output positive while leaving enough buffer that lower pHs wouldn't create negative voltages at the Arduino.
Reading Training Data: We initialized too large an array of zeroes and didn't fill it fully for the test data. This meant we were testing the machine learning algorithms on a bunch of zero-valued vectors.
Forgot to normalize vectors: We trained the algorithm on normalized vectors. However, when we used the Keras models to predict liquids, we forgot to normalize the inputs. This made milk a VERY popular choice.
Substance differentiation: Initially, we wanted our program to differentiate between tap water and filtered water, as well as Coke, Pepsi and Diet Pepsi. Sadly, this proved to be infeasible. For the water, we realized we aren't even sure if the University of Alberta has a separate water system to source from. The "filtered" and "tap" water data overlapped and were indistinguishable, so we merged the two into a single water classification. As for the Coca-Cola, Pepsi, and Diet Pepsi, the pH, TDS, and colour data were very similar and frequently overlapped. This made training the model to differentiate them impossible with the limited number of samples we could collect.
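
The pull-up arrangement from the op-amp challenge above can be checked with a short calculation. This is a sketch under assumptions the writeup does not state: both resistor values are taken to be in ohms, the pull-up is tied to a 5 V supply, the amplifier output behaves as an ideal voltage source, and the Arduino pin draws no current. The function names are hypothetical.

```python
def arduino_pin_voltage(v_amp, v_supply=5.0, r_pullup=10_000.0, r_series=820.0):
    """Voltage at the node between the pull-up (to v_supply) and the
    series resistor (to the amplifier output at v_amp)."""
    return v_amp + (v_supply - v_amp) * r_series / (r_pullup + r_series)


def most_negative_amp_output(v_supply=5.0, r_pullup=10_000.0, r_series=820.0):
    """Amplifier output at which the pin voltage would reach 0 V."""
    return -v_supply * r_series / r_pullup


low = arduino_pin_voltage(-0.150)   # roughly 0.24 V at the pin
high = arduino_pin_voltage(0.200)   # roughly 0.56 V at the pin
```

Under these assumptions, the observed spread of about -150 mV to 200 mV maps to roughly 0.24 V to 0.56 V at the pin, and the amplifier output could drop to about -0.41 V before the pin reached zero.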
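
The normalization slip described above is usually avoided by computing the scaling parameters once from the training data and reusing them at prediction time. The sketch below assumes min-max scaling per feature, which may differ from the normalization we actually used. The function names and sample values are hypothetical.

```python
def fit_min_max(rows):
    """Return per-feature (lows, highs) computed from the training rows."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return lows, highs


def apply_min_max(row, lows, highs):
    """Scale one feature vector into [0, 1] using the stored training range."""
    return [
        (x - lo) / (hi - lo) if hi > lo else 0.0
        for x, lo, hi in zip(row, lows, highs)
    ]


# Illustrative values only: two features (for example pH and TDS).
training_rows = [[4.0, 100.0], [8.0, 300.0]]
lows, highs = fit_min_max(training_rows)
scaled_input = apply_min_max([6.0, 200.0], lows, highs)
```

The key point is that `lows` and `highs` are stored alongside the trained model, so every live sensor reading passes through `apply_min_max` before it reaches the prediction call.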

Accomplishments that we're proud of

The program is able to determine the difference between seven of our sample substances! It gets confused between Coke, Pepsi and diet Pepsi, but hey, who doesn't? We are proud to have built a project that incorporates so many components in such a limited amount of time. This project integrates serial communications, signal processing, analog circuitry, digital I/O, GUI, machine learning and data management into one cohesive and functional project.

What we learned

Machine learning is a powerful tool, but ultimately it can't compensate for shoddy data. A project with so many components is only as strong as its weakest link, and in our case this was the hardware. We acquired our sensors cheaply and at the last minute, so they are the greatest limitation of the project: they lack the resolution to classify liquids to the depth we'd hoped. A diverse team is essential for a project that combines so many different aspects, and our team rose to the challenge despite the variety of expertise required. This was also a first venture into machine learning for most of our group, and we had very limited experience with it. We learned a great deal about applying these concepts to a practical problem.

What's next for Liquid Classifying Neural Network

As time goes on, we can keep providing the LCNN with new liquids, expanding its dataset and so improving its classification abilities. We could also add more hardware to measure other parameters, increasing its capabilities. More than anything, though, this project was our first venture into neural networks. We were hoping to gain experience implementing machine learning, and especially to find ways to integrate hardware (sensors, etc.) with the software. Such a circuit serves as a demonstration of the power of combining physical components, even analog circuitry, with machine learning. A more robust liquid classification module could be used in various contexts, such as medical settings to examine biological fluids, or commercial settings to confirm product quality.