AlphaZ

Setup of the OpenTrons Robot with Test Tubes and Arduino Sensor
Measurements of Conductivity
Reinforcement Learning Policy
Composition of Fluids

Inspiration

Recent advances in artificial intelligence and machine learning have disrupted a wide variety of industries and solved important real-world problems. Inspired by the technology underlying Google DeepMind’s AlphaGo, we apply reinforcement learning algorithms to medical research. Our approach automates and accelerates the drug discovery process. Our general-purpose learning algorithm can find the optimal solution without any knowledge of the underlying characteristics of the substances. Automating the experiments makes drug discovery cheaper. In the future, concepts from our approach could be used to make personalized medicine more affordable.

Problem

The drug discovery process is very time- and cost-intensive. It takes more than 10 years to develop a drug, and it costs AstraZeneca about $11.8 billion per drug (2012). One of the main problems is that researchers have to test new compositions of substances mostly by hand. This process is not yet automated and therefore cannot be parallelized.

What it does

AlphaZ uses machine learning to automate the drug discovery process. The Arduino board measures the conductivity of the mixture to evaluate the quality of our solution. The OpenTrons robot adds substances to the mixture based on the algorithm’s reinforcement learning policy. The brain of AlphaZ is an adaptive Monte-Carlo algorithm, a reinforcement learning algorithm that searches for the optimal solution by trial and error. Our approach is unique in that many different compositions of fluids are tested and optimized in a single test tube, and the algorithm requires no knowledge of the physical and chemical characteristics of the substances. The global maximum is found by running the process in parallel on many test tubes.

How we built it

Hardware Setup: To apply the AlphaZ algorithm in our experimental setting, our hardware setup must be able to sense and interact with the mixture. We therefore built a test setup with a sensor that measures the quality of a mixture. In this experiment, we used conductivity as the measure of the quality of our solution. The OpenTrons robot is used to add, remove, and mix substances. The sensor is connected to an Arduino with jumper wires, in series with electrical resistors (1 MΩ + 1 MΩ + 260 kΩ = 2.26 MΩ). The wires in the mixture are placed 2 mm apart. By applying 5 V to the wires and measuring the voltage arriving on the other side of the circuit, we can measure the conductivity. These measurements are averaged and sent to the computer via a serial connection. The OpenTrons robot has a setup similar to that of a 3D printer, which allowed us to hack the system and control the robot directly via G-code. Every operation, such as drawing a substance into the pipette, had to be coded. We developed our own interface to control the robot.
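
The measurement can be illustrated with a short Python sketch. It is not our firmware, only a worked example under stated assumptions: the Arduino's ADC is taken to be 10-bit (counts 0 to 1023), and the voltage is assumed to be read across the 2.26 MΩ series resistance, so the mixture and the resistors form a voltage divider. The function names are hypothetical.

```python
ADC_MAX = 1023      # assumption: 10-bit ADC, counts from 0 to 1023
V_SUPPLY = 5.0      # volts applied to the probe wires (from the setup above)
R_SERIES = 2.26e6   # ohms: 1 MΩ + 1 MΩ + 260 kΩ (from the setup above)


def adc_to_voltage(adc_count: int) -> float:
    """Convert a raw ADC count to volts."""
    if not 0 <= adc_count <= ADC_MAX:
        raise ValueError("ADC count out of range")
    return V_SUPPLY * adc_count / ADC_MAX


def conductance_siemens(v_measured: float) -> float:
    """Conductance of the mixture in siemens.

    Assumes v_measured is the voltage across the series resistance,
    so the remaining voltage drops across the mixture.
    """
    if not 0.0 <= v_measured < V_SUPPLY:
        raise ValueError("measured voltage must be in [0, V_SUPPLY)")
    current = v_measured / R_SERIES       # same current flows through the mixture
    v_mixture = V_SUPPLY - v_measured     # voltage drop across the mixture
    return current / v_mixture
```

Under these assumptions, a reading of 2.5 V means the mixture has the same resistance as the series resistors, that is, a conductance of 1 / 2.26 MΩ.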

Algorithm Design: Reinforcement learning models are defined by a set of states, actions, transition probabilities, and rewards. Our states are the current proportions of substances in the mixture. The actions are the substances that can be added. If adding a substance to the mixture has a positive effect on the desired measurement, the policy for that mixture is rewarded; otherwise it is punished. This increases the likelihood that a suitable substance is added to the mixture. After a certain number of iterations, the learned policy yields a nearly optimal solution. Because the Adaptive Monte-Carlo algorithm samples randomly, repeating the experiment multiple times helps rule out that the solution found is only a local optimum.
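
The reward-and-punish loop described above can be sketched in a few lines of Python. This is a simplified illustration, not the AlphaZ implementation: the policy is reduced to one sampling weight per substance, the sensor is replaced by a hypothetical `simulated_measurement` function with an arbitrary target recipe, and the update rate and weight limits are assumed values.

```python
import random


def simulated_measurement(mixture):
    """Hypothetical stand-in for the conductivity sensor.

    Scores a mixture higher the closer its proportions are to an
    arbitrary target recipe (60% A, 30% B, 10% C).
    """
    target = {"A": 0.6, "B": 0.3, "C": 0.1}
    total = sum(mixture.values())
    return -sum((mixture[s] / total - target[s]) ** 2 for s in target)


def run_tube(measure, substances, steps=100, rate=0.2, seed=0):
    """Run one tube: sample additions, reward those that improve the reading."""
    rng = random.Random(seed)
    weights = {s: 1.0 for s in substances}  # policy: sampling preference per action
    mixture = {s: 1.0 for s in substances}  # state: units of each substance
    for _ in range(steps):
        action = rng.choices(substances, weights=[weights[s] for s in substances])[0]
        before = measure(mixture)
        mixture[action] += 1.0  # the robot adds one unit of the chosen substance
        after = measure(mixture)
        factor = 1.0 + rate if after > before else 1.0 - rate
        # Clamp so no action's weight collapses to zero or grows without bound.
        weights[action] = min(max(weights[action] * factor, 0.05), 20.0)
    total = sum(mixture.values())
    proportions = {s: mixture[s] / total for s in substances}
    return proportions, weights


def best_of_tubes(measure, substances, tubes=8):
    """Repeat the experiment with different random seeds and keep the best tube."""
    results = [run_tube(measure, substances, seed=i)[0] for i in range(tubes)]
    return max(results, key=measure)
```

Calling `best_of_tubes(simulated_measurement, ["A", "B", "C"])` mirrors the idea of repeating the experiment on several test tubes: each seed stands for one tube, and the best final mixture is kept.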

Challenges we ran into

Our AlphaZ reinforcement learning algorithm needs reliable measurements of the mixture to learn well. Noise or measurement errors cause wrong decisions to be rewarded. Because our sensor is self-built and not the core offering of our solution, measurement noise sometimes influenced the algorithm’s learning behavior. We solved this challenge by adapting our algorithm and adding resistors to decrease the volatility of the measurement. Controlling the robot required a custom interface to our hardware setup. We solved this challenge by writing our own Python code that combines the OpenTrons robot platform and the Arduino conductivity sensor with our reinforcement learning algorithm.
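
One simple way to keep noisy readings from producing wrong rewards is to average several samples before comparing measurements. The Python sketch below illustrates that idea only; it assumes the sensor sends one numeric value per text line, and the function name and the `min_samples` threshold are hypothetical.

```python
from statistics import mean


def average_reading(lines, min_samples=3):
    """Average the numeric readings in a batch of text lines.

    Malformed or partial lines are skipped. Raises ValueError when too
    few valid samples remain to trust the average.
    """
    values = []
    for line in lines:
        try:
            values.append(float(line.strip()))
        except ValueError:
            continue  # skip lines that are not a plain number
    if len(values) < min_samples:
        raise ValueError("not enough valid samples")
    return mean(values)
```

For example, `average_reading(["1.0\n", "2.0\r\n", "garbage", "3.0"])` ignores the malformed line and averages the remaining three values.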

Accomplishments that we're proud of

Building on state-of-the-art algorithms, we applied a novel reinforcement learning algorithm to a new problem domain. In our experiment, we showed that our algorithm correctly finds the optimal mixture.

What we learned

The medical industry develops novel and advanced medicines, but at very high research and development costs. There is room to improve labs by transferring novel technologies from other fields. Machine learning, in particular the trial-and-error approach of the Monte-Carlo algorithm, can solve tasks that were previously performed manually. The algorithms not only cut cost and time but also increase the quality of the solution by greatly increasing the number of experiments. Additionally, we learned to use the Opentrons robot to automate the drug composition task using pipettes.

What's next for AlphaZ

Possible joint research projects with AstraZeneca would enable us to create a proof of concept. By connecting with AstraZeneca researchers, we can find application areas for our technology. Furthermore, we will conduct market research and create a first business plan.