Every year, 8 million tonnes of marine litter and macro-plastics enter our oceans, and millions more tonnes are already out there. If no action is taken, this figure is expected to double by 2030 and double again by 2050, destroying our ecosystems, health and economies.
But this doesn’t have to be the case. At Team Andromeda, we envision a world where clear rivers flow, the ocean runs blue and everyone can enjoy Earth’s natural beauty. We also recognise that any solution must fully leverage technology and evidence-based research.
Firstly, we can optimally direct existing cleanup efforts. Secondly, we can track the activity of major polluters to develop legislative and social solutions.
Given that most ocean plastics concentrate within 1 metre of the surface, and given that plastic hotspots move little over long timescales, the solution is clear. Our team aims to combine hyper-spectral imagery with machine learning to detect these very regions of marine plastic accumulation.
With the coming rise of marine industries due to climate change and polar ice melt, we believe our data also serves valuable business applications, allowing us to lead the pack in pollution-related consulting. This expertise is invaluable to industry and governments alike.
What it does
Laying out the first steps to this vision, our team created a software and hardware proof-of-concept during the hackathon.
Software Component
For the software component, we designed a machine learning algorithm to identify plastic from space. Imagery captured from satellites is fed into the algorithm, which identifies the pixels that are likely to be plastic. This process is known as image segmentation.
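To make the idea concrete, here is a toy sketch of segmentation as per-pixel classification. The probability values are made up for illustration; the real model produces such a probability map from satellite imagery.

```python
import numpy as np

# `probs` stands in for a model's per-pixel "plastic" probabilities
# on a tiny 4x4 satellite tile (values here are invented).
probs = np.array([
    [0.10, 0.20, 0.90, 0.80],
    [0.10, 0.70, 0.95, 0.30],
    [0.05, 0.10, 0.20, 0.10],
    [0.00, 0.10, 0.10, 0.05],
])

# Thresholding turns probabilities into a binary segmentation mask:
# 1 = likely plastic, 0 = water or other.
mask = (probs > 0.5).astype(np.uint8)

# The mask tells us which pixels to flag and how much of the tile is covered.
coverage = mask.mean()
print(f"plastic coverage: {coverage:.1%}")  # 4 of 16 pixels -> 25.0%
```

The same thresholding step applies to full-size tiles; the 0.5 cut-off is a tunable parameter, not a fixed choice.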
Hardware Component
In parallel, we created a 3D-printed CubeSat chassis. Inside it, we run a basic plastic-recognition algorithm as a proof of concept for our 1U satellite.
How we built it
As labelled satellite images of ocean plastic are scarce, we created a synthetic data pipeline to generate life-like RGB images. Satellite images were sourced from the Airbus Ship Detection dataset made publicly available on Kaggle. The dataset naturally contains objects that are not plastic litter but pose a challenge to an image segmentation algorithm, such as ships, clouds, land masses and other marine objects.
Images of floating plastic accumulations were sourced from the BBC. The plastic parts of each image were extracted, and several data augmentation techniques were used to improve the variety of training data on hand, including random rotations, translations and resizing. The plastic examples were then overlaid at random locations within the ocean in the Airbus examples, and their segmentation masks recorded. This method gave us a large and sufficiently varied labelled dataset on which we could successfully train our model.
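The overlay-and-mask step can be sketched as follows. This is a minimal illustration with dummy arrays, not our actual pipeline: the real version works on Airbus tiles and BBC cut-outs, and uses arbitrary-angle rotations and resizing rather than only 90-degree turns.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_synthetic_example(ocean, plastic_patch):
    """Overlay a plastic cut-out onto an ocean tile and record its mask.

    `ocean` is an HxWx3 RGB background; `plastic_patch` is a smaller
    hxwx3 cut-out of floating plastic. Both are stand-ins for real data.
    """
    ocean = ocean.copy()
    # Augmentation: random 90-degree rotation of the patch (the real
    # pipeline also used arbitrary rotations and resizing).
    plastic_patch = np.rot90(plastic_patch, k=rng.integers(0, 4))

    # Random translation: pick a top-left corner where the patch fits.
    h, w = plastic_patch.shape[:2]
    H, W = ocean.shape[:2]
    y = rng.integers(0, H - h + 1)
    x = rng.integers(0, W - w + 1)

    # Paste the patch and record the pixel-level segmentation mask.
    ocean[y:y+h, x:x+w] = plastic_patch
    mask = np.zeros((H, W), dtype=np.uint8)
    mask[y:y+h, x:x+w] = 1
    return ocean, mask

# Dummy data: a blue-ish "ocean" tile and a bright "plastic" patch.
ocean = np.full((64, 64, 3), (20, 60, 140), dtype=np.uint8)
patch = np.full((8, 12, 3), (230, 230, 220), dtype=np.uint8)
image, mask = make_synthetic_example(ocean, patch)
```

Because the mask is recorded at paste time, every synthetic image comes with a pixel-perfect label for free, which is exactly what made this approach cheaper than hand-labelling real imagery.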
The satellite plastic-detection model was written as a Convolutional UNet which aimed to segment images into plastic and non-plastic regions. We chose this model architecture for two specific reasons:
- Convolution is a well-proven mathematical technique for capturing just the core features of image data - in our case, the plastic
- UNets have proved successful in similar situations: the domain of cloud segmentation features large, fluidly shaped objects being recognised from high altitude, often above ocean
Hence, our main efforts were devoted to tuning hyperparameters of the model and correctly feeding in our data.
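For readers unfamiliar with the architecture, here is a deliberately tiny UNet-style segmenter in PyTorch: one downsampling step, one upsampling step, and a skip connection. This is an illustrative sketch, not our trained model; the real network had more depth, and its layer sizes were among the hyperparameters we tuned.

```python
import torch
import torch.nn as nn

class MiniUNet(nn.Module):
    """Minimal UNet-style model: encoder, bottleneck, decoder, one skip."""

    def __init__(self):
        super().__init__()
        # Encoder: convolutions capture local features (e.g. plastic texture).
        self.enc = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)
        self.mid = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        # Decoder: upsample back to input resolution.
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        # After the skip connection, 16 + 16 = 32 channels come in.
        self.dec = nn.Sequential(
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),  # one logit per pixel: plastic vs not
        )

    def forward(self, x):
        e = self.enc(x)
        m = self.mid(self.down(e))
        u = self.up(m)
        # Skip connection: concatenate encoder features with upsampled ones,
        # so fine spatial detail survives the down/up round trip.
        return self.dec(torch.cat([u, e], dim=1))

model = MiniUNet()
logits = model(torch.randn(1, 3, 64, 64))   # batch of one 64x64 RGB tile
probs = torch.sigmoid(logits)               # per-pixel plastic probability
```

The skip connections are the key design choice for this task: plastic patches are small relative to the tile, and without them the pooled features would blur away the patch boundaries we need to recover.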
These processes were all conducted in the cloud using Amazon SageMaker, to utilise its powerful CPU and GPU compute resources.
Challenges we ran into
The greatest challenge Team Andromeda encountered was sourcing the satellite imagery of ocean plastics necessary for training our ML algorithms. As these datasets are extremely scarce and inaccessible to the everyday person, much of the team’s development time was spent producing synthetic images to feed into the training process. Future developments would see us launching our CubeSat into LEO and capturing the necessary imagery for the training dataset ourselves.
We also faced challenges with our software environments. Many of us were inexperienced with the tools on AWS, as we hadn’t spent time practising on their systems. This led to time wasted dealing with interfaces, installing packages and restarting machines. We learnt a lot about the value of using familiar tools and getting a feel for environments in advance. The main takeaway, however, was to properly use documentation: Amazon’s library of instructional content helped us turn things around.
Accomplishments that we're proud of
Creating a realistic dataset, as outlined in the ‘Challenges’ section above, was one of the more time-consuming parts of our project. We are proud that we were able to create this dataset, and of how realistic the generated images are.
The team is also very happy that we were able to put together a trained model in time, as along the way we ran into many issues with exhausted compute resources and with transferring data between compute instances.
What we learned
Prior to the announcement of UniHack 2021, no member of the team had practical experience working with ML frameworks. As a team, we collectively upskilled ourselves with the knowledge base and technical skill sets needed to kick-start an ML-based project. As mentioned above, we encountered numerous technical challenges, and we now know the steps needed to avoid these mistakes in the future.
What's next for Andromeda
In future, we see ourselves developing the satellite payload more thoroughly. As our team are all members of the student-led Melbourne Space Program, we plan on spinning this off as an internal project to launch as a full satellite platform. We particularly anticipate strong student interest because of the values and mission at the core of our team’s environmental proposal. Given the progress made in 48 hours by a handful of students, we believe that a more complete team could take this project to the stars (no pun intended).