TL;DR

Due to strict privacy laws and regulations around medical data, hospitals struggle to develop AI models and data-driven technology, as they cannot share their data with other hospitals or access data from them.

With healthNET, hospitals and other medical clients can collaboratively train AI models with a larger pool of data, all without ever exposing the raw data. We built this on an edge computing model with federated learning, allowing each participant to update the aggregated model locally on its own data. Our approach preserves privacy and security, all while improving AI performance across institutions.

Inspiration

The healthcare domain benefits from advanced data analytics. According to RBC Capital Markets, approximately 30% of the world's data volume is being generated by the healthcare industry. This data can and should be mobilized to improve patient outcomes.

Modern AI models for healthcare require large amounts of patient data to be accurate. Hospitals, clinics, and research institutions often cannot share patient data due to privacy laws and ethical concerns. This makes it challenging to train and deploy machine learning models on shared datasets. Unfortunately, models that are trained only on a single institution’s data are prone to being more biased and less accurate. Thus, health institutions are unable to maximally benefit from the large amounts of existing data.

What it does

Enter healthNET: an application that enables health institutions to train neural networks without direct data sharing. healthNET utilizes federated learning, a machine learning technique that trains a model collaboratively across multiple devices without centralizing raw data, prioritizing user privacy and data security. healthNET empowers users to benefit from large amounts of existing data without sharing their data or worrying about communication overhead.

How we built it

In traditional machine learning, data must be centralized into a single dataset that the model is trained on. In federated learning, instead of bringing data to the model, the model is brought directly to the data.

A federated learning system comprises node servers and an aggregator server.

In our solution, the node servers represent healthcare entities like hospitals. The node servers retain total ownership of their data, which is never shared.

The aggregator server is responsible for coordinating training across the node servers. The aggregator maintains a global model, which is sent out to the nodes. The nodes then update the model on their own datasets and send only the updates (weights, gradients, etc.) to the aggregator server, which combines the updates to improve the model. Across multiple rounds, the model improves in performance without direct sharing of data, thus protecting patient privacy.
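The round structure described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not healthNET's actual code and not the Flower API: the names `Node`, `local_update`, `run_rounds`, and `make_data` are hypothetical, the "model" is a two-parameter linear fit, the data is synthetic and noise-free, and the combine step is a plain mean.

```python
import random


class Node:
    """Hypothetical hospital-like node: its (x, y) records never leave it."""

    def __init__(self, data):
        self._data = data  # private; only model weights are ever returned

    def local_update(self, weights, lr=0.05, epochs=5):
        """Start from the global weights and run a few epochs of
        gradient descent on y ~ w*x + b using only local data."""
        w, b = weights
        n = len(self._data)
        for _ in range(epochs):
            grad_w = sum(2 * (w * x + b - y) * x for x, y in self._data) / n
            grad_b = sum(2 * (w * x + b - y) for x, y in self._data) / n
            w -= lr * grad_w
            b -= lr * grad_b
        return (w, b)


def run_rounds(nodes, rounds=100):
    """Aggregator loop: broadcast the global model, collect each node's
    updated weights, and combine them into the next global model."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [node.local_update(global_weights) for node in nodes]
        # Placeholder combine step (assumption): plain mean of the weights.
        global_weights = tuple(sum(ws) / len(updates) for ws in zip(*updates))
    return global_weights


def make_data(seed, n=20):
    """Synthetic, noise-free samples of y = 3x + 1 (illustration only)."""
    rng = random.Random(seed)
    return [(x, 3.0 * x + 1.0) for x in (rng.uniform(-1, 1) for _ in range(n))]


nodes = [Node(make_data(seed)) for seed in (1, 2, 3)]
w, b = run_rounds(nodes)
print(f"global model after training: w={w:.3f}, b={b:.3f}")
```

Note that the aggregator only ever sees the `(w, b)` tuples returned by each node; the records held in `_data` are never transmitted.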

There are multiple methods of combining the model updates, but healthNET uses FedAvg (federated averaging), which averages the nodes' model weights, typically weighting each node by the size of its local dataset.
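As a rough illustration of the averaging step: in its standard form, FedAvg computes the new global weights as the sum over nodes of (n_k / n) * w_k, where w_k is node k's weights, n_k is the number of examples it trained on, and n is the total across nodes. The sketch below assumes each node reports a flat list of floats plus its example count; the function name `fedavg` and this input format are hypothetical and are not taken from healthNET or from Flower.

```python
def fedavg(updates):
    """Weighted average of per-node weight vectors.

    `updates` is a list of (weights, num_examples) pairs, where `weights`
    is a flat list of floats with the same length for every node.
    """
    total = sum(n for _, n in updates)
    if total == 0:
        raise ValueError("no training examples reported")
    size = len(updates[0][0])
    averaged = [0.0] * size
    for weights, n in updates:
        if len(weights) != size:
            raise ValueError("all nodes must report the same number of weights")
        for i, value in enumerate(weights):
            # Each node contributes in proportion to its share of the data.
            averaged[i] += value * n / total
    return averaged


# Two hypothetical nodes: one trained on 100 examples, one on 300.
updates = [([1.0, 2.0], 100), ([3.0, 4.0], 300)]
global_weights = fedavg(updates)
print(global_weights)  # [2.5, 3.5]
```

Weighting by example count means a node with three times the data pulls the global model three times as hard; if every node reports the same count, this reduces to a plain mean.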

Our design utilizes Flower, a popular federated learning framework. The application backend is built with FastAPI, while the frontend is built with ReactJS. Shoutout to the HackUMass sponsor Vultr, who provided us with free credits to host our aggregator and train models on their cloud platform.

Challenges we ran into

Most of the difficulty with the project was managing the many dependencies and learning to work with the Flower framework. Designing and building out the different components for both the server side and client side was a difficult yet rewarding challenge. Our team benefitted from adhering to strict version control guidelines and holding frequent discussions to ensure we agreed upon a shared goal.

Accomplishments that we're proud of

Our team is most proud of the rate at which we learned to work with federated learning. We were able to implement a distributed system that is simple to use and scales easily.

What we learned

We learned how to train machine learning models on the cloud, work with the Flower framework, design a maintainable distributed system, and create multiple frontend views.

What's next for HealthNET

We plan to implement the following features:

  1. Authentication
  2. Expanding model offerings
  3. Enabling clients to upload different datasets
  4. Implementing a desktop application for clients
