[Images: General Framework; Data Schema; Interface from the data validator's side; User interface]

Inspiration

While searching for ideas, our team came across large datasets that needed to be classified more effectively and were difficult to base predictions on. On examining these datasets, we observed that many could be cleaned and maintained to build models capable of making meaningful predictions. From this, we conceived a website designed to grow over time, continuously improving its accuracy as we update the models to accommodate expanding datasets. We started with an urgent human health issue, skin cancer. The website predicts the likelihood that an individual has it, with the support of medical validators, namely doctors.

What it does

The system comprises the following components:
A general user interface in the app, where users can upload images for verification.
A cloud-based machine learning algorithm.
A validator interface within the app for reviewing data (in this case, images) that produce positive results, which can be either true positives or false positives.

The workflow is a continuous data exchange between the general user and the machine learning algorithm to detect anomalies. When the model predicts a positive result, a notification is sent to the validator, who assesses whether it is a true positive or a false positive. If it is a true positive, the validator verifies it and adds it to the positive dataset. If it is a false positive, the validator does not verify it, and it is not added to the positive dataset.
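
The validator step described above can be illustrated with a minimal Python sketch. The names Prediction, PositiveDataset, and review are hypothetical and are not taken from the project's code; the sketch only shows that verified true positives enter the positive dataset and false positives do not.

```python
from dataclasses import dataclass, field


@dataclass
class Prediction:
    """A positive model result awaiting validator review (hypothetical structure)."""
    image_id: str
    score: float  # model's estimated probability of the positive class


@dataclass
class PositiveDataset:
    """Holds only validator-confirmed positives (hypothetical structure)."""
    image_ids: list[str] = field(default_factory=list)


def review(prediction: Prediction, is_true_positive: bool, dataset: PositiveDataset) -> str:
    """Apply the validator's decision to one positive prediction."""
    if is_true_positive:
        # Verified true positives are added to the positive dataset.
        dataset.image_ids.append(prediction.image_id)
        return "verified"
    # False positives are not verified and are left out of the dataset.
    return "rejected"


dataset = PositiveDataset()
print(review(Prediction("img-001", 0.91), True, dataset))   # verified
print(review(Prediction("img-002", 0.77), False, dataset))  # rejected
print(dataset.image_ids)                                    # ['img-001']
```

In a real deployment the dataset would presumably live in a database rather than in memory. The in-memory list here is only for illustration.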

How it works

User side:
Users upload images to the model to receive predictions.
The model evaluates the image.
The result is returned to the user.
If the result is positive, the app requests the general user's permission to transfer the image to a validator.

Validator side: As images and data are uploaded, the validator assesses their quality. After 100 uploads, the validator decides whether to approve the data. Approved data is added to the dataset, which triggers an automatic update of the AI model. In this way, the app lets us draw on large datasets to develop more accurate predictive models.
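
The batch approval step on the validator side could be sketched as follows. This is a hedged illustration, not the project's implementation: the ValidationQueue class, its method names, and the retrain_needed flag are assumptions introduced here. Only the batch size of 100 comes from the description above.

```python
BATCH_SIZE = 100  # from the description above: a decision is made after 100 uploads


class ValidationQueue:
    """Collects uploads and hands them to the validator in batches (hypothetical)."""

    def __init__(self, batch_size: int = BATCH_SIZE) -> None:
        self.batch_size = batch_size
        self.pending: list[str] = []   # uploads awaiting a decision
        self.dataset: list[str] = []   # approved data
        self.retrain_needed = False    # set when approved data should update the model

    def upload(self, image_id: str) -> bool:
        """Queue an upload; return True once a full batch is ready for review."""
        self.pending.append(image_id)
        return len(self.pending) >= self.batch_size

    def decide(self, approved: bool) -> None:
        """Record the validator's decision on the current batch."""
        if approved:
            self.dataset.extend(self.pending)
            self.retrain_needed = True
        # The batch is cleared whether or not it was approved.
        self.pending.clear()


queue = ValidationQueue(batch_size=3)  # small batch size for demonstration only
ready = False
for name in ["a.png", "b.png", "c.png"]:
    ready = queue.upload(name)
if ready:
    queue.decide(approved=True)
print(len(queue.dataset), queue.retrain_needed)  # 3 True
```

Keeping the retraining trigger as a simple flag separates the validator's decision from the model update itself, which could then run as a scheduled or background job.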

Building the system

We set out to build a website with Reflex that lets users upload images and communicates with validators who verify them. For the machine learning component, we used a Convolutional Neural Network (CNN) model, which processes the continuously incoming data and classifies it as "normal" or "abnormal". If the data exceeded a threshold for "abnormal" occurrences, we directed it to the validator for verification. Once the validator confirmed the images, we used MindsDB's database and AI capabilities to store and categorize them. Our models will be updated regularly through MindsDB to improve accuracy. We expect this approach to deliver a more precise and practical prediction application, especially for large datasets.
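
The threshold-based routing described above can be sketched in plain Python. This assumes the CNN outputs a probability for the "abnormal" class and that the threshold applies to that probability. The source does not specify either detail, and the value 0.5 and the function names are hypothetical.

```python
ABNORMAL_THRESHOLD = 0.5  # assumed value; the actual threshold is not stated


def classify(abnormal_probability: float, threshold: float = ABNORMAL_THRESHOLD) -> str:
    """Map the model's abnormal-class probability to a label."""
    if not 0.0 <= abnormal_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return "abnormal" if abnormal_probability > threshold else "normal"


def route(image_id: str, abnormal_probability: float, validator_inbox: list[str]) -> str:
    """Classify one image and forward abnormal results to the validator."""
    label = classify(abnormal_probability)
    if label == "abnormal":
        validator_inbox.append(image_id)
    return label


inbox: list[str] = []
print(route("lesion-01.png", 0.82, inbox))  # abnormal
print(route("lesion-02.png", 0.10, inbox))  # normal
print(inbox)                                # ['lesion-01.png']
```

A lower threshold would send more images to validators and reduce missed cases, at the cost of more false positives to review.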

Example use case

Consider a dermatologist using this service to develop and maintain an AI-powered skin cancer detection system. The dermatologist can upload images of skin lesions along with their corresponding diagnoses. Subsequently, the service will train a model for skin cancer detection. The dermatologist can then employ this model to expedite and improve the accuracy of diagnosing new patients.

Challenges

We faced several challenges throughout this project. The first was adopting new technologies, Reflex and MindsDB, which required us to learn and apply their syntax. Nonetheless, we expect these efforts to yield a more efficient website than existing technologies would allow.

The second challenge was the limited size of our dataset, which restricted our ability to measure accuracy reliably. Future work should test the model on diverse datasets, explore sequential algorithms, and optimize layers and performance. In the current state, without a live data feed or a comprehensive dataset, it is difficult to diagnose overfitting or underfitting.

Future applications

In the next phase of our remote project, we have several key objectives:

We aim to optimize the system's information processing speed when called upon.
We plan to expand our network of validators and acquire more datasets to enhance the accuracy of our models.
We intend to enable the system to automatically generate reports and provide personalized recommendations based on the models.

Built With

mindsdb
reflex

Updates

Minh Tam Nguyen started this project — Oct 29, 2023 02:01 PM EDT
