Inspiration
First, let us define what Protected Health Information (PHI) is: link
I worked at two big healthcare companies, including one where I was the lead engineer on a team that created internal apps used at hospitals. Both companies went to great lengths to try to ensure that PHI was not being forwarded or posted on external systems (chat tools, team boards, emails, messages, etc.). However, it is almost impossible not to eventually leak some information, given the number of sources where this kind of information exists. Before you know it, a seemingly harmless screenshot of an app sent around on a team's chat system has led to several costly violations. Even more concerning is the fact that several external tools might never actually delete something once it has been posted there (cough "soft delete" cough).
HIPAA (linked to above) asks covered entities to implement safeguards to ensure PHI is not leaked. However, this is hard to do, as it usually requires manual intervention and manual removal of the data.
What it does
This hackathon project combines several AI and pattern-matching techniques (computer vision, text classification, regular expressions) in an iOS app that scans an image and blurs out any PHI it finds. It also lets a user tap to add a bounding box around something the app missed so it can be blurred, or remove a box around something that was flagged incorrectly.
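The tap-to-correct behavior can be thought of as flipping a blur flag on whichever detected region contains the tap. The following is a minimal sketch in Python rather than the app's actual iOS code; the `Region` type, its fields, and `toggle_region_at` are hypothetical names introduced only for illustration, and it assumes every detected text or face region is kept around whether or not it was flagged.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Region:
    """A detected text or face region in image coordinates (hypothetical type)."""
    x: float
    y: float
    width: float
    height: float
    blurred: bool = False

    def contains(self, px: float, py: float) -> bool:
        """Return True if the point (px, py) lies inside this rectangle."""
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


def toggle_region_at(regions: List[Region], px: float, py: float) -> Optional[Region]:
    """Flip the blur flag of the first region containing the tap point.

    Returns the region that changed, or None if the tap hit no region.
    """
    for region in regions:
        if region.contains(px, py):
            region.blurred = not region.blurred
            return region
    return None


# Example: one region was flagged automatically, one was missed.
regions = [Region(10, 10, 100, 20, blurred=True), Region(10, 50, 100, 20)]
toggle_region_at(regions, 20, 60)  # tap the missed region: it is now blurred
toggle_region_at(regions, 20, 15)  # tap the false positive: it is now unblurred
```

Keeping unflagged regions in the same list is a design choice made for this sketch: it means a correction in either direction is a single flag flip, and no detection has to be re-run.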
How I built it
I used a library from Google to scan images for text, a library from Apple to detect faces, and then added my own rules for classifying other data that could be PHI (names, email addresses, phone numbers, etc.). All classification happens on device, so no information leaks over the network.
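To give a feel for what rule-based classification of recognized text might look like, here is a minimal Python sketch. It is not the app's actual rule set: the two patterns, the US-style phone format, and the `find_phi_spans` helper are all assumptions made for illustration. Names are left out because they generally need more than a regular expression.

```python
import re

# Illustrative patterns only (assumptions, not the project's real rules).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# US-style numbers such as 555-123-4567 or (555) 123-4567, optional +1 prefix.
PHONE_RE = re.compile(
    r"(?:\+?1[\s.-]?)?(?:\(\d{3}\)|\d{3})[\s.-]?\d{3}[\s.-]?\d{4}"
)


def find_phi_spans(text):
    """Return (label, start, end) for every match, sorted by start offset."""
    spans = []
    for label, pattern in (("email", EMAIL_RE), ("phone", PHONE_RE)):
        for match in pattern.finditer(text):
            spans.append((label, match.start(), match.end()))
    return sorted(spans, key=lambda span: span[1])


# Example: text as it might come back from an on-device text recognizer.
sample = "Call 555-123-4567 or mail jane.doe@example.com"
for label, start, end in find_phi_spans(sample):
    print(label, sample[start:end])
```

Returning character offsets rather than just the matched strings would let each match be mapped back to the bounding box of the recognized text it came from, which is what would need to be blurred.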
Challenges I ran into
The accuracy of its predictions. I think it works very well for a hackathon project, but it is not perfect. That's why I need YC backing, and why I am a hopeful candidate for your next round.
Accomplishments that I'm proud of
It seems to be about 90% accurate!
What I learned
Computer vision is hard but can be really powerful when accurate.
What's next for PHIlter
Getting into YC's next batch & recruiting data scientists and engineers to make the world's best health information classification software. I'd like an Android version of what I just built, and a desktop client which can take in documents and scrub the PHI out of them.
Built With
- computer-vision
- ios
- machine-learning