With the emergence of new techniques of machine learning, and the possibility of using algorithms to perform tasks previously done by human beings, as well as to generate new knowledge, we again face a set of new ethical questions. These questions not only concern the possibility of harm by the misuse of data but also questions of how to preserve privacy where data is sensitive, how to avoid bias in data selection, how to prevent disruption and “hacking” of data, and issues of transparency in data collection, research, and dissemination.

What it does

Simply put, sensitive fields need to be extracted out of the raw text as named entities. Once anonymized, the information is private enough and processed for unstructured information management and analysis.

How we built it

  • Django Backend
  • React.Js Frontend
  • NLP

Challenges we ran into

  • Migrating models

Accomplishments that we're proud of

  • Implementing a basic, yet real-world solution
  • Our first Django Web App + API

What we learned

  • NLP
  • Encryption
  • Web Development with Django & Python

What's next for cTakes-Extension

Integration with cTakes Pipeline and other UIMA tools

Share this project: