Themis

Inspiration

Human trafficking is one of the most serious problems in the nation, and prosecuting traffickers is especially difficult. The sheer volume of evidence in a case makes thorough analysis nearly impossible given the limited manpower and time available. As a result, cases rely heavily on victim testimony, which puts a lot of stress on already traumatized victims. We wanted to relieve that pressure and help prosecutors convict traffickers more efficiently using tangible evidence, which Themis helps identify through automated data processing and analysis.

What it does

Themis takes all the data in a case folder and extracts the contents of each file as text, imagery, or audio. Each type of data is passed to a specialized analyzer that identifies the important evidence within it, or determines whether it is useful evidence at all.

Our text analyzer looks for similar keywords and language, since traffickers often use coded language when trafficking their victims online on popular sites such as Craigslist or Snapchat.

Our facial recognition software can identify a human face in an image, and will be able to match faces to identify traffickers in videos such as security footage. Eventually, it will be able to model these faces and find them elsewhere in the case files, even under vastly different lighting conditions and angles.

The audio analyzer can transcribe recordings such as wiretap tapes and match the voices on them to offenders. This is especially helpful for jailhouse phone calls, where suspects often make many calls for the purpose of witness tampering. The audio analyzer can transcribe these calls and, combined with our text pattern system, give law enforcement an idea of what a call might be about.

How we built it

The website was built with JavaScript in Velo by Wix, and the three analysis programs were written in Python. The text analyzer uses TF-IDF to quantify how important each word is in a given text, and cosine similarity to compare texts. If two texts have a high cosine similarity, they share many of the same words or phrases and may cover similar topics or themes. The facial recognition software uses OpenCV's cascade classifier to detect facial structure in images. Eventually, we will build on this foundation so that it can match faces to given subjects and work on video and on degraded images. The audio analyzer uses deep learning on sound waves. After converting the audio files, it generates spectrograms, which give a visual representation of the "loudness" of a subject's voice, and calculates the spectral rolloff, zero-crossing rate, and chroma features.
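
To illustrate the text-analysis step, here is a minimal sketch of TF-IDF weighting followed by cosine similarity, written with only the Python standard library. It is not the project's actual code: the function names, the smoothed IDF formula, the simple whitespace tokenizer, and the sample strings are all assumptions made for this example.

```python
import math
from collections import Counter

PUNCT = ".,!?;:\"'()"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(PUNCT) for w in text.lower().split())
    return [w for w in words if w]


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (smoothed TF-IDF, an assumed variant)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Term frequency times smoothed inverse document frequency.
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample texts, invented for this example.
docs = [
    "new in town, available tonight, cash only",
    "available tonight and new in town",
    "used bicycle for sale, good condition",
]
vecs = tfidf_vectors(docs)
print(cosine_similarity(vecs[0], vecs[1]))  # the two texts share several terms
print(cosine_similarity(vecs[0], vecs[2]))  # the two texts share no terms
```

A score close to 1 means two texts use nearly the same weighted vocabulary, while a score of 0 means they have no terms in common. Texts that score high against a known example could then be flagged for a human reviewer.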

Challenges we ran into

Coding the cosine similarity formulas was especially difficult and required a lot of research, teamwork, and trial and error. The dlib library, which we intended to use in our facial recognition code, failed to install due to compatibility issues with Windows. Without it, our code could not run facial recognition on video or compare faces. However, we worked as a team to overcome these challenges and improve our project as much as we could within the time frame, and we are confident that, given more time, we could overcome many of the current obstacles.

Accomplishments that we're proud of

Human trafficking is a very sensitive topic that we were a bit worried about delving into. However, we were really passionate about working toward a solution for this heinous crime, which often remains hidden and taboo in our society. We are proud to shed light on this serious issue and to bring forth some innovative tech ideas that we hope will be of use to society for years to come. And of course, we are proud of our teamwork, and of how we bonded and put our heads together to create something all of us are passionate about and believe can change the world for the better.

What we learned

Human trafficking is a very taboo topic that many of us didn't know much about before working on Themis. During our research, we learned so much about this crime and how it targets our children and loved ones, and how much victims suffer even after their traffickers are caught, because so much of the burden of proof rests on their testimony, which isn't fair at all. We learned that if we can shift the burden of proof onto the tangible evidence hidden in those enormous evidence files, victims will be able to spend more time healing and justice can be served. Additionally, this was our first time working with JavaScript and facial recognition, and we found it super cool to build a website with JavaScript and to analyze faces with Python.

What's next for Themis

With more time, we plan to expand and refine the capabilities of Themis. The text analyzer would work with a larger data set and classify texts by label. The facial recognition program would compare the faces it recognizes, in both video and pictures, and be refined to work on degraded files and on faces that aren't turned directly toward the camera. The audio system would be built out further so it can recognize voice patterns even when the tape is distorted, such as when traffickers intentionally mask their voices by raising the pitch of the recording. If Themis is expanded on a global scale, law enforcement can take a more victim-centered approach, as they will be able to quickly corroborate victim statements and prosecute traffickers. Human traffickers have taken much of their crime online, and it's time we fight back with technology, too.

Built With

cosine
ipython.display.audio
javascript
jupyter
librosa
natural-language-processing
opencv
python
velo
wix

Submitted to

HackNYU 2023
- Winner [MLH] Best Domain Name from Domain.com

Created by

did facial recognition

Kevin Wang
I worked on creating the website, handling user input and file uploads, and the slideshow for the presentation.

Lianna Poblete
I worked on text analysis, using cosine similarity to handle user text input and sift out useful information.

Minghe Yang
Annie Zhang

Updates

Annie Zhang started this project — Feb 18, 2023 10:34 PM EST
