Inspiration

Computational approaches are proving ever more useful for extracting insight from big data. We explore the problem of human trafficking from multiple angles, deploying computational tools to draw meaningful narratives from data that is freely available online.

What it does

Our project looked at three interrelated issues:

1. We leverage global human trafficking statistics to build a clearer picture of how victims are moved between and within countries, and compare this with similar statistics on the perpetrators, broken down by age, gender, and cause of trafficking.
2. We use text mining to extract keywords from reports by and about victims of trafficking.
3. We train classifiers on human trafficking data to describe the relationships between victims of trafficking and their abusers.
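
The keyword-extraction step can be illustrated with a small TF-IDF sketch. The project itself was written in R; this is a minimal Python version, and the report snippets below are invented stand-ins, not the real victim reports:

```python
import math
from collections import Counter

# Invented report snippets for illustration only -- the real reports
# scraped for the project are not reproduced here.
REPORTS = [
    "recruiter promised a waitress job abroad then took her passport",
    "agent promised factory work but withheld wages and the passport",
    "a neighbour reported loud parties and frequent visitors next door",
]

def top_keywords(docs, k=3):
    """Score each word by TF-IDF and return the top k words per document."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents does each word appear?
    df = Counter(w for doc in tokenized for w in set(doc))
    top = []
    for doc in tokenized:
        tf = Counter(doc)
        # Words frequent in this document but rare overall score highest;
        # words appearing in every document get idf = log(1) = 0.
        scores = {w: (tf[w] / len(doc)) * math.log(n / df[w]) for w in tf}
        top.append([w for w, _ in sorted(scores.items(), key=lambda x: -x[1])[:k]])
    return top

print(top_keywords(REPORTS))
```

TF-IDF favors terms that are distinctive to one report over vocabulary shared across the whole collection, which is what lets recruitment-specific language surface as keywords.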

Combining these three approaches, the aim is to provide law enforcement officers and anti-trafficking campaigners with a tool that helps them focus their efforts. Using it, they can identify trends in trafficking, communicate with and identify potential victims more effectively by listening for the right keywords during interviews, and more easily flag at-risk individuals from the nature of their relationships.

How I built it

The project was built primarily with R scripts. Data for the analysis was scraped from free online human trafficking databases and police hotlines. Text-mining methods were used for keyword extraction, and a number of classification algorithms were implemented, including Random Forest, Support Vector Machines, Naive Bayes, and K-Nearest Neighbors. The R Shiny framework was used to build a responsive, dynamic user interface for viewing the results of these analyses.
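
As a minimal sketch of one of the classifiers, here is a categorical Naive Bayes in Python (the project's own implementation was in R; the records below are synthetic, with field names and values invented rather than drawn from the real hotline data):

```python
from collections import Counter, defaultdict

# Synthetic toy records, invented for illustration. Each row:
# (recruiter relationship, contact method) -> outcome label.
TRAIN = [
    (("intimate_partner", "in_person"), "trafficked"),
    (("intimate_partner", "online"), "trafficked"),
    (("job_agent", "online"), "trafficked"),
    (("family", "in_person"), "trafficked"),
    (("stranger", "online"), "not_trafficked"),
    (("friend", "in_person"), "not_trafficked"),
    (("stranger", "in_person"), "not_trafficked"),
]

def train_nb(rows):
    """Fit class priors and per-feature value counts from labeled rows."""
    priors = Counter(label for _, label in rows)
    likelihoods = defaultdict(Counter)  # (feature_index, label) -> value counts
    for features, label in rows:
        for i, value in enumerate(features):
            likelihoods[(i, label)][value] += 1
    return priors, likelihoods, len(rows)

def predict(model, features):
    """Pick the label maximizing P(label) * prod P(value | label), with add-one smoothing."""
    priors, likelihoods, n = model
    best, best_p = None, -1.0
    for label, count in priors.items():
        p = count / n
        for i, value in enumerate(features):
            counts = likelihoods[(i, label)]
            p *= (counts[value] + 1) / (sum(counts.values()) + len(counts) + 1)
        if p > best_p:
            best, best_p = label, p
    return best

model = train_nb(TRAIN)
print(predict(model, ("intimate_partner", "online")))  # -> trafficked
```

Naive Bayes suits this kind of sparse categorical data because each relationship feature contributes an independent likelihood, so even a small labeled set yields usable posteriors.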

Challenges I ran into

- A good number of the databases had no readily downloadable data, or even APIs for accessing the data they held. We had to get creative about extracting the information from websites and PDF/image files, then processing and cleaning it into useful formats.
- Human trafficking needs a multi-pronged approach, and it was a challenge to come up with a method that would examine and combine its different aspects into a single useful narrative.
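
Much of that cleaning came down to pattern-matching semi-structured text into typed records. A minimal Python sketch, assuming a layout like the one below (the snippet and all its numbers are invented for illustration, not taken from any real database):

```python
import re

# Invented example of text as it might come out of a scraped PDF table.
RAW = """
Country: Nigeria   Victims: 1,204   Year: 2016
Country: Ukraine   Victims: 987    Year: 2016
"""

ROW = re.compile(r"Country:\s*(\w+)\s+Victims:\s*([\d,]+)\s+Year:\s*(\d{4})")

def clean(raw):
    """Turn loosely formatted report lines into typed records."""
    rows = []
    for country, victims, year in ROW.findall(raw):
        rows.append({"country": country,
                     "victims": int(victims.replace(",", "")),  # strip thousands separators
                     "year": int(year)})
    return rows

print(clean(RAW))
```

The same pattern generalizes: one regular expression per source layout, with a normalization step that strips separators and coerces types before the records enter the analysis.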

Accomplishments that I'm proud of

We extracted a number of insightful observations, such as that most trafficking actually happens within countries rather than across borders, and that even international trafficking is mostly localized within geographical regions. Our classifier worked quite well at identifying the relationships that most often led to trafficking situations; with more data and access to greater computational resources such as IBM Watson, it could become an important tool for pre-empting trafficking. Likewise, with better data, the keywords we generated would improve how law enforcement interacts with, and draws information from, potential and actual victims of trafficking.

What I learned

Data can be leveraged to great effect in combating human trafficking. However, genuinely useful datasets for developing computational tools to aid this fight are scarce. Computational approaches remain an exciting frontier in the battle against this scourge.

What's next for GMU_CSI_PhD_Crew

Improve and scale up our tool, and incorporate additional capabilities such as image recognition, to do our bit in this fight.

Built With

R, Shiny