Inspiration

When Backpage, ECCIE, and other sites that enabled sex trafficking were shut down, the transactions dispersed, making it difficult for local and federal agencies to monitor these activities. Much of the web traffic has presumably moved to the deep and dark web, but this project was designed around the hypothesis that some of it has been hiding out in the open on large social media sites.

What it does

I converted Epic Solution's Human Trafficking: Street Dictionary PDF into a searchable database and a backend trigger for keyword-targeted scraping of mainstream social media sites like Twitter. After a minimum of two keywords are selected, a limited set of tweets containing those keywords can be scraped from Twitter. By feeding IBM Watson's Knowledge Base Studio the database terms and archived Backpage, Craigslist, and ECCIE ads, each tweet can be rated with a degree of confidence on whether its content is comparable to those ads and should be flagged for monitoring. Furthermore, using Twitter's API to retrieve the friends and followers of the tweeting accounts, and then tracking the UIDs of the mutually followed friends, could reveal a pattern that would enable local and federal agencies to identify the trafficker's social media account.
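The two steps above can be sketched in a few lines. This is a minimal illustration in Python, not the project's actual scraper (which was written in PHP), and it makes no Twitter API calls: the tweets and friend lists are assumed to be already fetched and held in memory. The function names, the `text` field, the placeholder keywords, and the reading of "mutually followed friends" as UIDs followed by two or more flagged accounts are all assumptions made for the example.

```python
import re
from collections import Counter


def matched_keywords(text, keywords):
    """Return the dictionary terms that appear in text as whole words or phrases."""
    lowered = text.lower()
    found = set()
    for kw in keywords:
        # Lookarounds keep "alpha" from matching inside "alphabet".
        pattern = r"(?<!\w)" + re.escape(kw.lower()) + r"(?!\w)"
        if re.search(pattern, lowered):
            found.add(kw.lower())
    return found


def filter_tweets(tweets, keywords, min_matches=2):
    """Keep tweets containing at least min_matches distinct keywords."""
    return [
        t for t in tweets
        if len(matched_keywords(t["text"], keywords)) >= min_matches
    ]


def shared_friends(friends_by_account, min_accounts=2):
    """Map each friend UID to the number of flagged accounts following it,
    keeping only UIDs followed by at least min_accounts of them."""
    counts = Counter()
    for friends in friends_by_account.values():
        counts.update(set(friends))  # count each UID once per account
    return {uid: n for uid, n in counts.items() if n >= min_accounts}


# Hypothetical, neutral sample data standing in for scraped results.
keywords = ["alpha", "bravo"]
tweets = [
    {"id": 1, "text": "Alpha and BRAVO tonight"},
    {"id": 2, "text": "only alpha here"},
    {"id": 3, "text": "alphabet bravo"},
]
flagged = filter_tweets(tweets, keywords)

friends_by_account = {
    "acct1": [10, 20, 30],
    "acct2": [20, 30, 40],
    "acct3": [30, 50],
}
common = shared_friends(friends_by_account)
```

Here only the first tweet passes the two-keyword minimum, and UIDs 20 and 30 surface as friends shared by more than one account. A UID that recurs across many flagged accounts would be the kind of pattern worth a closer look.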

How I built it

As this was a hackathon of one, only the converted database and the Twitter scraper were completed. Mentors Blaine and Brian provided access to the archived classified ads and the slang dictionary, respectively, and IBM-Watson mentors Lee and Julian helped me set up accounts and work with IBM-Watson; the custom dictionary is still pending. Julian and Lee also helped me scale this project into a realistic MVP.

Challenges I ran into

Twitter scraping constraints kept locking me out of my tests.

Accomplishments that I'm proud of

I found a former ECCIE ad on Twitter, along with several similar ads, which supported the hiding-in-the-open hypothesis I began this project with.

What I learned

How to scrape with PHP (I've only ever used Python for this), how to register for and use Twitter, and more git commands.

What's next for Newbee

Focusing on the frontend and adding the IBM-Watson ML component to validate target tweets.
