Inspiration

Many non-democratic countries, such as China and Iran, censor certain websites for a variety of reasons, restricting free speech and other human rights.

What it does

This project is a first step toward identifying the parameters that determine whether a website is likely to be censored in a specific country. The final product is a web app that estimates how likely a given country is to censor a website. This predictive model can be used by companies looking to expand globally, activists who need to know which sites will alert their government, and many others.

How we built it

- Hyperquack Echo raw data from Censored Planet
- Python in Jupyter Notebook to extract, clean, and analyze the data
- Figma for UI design
- Visual Studio Code for the front end
- Streamlit to build the app

Challenges we ran into

- Clean, labelled data is hard to find
- Processing 26 million data samples efficiently
- Manually categorizing data leaves room for individual bias
- We could analyze only one data set because of the time limit and the amount of data
- We had to pick up several languages for the front end
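One common way to keep millions of samples manageable is to stream the file one record at a time and keep only running counts in memory. The snippet below is a minimal sketch of that idea, not the code we actually ran. It assumes the measurements are stored as one JSON object per line, and the field names `country` and `anomaly` are hypothetical placeholders rather than the real Hyperquack schema.

```python
import io
import json
from collections import defaultdict


def blocked_rate_by_country(lines):
    """Stream JSON-lines records and return {country: share flagged as anomalous}.

    Assumed (hypothetical) record shape: {"country": "IR", "anomaly": true}.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for line in lines:  # one record at a time, so memory use stays flat
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed rows instead of aborting the whole run
        if not isinstance(record, dict) or "country" not in record:
            continue  # skip rows that do not match the assumed shape
        country = record["country"]
        totals[country] += 1
        if record.get("anomaly"):
            flagged[country] += 1
    return {c: flagged[c] / totals[c] for c in totals}


# Tiny in-memory stand-in for a real measurement file (made-up values).
sample = io.StringIO(
    '{"country": "IR", "anomaly": true}\n'
    '{"country": "IR", "anomaly": false}\n'
    '{"country": "CN", "anomaly": true}\n'
    'not json\n'
)
rates = blocked_rate_by_country(sample)
```

Because the function accepts any iterable of lines, the same call should work on an open file handle, so the full data set never has to be loaded at once.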

Accomplishments that we're proud of

Despite having little experience, we were able to pull off a formal data science project and create a web app.

What we learned

- Data analysis using Python
- HTML/CSS for the front end
- The intersection between tech and politics

What's next for Detecting Censorship

- Turn the data analysis into a probability function that gives the likelihood of a website being censored, and connect that script to the web app
- Train an AI model that would determine the popularity of the website (i.e., how likely it is to be censored by a given country)
- Develop a product that could help website designers format websites so they are less likely to be censored
- Collect data from more sources, dates, and countries
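A probability function like the one planned above could start out very simple. The sketch below is only one possible starting point under our own assumptions, not something the project has implemented: it estimates the chance of censorship for a (country, category) pair from labelled observations, using additive (Laplace) smoothing so that rarely seen pairs are not scored as 0% or 100%. The name `fit_censorship_rates`, the `alpha` parameter, and the sample categories are all hypothetical.

```python
from collections import Counter


def fit_censorship_rates(observations, alpha=1.0):
    """Return a function estimating P(censored | country, category).

    observations: iterable of (country, category, censored) tuples.
    alpha: additive (Laplace) smoothing strength; an assumed default.
    """
    totals = Counter()
    censored_counts = Counter()
    for country, category, censored in observations:
        totals[(country, category)] += 1
        if censored:
            censored_counts[(country, category)] += 1

    def probability(country, category):
        key = (country, category)
        # With no data for a pair, the estimate falls back to 0.5.
        return (censored_counts[key] + alpha) / (totals[key] + 2 * alpha)

    return probability


# Made-up observations for illustration only.
data = [
    ("IR", "news", True),
    ("IR", "news", True),
    ("IR", "news", False),
    ("IR", "shopping", False),
]
estimate = fit_censorship_rates(data)
```

Calling `estimate("IR", "news")` then returns a smoothed rate for that pair, and the web app could call the same function to display a likelihood for the user's chosen country and site category.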
