Inspiration
Many non-democratic countries, such as China and Iran, censor certain websites for a variety of reasons, limiting free speech and human rights.
What it does
This project is an introduction to identifying the parameters that determine whether a website is likely to be censored in a specific country. The final product is a web app that estimates how likely a given country is to censor a website. This predictive model can be used by companies looking to expand globally, activists who need to know which sites will alert their government, and many others.
How we built it
- Hyperquack Echo raw data from Censored Planet
- Python to extract, clean, and analyze the data in Jupyter Notebook
- Figma for UI design
- Visual Studio Code for front-end
- Streamlit to build the app
Challenges we ran into
- Clean, labelled data is hard to find
- Figuring out how to work efficiently with 26 million data samples
- Manually categorizing data leaves room for individual biases
- Could only analyze one data set because of the time limit and the amount of data
- Had to pick up several languages for front-end
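One way to handle a data set of this size is to stream it record by record instead of loading everything into memory. The sketch below is only an illustration under stated assumptions: it assumes the raw data is stored as one JSON object per line, and the field names `country` and `anomaly` are hypothetical placeholders, not the confirmed Hyperquack Echo schema.

```python
import json
from collections import Counter
from typing import Dict, Iterable, Tuple


def count_by_country(lines: Iterable[str]) -> Dict[str, Tuple[int, int]]:
    """Stream JSON-lines records and tally (total, anomalous) per country.

    Only two counters are kept in memory, so the input can be arbitrarily
    large. The field names "country" and "anomaly" are assumed for
    illustration.
    """
    totals: Counter = Counter()
    anomalies: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed rows instead of aborting the whole run
        if not isinstance(record, dict):
            continue  # ignore rows that are valid JSON but not objects
        country = record.get("country")  # hypothetical field name
        if country is None:
            continue
        totals[country] += 1
        if record.get("anomaly"):  # hypothetical field name
            anomalies[country] += 1
    return {c: (totals[c], anomalies[c]) for c in totals}
```

Because a file object is itself an iterable of lines, the function can be called as `count_by_country(open(path))` for any path to such a file, and it reads one line at a time.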
Accomplishments that we're proud of
Despite having little experience, we were able to pull off a formal data science project and create a web app.
What we learned
- Data analysis using Python
- HTML/CSS for front-end
- The intersection between tech and politics
What's next for Detecting Censorship
- Take all the data analysis and create a probability function that gives the likelihood of a website being censored; connect the script with the web app
- Train an AI model that would determine the popularity of the website (i.e. how likely it is to be censored by a given country)
- Develop a product that could help website designers better format websites so that they won’t be censored
- Collect data from more sources, dates, and countries
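As a rough idea of what the planned probability function could look like, here is a minimal sketch, not the project's actual model. It assumes that counts of anomalous and total measurements are already available for a given country and website category, and it treats a flagged anomaly as a proxy for censorship, which is itself an assumption. The function name and the smoothing parameter `alpha` are hypothetical.

```python
def censorship_likelihood(anomalous: int, total: int, alpha: float = 1.0) -> float:
    """Laplace-smoothed estimate of the probability that a site is censored.

    anomalous: number of measurements flagged as anomalous (assumed proxy)
    total:     number of measurements overall
    alpha:     smoothing strength; with alpha=1 and no data the estimate is 0.5
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if anomalous < 0 or total < 0 or anomalous > total:
        raise ValueError("counts must satisfy 0 <= anomalous <= total")
    # Add alpha pseudo-observations to each of the two outcomes so that
    # small samples do not produce extreme estimates of exactly 0 or 1.
    return (anomalous + alpha) / (total + 2 * alpha)
```

Smoothing matters here because some country and category pairs may have very few measurements; without it, a single anomalous sample would yield a likelihood of 100%.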
Built With
- figma
- python
- streamlit
