UNSD Goals Knowledge Graph
Contributors and Contact Information: Khyati and Ninad Majmudar
Problem Statement addressed:
GRAPH FOR BETTER WORLD - Enable Search For United Nations Sustainable Development Goals
Inspiration
The UN Sustainable Development Goals (SDGs) are the 17 Goals defined by thousands of global community representatives together with the United Nations, covering interconnected social, environmental, and economic targets. They span interconnected systems, where impact in one target area is often related to dynamic phenomena in other areas. In addition, a great deal of information is held by various agencies in unstructured form, and it can help policy makers make the right decisions when implementing the Goals. Hence, there is a need for a robust Knowledge Graph System that can parse this unstructured data and uncover the hidden connections within it.
What it does
Our Solution is a Knowledge Graph Application on the Cloud. It scrapes PDFs, APIs, Text Files and HTML Pages to gather unstructured data, then runs Machine Learning Algorithms to extract key Concepts and Topics for each Goal and each Document. It then runs Community Detection Algorithms to determine which Goals and Documents are inter-connected and share common topics. The result is displayed in a user-friendly Web App.
How we built it
Our Solution is built on TigerGraph, leveraging its robust Graph DB and its capability to run Machine Learning Algorithms. There are 4 key components to the solution:
1. Web/PDF Scraper & Data Extractor
Using Python and its modules, we have built a Data Extractor that can scrape data from PDFs, APIs, Text Files and HTML Pages. The Extractor scrapes relevant data from various UN, World Bank and other reports and passes it on to the ML Algorithms.
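As an illustration of the HTML part of this step only, here is a minimal sketch that pulls the visible text out of an HTML page using Python's standard library. It is not the project's actual Extractor: the names `TextExtractor` and `extract_text` are hypothetical, and PDF and API sources would need their own handling.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the visible text of an HTML string with whitespace normalized."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.text().split())
```

The resulting plain text is the kind of input the topic-extraction step would then consume.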
2. NLP - Machine Learning Algorithms
We use Latent Dirichlet Allocation (LDA) with collapsed Gibbs sampling as well as Non-Negative Matrix Factorization (NMF) to extract Topics and Keywords for each of the Data Sources as well as the 17 UNSD Goals, and store them in the TigerGraph Database.
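To make the LDA part more concrete, the following is a small, pure-Python sketch of collapsed Gibbs sampling for LDA. It is an illustration under assumptions, not the project's code: the function names `lda_gibbs` and `top_words`, the hyperparameters `alpha` and `beta`, and their default values are all chosen here for the example. NMF is not shown.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs is a list of token lists. Returns (doc_topic, topic_word, vocab),
    where the first two are count matrices stored as nested lists.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove this token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Unnormalized p(z = k | all other assignments).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1
    return doc_topic, topic_word, vocab


def top_words(topic_word, vocab, n=3):
    """Return the n highest-count words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]
```

The top words of each topic could then serve as the Topic and Keyword vertices linked to each Document or Goal in the graph.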
3. TigerGraph Machine Learning
Once all the Nodes & Edges are available in TigerGraph, we run a modified Community Detection (Label Propagation) algorithm on TigerGraph itself to cluster the various Documents and Goals into communities, in order to understand the inter-linking of the Goals and Documents.
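For readers unfamiliar with Label Propagation, here is a minimal Python sketch of the plain algorithm on an undirected graph. It is not the modified version described above and it does not run inside TigerGraph; the function name `label_propagation`, the fixed update order, and the smallest-label tie-break are assumptions made to keep the example deterministic.

```python
from collections import Counter


def label_propagation(edges, max_rounds=20):
    """Assign a community label to every vertex of an undirected graph."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Every vertex starts in its own community.
    labels = {v: v for v in neighbors}

    for _ in range(max_rounds):
        changed = False
        # Asynchronous updates in a fixed order keep the run deterministic.
        for v in sorted(neighbors):
            counts = Counter(labels[n] for n in neighbors[v])
            best = max(counts.values())
            # Break ties by taking the smallest label.
            new_label = min(l for l, c in counts.items() if c == best)
            if new_label != labels[v]:
                labels[v] = new_label
                changed = True
        if not changed:
            break
    return labels
```

Vertices that end up with the same label form one community, so Documents and Goals sharing a label would be displayed as inter-linked.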
4. React / Django Web App
Finally, we have a responsive React App with a Python Django API, which connects to the TigerGraph DB using the pyTigerGraph module. The app provides a user-friendly interface to understand the inter-linking of Goals and Documents, as well as to gather common topics between various documents.
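The "common topics" view can be thought of as a set intersection over each pair of documents. The sketch below shows that idea in plain Python, assuming the topics per document have already been fetched from the graph; the function name `shared_topics` and the input shape are hypothetical, and the real app may compute this with a graph query instead.

```python
from itertools import combinations


def shared_topics(doc_topics):
    """Map each pair of documents to the sorted topics they have in common.

    doc_topics maps a document name to an iterable of topic strings.
    Pairs with no topic in common are left out of the result.
    """
    shared = {}
    for a, b in combinations(sorted(doc_topics), 2):
        common = set(doc_topics[a]) & set(doc_topics[b])
        if common:
            shared[(a, b)] = sorted(common)
    return shared
```

A call such as `shared_topics({"report_a": {"poverty", "health"}, "report_b": {"health", "water"}})` would link the two reports through the topic they share.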
Key aspects of the solution
Impactfulness
The solution is aimed at researchers and policy makers across the world who are striving to meet and enforce the 17 UN Sustainable Development Goals. It helps them understand the inter-connections between the Goals and how one Goal can affect another. Moreover, it helps to inter-connect the rich contextual data published in reports on various development publication portals, such as the World Bank, United Nations, International Finance Corporation, and similar.
Innovativeness
The solution is based on the concept of Knowledge Graphs, which are a powerful way to unlock information from unstructured data sources. We have built a Data Scraper which can read data out of PDFs, HTML Pages and Text Files, as well as connect to various Open Data APIs, to gather raw data and convert it into a meaningful Knowledge Graph using Latent Dirichlet Allocation (LDA) with collapsed Gibbs sampling as well as Non-Negative Matrix Factorization (NMF). It then uses the power of TigerGraph to run ML algorithms on the Knowledge Graph to uncover hidden connections, and presents the data in an intuitive Web App.
Ambitiousness
The graph solution is highly scalable and can accommodate the 33,000+ document sources mentioned in the problem statement. In fact, it can go beyond that and scrape other data sources and documents found on the internet.
Applicability
The solution is production ready: it can gather data on the go and help uncover insights from the UNSD Goals. Having said that, we have designed the solution in a generic form, so that different industries, enterprises, etc. can connect their unstructured data sources and obtain a Knowledge Graph out of the vast data that they possess.
