Inspiration
I grew up in Regent Park when it was still all public housing. I remember one Christmas when my sister and I got donated toys and clothes. We were pretty happy with the toys, and I guess our mom appreciated the new clothes, since she was unemployed and on family benefits, shopping at the Woolworths. There was also some donated food, though I can't remember what it was, maybe Christmas cake, but it made our Christmas. So when I signed up for this hackathon, I thought about my childhood in this city and decided to build something that can help with donations to non-profit orgs.
I checked some of our non-profit orgs' websites and noticed they don't have an AI chatbot. I think the reason is that it's expensive for a non-profit to hire an AI developer to build a chatbot, or to upskill their in-house or volunteer web developer on AI. So I built "Donatobot", an AI chatbot that any non-profit can use as long as their website follows a standard template. The chatbot gives non-profits exposure to users who prefer conversing with a website through free-form questions in natural language, instead of the old-fashioned way of clicking everywhere and reading through fixed text.
What it does
Donatobot is a web app that can be configured for a specific non-profit org. Once configured, it lets any user on the web ask an AI chatbot questions about the non-profit and get answers based on the pages of the non-profit's website.
For example, for this hackathon I configured Donatobot for the Yonge Street Mission's website and Covenant House Toronto's website, since both of these websites follow a standard template with a navigation bar at the top and a menu of options when you hover over each item in the nav bar. Donatobot uses those nav bar items to build a list of topics the user can optionally select from to ask their questions. To see what I mean, please try out my configured deployments of Donatobot at Donatobot for YSM and Donatobot for Covenant House Toronto :)
How I built it
Since I was told the judges aren't technical, I'll try to keep this high level and avoid jargon.
My code for Donatobot depends on 3 settings that need to be configured for a specific deployment: (1) the non-profit's website URL, (2) the HTML tag of the website's navbar (for example div or nav), and (3) the CSS class of that tag. That's it. With these 3 settings, the code can be deployed to a web server and an AI chatbot will be running, ready to answer any questions about the non-profit based on its website pages.
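To make the three settings concrete, here is a minimal sketch of how they might be grouped in Python. The class name, field names, and example values are my own placeholders for illustration, not the actual names in the Donatobot code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DonatobotConfig:
    """The three per-deployment settings (all names here are hypothetical)."""
    site_url: str      # (1) the non-profit's website URL
    navbar_tag: str    # (2) HTML tag of the navbar, e.g. "div" or "nav"
    navbar_class: str  # (3) CSS class on that tag

    def __post_init__(self):
        # Fail early on an obviously malformed URL.
        if not self.site_url.startswith(("http://", "https://")):
            raise ValueError("site_url must start with http:// or https://")


# Placeholder values, not a real deployment.
config = DonatobotConfig(
    site_url="https://example.org",
    navbar_tag="nav",
    navbar_class="main-menu",
)
```

Keeping the settings in one small immutable object means the rest of the code never needs to know which non-profit it is serving.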
To get the data from the website pages into a form that can be used for machine learning and thus AI, I wrote a loader script that parses the site's navigation bar to get the pages referenced by its submenus. The content from each page is extracted and split into chunks of text. Each chunk of text is embedded, meaning it's converted from text into a vector, which is a list of numbers (a point in a space with many dimensions) that represents the semantic meaning of the text. This is why the page contents need to be chunked: it gives usable-sized units of meaning, like paragraphs and sections. The vectors are finally loaded into a vector database, which is secured by a secret key. Along with each vector, the original chunk of text is stored, together with the name of the submenu item for the page the text was retrieved from.
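The first two loader steps (finding the pages linked from the navbar, then chunking page text) can be sketched as below. This is only an illustration under my own assumptions: it uses Python's standard-library html.parser so it runs without extra installs, whereas the project itself lists Beautiful Soup, and the names NavLinkParser and chunk_text, the sample HTML, and the 500-character limit are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NavLinkParser(HTMLParser):
    """Collects (link text, absolute URL) pairs found inside the navbar element."""

    def __init__(self, navbar_tag, navbar_class, base_url):
        super().__init__()
        self.navbar_tag = navbar_tag
        self.navbar_class = navbar_class
        self.base_url = base_url
        self.depth = 0            # 0 means we are outside the navbar
        self.current_href = None  # set while we are inside an <a> in the navbar
        self.current_text = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == self.navbar_tag:
            classes = (attrs.get("class") or "").split()
            # Enter the navbar, or track a nested tag of the same name.
            if self.depth > 0 or self.navbar_class in classes:
                self.depth += 1
        if self.depth > 0 and tag == "a" and attrs.get("href"):
            self.current_href = urljoin(self.base_url, attrs["href"])
            self.current_text = []

    def handle_data(self, data):
        if self.current_href is not None:
            self.current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.current_href is not None:
            title = " ".join("".join(self.current_text).split())
            if title:
                self.links.append((title, self.current_href))
            self.current_href = None
        if tag == self.navbar_tag and self.depth > 0:
            self.depth -= 1


def chunk_text(text, max_chars=500):
    """Packs blank-line-separated paragraphs into chunks of about max_chars."""
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


# Made-up sample page: only links inside <nav class="main-menu"> are kept.
sample_html = (
    '<nav class="main-menu"><ul>'
    '<li><a href="/about">About Us</a></li>'
    '<li><a href="/donate">Donate</a></li>'
    '</ul></nav>'
    '<div><a href="/privacy">Privacy</a></div>'
)
parser = NavLinkParser("nav", "main-menu", "https://example.org")
parser.feed(sample_html)
parser.close()
nav_links = parser.links
```

Each link title can then be stored as the topic for every chunk that comes from that page. A single paragraph longer than max_chars is kept whole in this sketch; a real loader would likely split it further.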
Then, when a user asks a question in the chatbot, the chatbot embeds (converts) the question to a vector (list of numbers) as well, and queries the database to find the most similar chunk vectors. Since the original chunk texts are stored with the chunk vectors, the corresponding text originally retrieved from the website pages is available and is sent to an LLM (AI large language model), in this case one from OpenAI (the company behind ChatGPT).
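To show what "most similar" means, here is a toy sketch of a cosine-similarity search in plain Python. In Donatobot the vector database does this work; the two-number vectors, the record layout, and the function names below are made up for illustration (real embeddings have hundreds or thousands of numbers).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means very similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_vector, records, k=2):
    """Returns the k records whose vectors are closest to the query vector."""
    ranked = sorted(
        records,
        key=lambda r: cosine_similarity(query_vector, r["vector"]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical stored chunks with tiny made-up vectors.
records = [
    {"text": "How to donate online", "vector": [1.0, 0.0]},
    {"text": "Our history", "vector": [0.0, 1.0]},
    {"text": "Monthly giving options", "vector": [0.9, 0.1]},
]

# Pretend this is the embedded question "How can I give money?"
matches = most_similar([1.0, 0.0], records, k=2)
```

The texts of the matching records are what would be passed on to the LLM as context.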
The OpenAI model in turn uses the text sent to it to answer the question, which is sent along with that text in an AI prompt. It sends the answer back to the chatbot, which simply displays it. If the user also selected a topic for the question, the chatbot includes the topic with the embedded question in the query to the vector database. The database then filters the chunk vectors by that topic, narrowing down the text that can be used to answer the question.
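The topic filter and the prompt can be sketched as two small functions. This is an illustration only: in Donatobot the filtering happens inside the vector database, and the prompt wording, record layout, and function names here are my own assumptions rather than the real ones.

```python
def filter_by_topic(records, topic=None):
    """Keeps only chunks from the selected topic; no topic means keep all."""
    if topic is None:
        return list(records)
    return [r for r in records if r["topic"] == topic]


def build_prompt(question, context_chunks):
    """Combines the retrieved website text and the user's question (hypothetical wording)."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical stored chunks, each tagged with its navbar submenu item.
records = [
    {"topic": "Donate", "text": "You can give online or by mail."},
    {"topic": "About Us", "text": "We were founded to serve the community."},
]

donate_chunks = filter_by_topic(records, topic="Donate")
prompt = build_prompt(
    "How can I donate?",
    [r["text"] for r in donate_chunks],
)
```

The resulting prompt string is what would be sent to the LLM, and the reply is shown to the user as the chatbot's answer.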
Although the whole process has many steps and the code isn't simple, for the tech guy who will deploy the code for the non-profit, it's just a matter of configuring 3 settings.
What's next
I noticed some websites of non-profits in the city don't have a FAQ section. So in the next version of Donatobot, I plan to implement a FAQ feature where questions and answers are generated by AI in real time based on the non-profit's web pages. This will make it easier for new users to learn how the non-profit contributes to our community in this city, and hopefully encourage them to donate, which is the goal of Donatobot.
Built With
- beautiful-soup
- chunking
- cosine-similarity
- embedding
- gradio
- langchain
- llm
- openai
- pinecone
- python
- semantic-similarity
- unstructured-io
- vector-store
