Inspiration
News channels often give biased interpretations towards political candidates, and candidates often have thousand-page economic & social plans. Who has time for that? We wanted to create an easier way for voters to get direct answers about what each candidate will do for their largest problems.
What it does
The Ether Engine provides voters with a streamlined, conversational way to explore candidates’ policies. Users select a candidate and ask their most important questions. The chatbot accesses the campaign's website and plans to answer the questions directly. This allows voters to compare policy positions and make more informed decisions. A “Steampunk Mode” option adds a fun twist, making the experience more engaging while maintaining information accuracy. The tool covers both presidential and senate candidates.
How we built it
We developed The Ether Engine using Streamlit for the user interface. We wanted to use an retrieval augmented generation (RAG) model to directly answer from the campaign's website and plans. To retrieve policy content, we used BeautifulSoup to scrape text directly from candidates’ websites, filtering out unrelated elements like donation prompts. For semantic understanding, we employed paraphrase-MiniLM-L3-v2 as a sentence transformer model for embedding and cosine similarity analysis, helping match user questions with the most relevant website sections. The chatbot is powered by OpenAI’s GPT-4o-mini, fine-tuned to provide accurate, context-based responses.
Challenges we ran into
Data Quality: Candidates’ websites vary widely in structure, so filtering out irrelevant content was challenging. We fine-tuned our filters to ensure that only relevant policy information was included. Some candidates are in very uncompetitive states, so they provide very little information on their campaign issues and plans.
Embedding Model Limitations: Balancing performance and accuracy with the embedding model required careful tuning, especially given the real-time nature of user queries.
Bias Reduction: We aimed to avoid any bias in responses by strictly using the candidate's own content, yet ensuring the chatbot could recognize and exclude irrelevant or fundraising-based information.
Accomplishments that we're proud of
Reliable Information Delivery: We successfully created a tool that extracts, filters, and serves relevant candidate policies, making it accessible to users in a friendly chatbot interface.
State-Specific Coverage: The tool covers Senate races across various states, adapting to different candidates and their unique content styles.
Engaging User Experience: The addition of “Steampunk Mode” not only provides accurate information but also makes the user experience engaging and memorable.
What we learned
This project taught us a lot about the challenges of web scraping dynamic, real-world data for natural language models. We gained experience in using similarity-based retrieval methods to maintain relevance in responses and learned about designing conversational experiences with user engagement in mind. Additionally, we explored different ways to handle unstructured data, filtering for accuracy and consistency.
What's next for The Ether Engine
Expanded Candidate Coverage: We plan to add all house & governor candidates. Real-Time Updates: Implementing real-time content updating to ensure policies reflect current stances. User Feedback Integration: Allowing users to provide feedback to continually improve response relevance and accuracy. Enhanced Modes and Themes: Introducing more interactive themes and modes for various user preferences, making the chatbot even more versatile and enjoyable to use. Games: Adding fun games for the user to enjoy exploring different perspectives and policies
Built With
- beautiful-soup
- flask-(for-backend-api-handling)-natural-language-processing-(nlp):-langchain-(for-llm-chaining-and-memory)
- github
- langchain
- natural-language-processing
- open-ai-api
- openai-api-(for-conversational-responses)-data-processing-&-embedding-models:-beautifulsoup-(for-web-scraping-and-content-extraction)
- pandas-(for-data-manipulation)-apis-and-libraries:-openai-api
- plotly
- python
- scikit-learn
- scikit-learn-(for-similarity-computations)
- sentence-transformers-(for-text-embeddings-and-cosine-similarity)-visualization:-plotly-(for-interactive-maps-and-data-visualization)-data-storage:-chromadb-(for-storing-and-managing-embeddings)
- sentence-transformers-api
- streamlit
- streamlit-cloud
Log in or sign up for Devpost to join the conversation.