Inspiration
"50 Shades of Green Tape" was inspired by the need for transparency and efficiency in clean energy deployment and environmental permitting. The team envisions a system where everyone with a stake in clean energy, from American citizens to developers, agencies, and decision makers, can easily access and use permitting data and environmental impact information at a local level.
What it does
This project is a Retrieval-Augmented Generation (RAG) system that provides geospatial insights over the NEPATEC 1.0 dataset. It uses document embedding and similarity search to surface semantic and geospatial insights from a database of over 300,000 embeddings drawn from 1,000 projects. Users can query information about environmental impacts, permitting processes, and clean energy deployment at the county level.
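To make the retrieval step concrete, here is a minimal sketch of a county-filtered similarity search. It is not the project's actual code: the `Chunk` type, the `search` function, the three-dimensional toy vectors, and the county labels are all assumptions made for illustration. Real embeddings from an embedding model have far more dimensions, and a production system would likely push this ranking into the vector database rather than doing it in Python.

```python
import math
from typing import NamedTuple, Optional


class Chunk(NamedTuple):
    """One embedded piece of a permitting document (hypothetical shape)."""
    text: str
    county: str
    embedding: list


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_embedding, chunks, county: Optional[str] = None, top_k: int = 3):
    """Return up to top_k (score, chunk) pairs, optionally limited to one county."""
    candidates = [c for c in chunks if county is None or c.county == county]
    scored = [(cosine_similarity(query_embedding, c.embedding), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy data for illustration only; not taken from NEPATEC 1.0.
TOY_CHUNKS = [
    Chunk("Wetland disturbance during turbine construction", "Kern County, CA", [0.9, 0.1, 0.0]),
    Chunk("Transmission line right-of-way permit", "Kern County, CA", [0.1, 0.9, 0.0]),
    Chunk("Migratory bird impact assessment", "Benton County, WA", [0.8, 0.2, 0.1]),
]

if __name__ == "__main__":
    for score, chunk in search([1.0, 0.0, 0.0], TOY_CHUNKS, county="Kern County, CA"):
        print(f"{score:.3f}  {chunk.county}  {chunk.text}")
```

Filtering by county before ranking keeps the results local; dropping the `county` argument ranks across every county instead.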
How we built it
- NEPATEC 1.0 as the base dataset
- Document chunking to split large texts into pieces small enough to embed
- OpenAI embeddings to create vector representations of documents and queries
- geopy geocoding to resolve location data
- Similarity search to find relevant documents and locations
- Kepler.gl to visualize geospatial data
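The chunking step in the list above can be sketched as follows. This is one common approach (fixed-size word windows with overlap), not necessarily the one the team used; the function name `chunk_text` and the default `chunk_size` and `overlap` values are assumptions chosen for illustration.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into word windows of chunk_size words.

    Consecutive windows share `overlap` words, so a sentence that straddles
    a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a window has reached the end of the document.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(10))
    for piece in chunk_text(sample, chunk_size=4, overlap=1):
        print(piece)
```

Each chunk would then be sent to the embedding model and stored alongside its geocoded location, which is what lets a single document contribute many embeddings.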
Challenges we ran into
The team did not document specific challenges, but they likely included:
- Processing and embedding a large volume of documents (over 300,000 embeddings)
- Integrating geospatial data with textual information
- Designing an efficient similarity search algorithm for quick retrieval of relevant information
- Creating a user-friendly interface for various stakeholders with different needs
Accomplishments that we're proud of
- Successfully processing documents from 1,000 projects into over 300,000 embeddings
- Creating a system that can provide both semantic and geospatial insights
- Developing a solution that addresses the needs of multiple stakeholders in the clean energy and environmental permitting space
What we learned
- Large-scale document processing and embedding techniques
- Integration of geospatial data with text-based information
- The complexities of the NEPA process and environmental permitting
- The importance of data accessibility in decision-making for clean energy deployment
What's next for 50 Shades of Green Tape
- Mapping the most relevant environmental impacts for every county based on embedding similarity search
- Identifying federal and local agencies involved in the permitting process for each county
- Conducting sentiment analysis of applications, including approved and unapproved ones, for each county
- Potentially expanding the system to cover more regions or types of environmental data
- Refining the user interface to make it more interactive and accessible for different stakeholder groups
Built With
- docker
- embeddings
- fastapi
- geopy
- kepler
- openai
- pgvector
- postgresql
- python
- react