Inspiration
The Department of Defense is one of the biggest investors in the world. We wanted to extract valuable information from its contracts so that the DoD and its contractors can make informed decisions, and so that the public gets more transparency into how the money is spent.
What it does
The project uses generative AI to extract relevant information from Department of Defense contracts, tabulates it, and visualizes the resulting insights.
How we built it
- Scraped the contracts published at https://www.defense.gov/News/Contracts using Python's BeautifulSoup library for HTML parsing.
- Extracted the contract details into JSON format using the Gemini API.
- Tabulated the resulting JSON string into a CSV file.
- Classified each contract by the type of work it covers.
- Visualized and highlighted insights such as the locations where most Department of Defense (DoD) projects are taking place, the mean cost per contracting company, and the domains to which the projects belong.
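The tabulation and aggregation steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the sample records, the field names (`contractor`, `location`, `amount_usd`), and the helper functions are all assumptions, and the hard-coded JSON string stands in for whatever the Gemini API returned.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical model output: a JSON array of contract records.
# The field names and values are made up for illustration.
raw_json = """
[
  {"contractor": "Acme Corp", "location": "Norfolk, Virginia", "amount_usd": 1000000},
  {"contractor": "Acme Corp", "location": "San Diego, California", "amount_usd": 3000000},
  {"contractor": "Example Systems", "location": "Norfolk, Virginia", "amount_usd": 500000}
]
"""

FIELDS = ["contractor", "location", "amount_usd"]


def json_to_csv(json_text: str) -> str:
    """Tabulate a JSON array of records into CSV text."""
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for record in records:
        # Keep only the known fields; missing keys become empty cells.
        writer.writerow({field: record.get(field, "") for field in FIELDS})
    return buffer.getvalue()


def mean_cost_per_contractor(csv_text: str) -> dict:
    """Average the contract amount for each contractor in the CSV."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if not row["amount_usd"]:
            continue  # skip rows where no amount was extracted
        totals[row["contractor"]] += float(row["amount_usd"])
        counts[row["contractor"]] += 1
    return {name: totals[name] / counts[name] for name in totals}


csv_text = json_to_csv(raw_json)
means = mean_cost_per_contractor(csv_text)
print(csv_text)
print(means)
```

Writing to an in-memory buffer keeps the sketch self-contained; in practice the same writer could target a file that a visualization tool then loads.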
Challenges we ran into
We hit a hurdle when testing the accuracy of the data extracted from contracts, because very few tools or methods exist for this. Our first approach was to vectorize the data and build a RAG stack, but we ran into difficulties such as splitting the contracts into chunks. We then settled on validating the data by comparing the insights generated from it against domain knowledge.
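For context on the chunking problem, here is one common baseline: fixed-size word windows with overlap. It is only a sketch under assumptions; the function name `chunk_words` and the default sizes are hypothetical, and this is not necessarily the strategy we attempted.

```python
def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list:
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(str(i) for i in range(12))
print(chunk_words(sample, size=5, overlap=2))
```

The overlap keeps a contract detail that straddles a boundary visible in both neighboring chunks, at the cost of some duplicated text in the vector store.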
Accomplishments that we're proud of
- Integration of generative AI in our project to tabulate defense contracts.
- Classification of contracts into different fields (Maintenance, Manufacturing) based on their purpose.
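To illustrate what classifying by purpose can look like, here is a simple keyword-counting baseline. This write-up does not describe the actual classification method, so treat this as an assumption-laden sketch: the keyword lists, the `Other` fallback, and the function name `classify_contract` are hypothetical.

```python
# Hypothetical keyword lists; the project's real categories and rules may differ.
KEYWORDS = {
    "Maintenance": ("maintenance", "repair", "overhaul", "sustainment"),
    "Manufacturing": ("manufacture", "manufacturing", "production", "fabrication"),
}


def classify_contract(description: str) -> str:
    """Return the category whose keywords occur most often, or 'Other' if none match."""
    text = description.lower()
    scores = {
        label: sum(text.count(word) for word in words)
        for label, words in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Other"


print(classify_contract("Depot-level repair and overhaul of engines"))
print(classify_contract("Production of armored vehicles"))
print(classify_contract("Research services"))
```

A baseline like this is also useful as a cross-check against labels produced by a generative model, since disagreements point to contracts worth reviewing by hand.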
What we learned
- Using and integrating the OpenAI API and the Gemini API.
- Extracting data from websites.
- How Domain Knowledge can be used to test and validate data.
- How to visualize data using Tableau.
What's next for AI-Driven Defense Contract Insights
We plan to build an inverted-index search engine over the extracted contracts and add a generative AI chat model that can search through it effectively.
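As a rough picture of the idea, an inverted index maps each term to the set of documents that contain it, so a query only touches the relevant postings. The sketch below is a minimal in-memory version under assumptions: the sample documents, the tokenizer, and the AND-only query semantics are illustrative choices, not a design we have committed to.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents: dict) -> dict:
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index: dict, query: str) -> set:
    """Return ids of documents containing every token in the query (AND semantics)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Made-up contract summaries for illustration.
documents = {
    1: "Aircraft engine maintenance in Texas",
    2: "Ship maintenance in Virginia",
    3: "Aircraft parts manufacturing",
}
index = build_index(documents)
print(search(index, "aircraft maintenance"))
```

A chat model could then translate a user's question into query terms and summarize the matching contracts, rather than reading the full corpus on every request.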