Inspiration
This was an interesting problem statement as it addresses the need for efficiency and accuracy in handling defense contract information. By using AI tools and technologies, we wanted to streamline the process of parsing and extracting valuable data from Defense Contract Notices.
What it does
DoDEx is an intelligent parser that automatically extracts key attributes, currently from Defense Contract Notices. It processes each notice to capture relevant information such as Federal Agency, Contract Amounts, Dates, Company Names, Locations and more. It then validates the extracted information and reports an accuracy score for it.
How we built it
We built the solution by combining various AI and machine learning techniques, including a Large Language Model (LLM), information extraction, and data analysis. First, we scraped the Defense Contracts website from the provided source using web scraping tools. Next, we used the LLM to parse the text and identify relevant entities and attributes within the contract notices. We extracted information such as Federal Agency names, contract amounts, dates, and company names.
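The extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the field names, the `build_prompt` and `extract_attributes` helpers, and the `llm` callable are all hypothetical, and the model call is passed in as a plain function so the example runs without an API key.

```python
import json
from typing import Callable, Dict, Optional

# Hypothetical field list, based on the attributes named in this write-up.
FIELDS = ["federal_agency", "contract_amount", "date", "company_name", "location"]


def build_prompt(notice_text: str) -> str:
    """Ask the model for a JSON object with exactly the expected keys."""
    keys = ", ".join(FIELDS)
    return (
        "Extract the following attributes from the defense contract notice "
        f"below and reply with a single JSON object with keys: {keys}. "
        "Use null for anything that is not stated.\n\n"
        f"Notice:\n{notice_text}"
    )


def extract_attributes(
    notice_text: str, llm: Callable[[str], str]
) -> Dict[str, Optional[str]]:
    """Send the prompt to `llm` and parse its reply into a fixed-shape dict."""
    reply = llm(build_prompt(notice_text))
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        data = {}  # unparseable reply: treat every field as missing
    if not isinstance(data, dict):
        data = {}
    # Keep only the expected keys; anything the model omitted becomes None.
    return {field: data.get(field) for field in FIELDS}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption: any str -> str function works)."""
    return json.dumps({
        "federal_agency": "Army",
        "contract_amount": "$12,500,000",
        "company_name": "Example Corp.",
        "location": "Dayton, Ohio",
    })


record = extract_attributes("Example Corp., Dayton, Ohio, was awarded ...", fake_llm)
```

Swapping `fake_llm` for a function that calls a hosted model would leave the rest unchanged. Forcing the reply into a fixed set of keys keeps malformed or partial model output from propagating downstream.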
Challenges we ran into
We faced several challenges, including crafting the right prompt for our LLM and finding a reliable tool for validating its generated output. Validation is tricky because it is basically a paradox: "How do I know my LLM is generating correct data? Because my LLM says so!!" We found that the answer is Retrieval-Augmented Generation (RAG), which lets us check the output against the source text instead of trusting the model's own judgment.
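One simple way around the self-validation problem described above is to check each extracted value against the source notice rather than asking the model to grade itself. The sketch below is a deliberately simplified stand-in for retrieval-based validation, not the project's implementation: it uses plain string matching instead of embeddings, and the function names and normalization rules are assumptions.

```python
import re
from typing import Dict, Optional


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def grounding_report(source: str, extracted: Dict[str, Optional[str]]) -> Dict[str, bool]:
    """Mark each non-empty extracted value as supported if it appears in the source."""
    haystack = normalize(source)
    return {
        field: normalize(value) in haystack
        for field, value in extracted.items()
        if value  # skip None and empty strings: there is nothing to verify
    }


def grounding_score(report: Dict[str, bool]) -> float:
    """Fraction of checked fields found in the source (0.0 if nothing was checked)."""
    return sum(report.values()) / len(report) if report else 0.0


notice = (
    "Example Corp., Dayton, Ohio, was awarded a $12,500,000 contract "
    "by the Army on March 1, 2024."
)
extracted = {
    "company_name": "Example Corp.",
    "contract_amount": "$12,500,000",
    "location": "Dayton,  Ohio",
    "federal_agency": "Navy",  # not in the notice, so it should be flagged
    "date": None,              # skipped by the report
}

report = grounding_report(notice, extracted)
score = grounding_score(report)
```

A check like this only catches values that are absent from the source. A value that appears in the notice but was assigned to the wrong field would still pass, so it works best as a first filter before a stricter validation step.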
Accomplishments that we're proud of
Despite the challenges, we're proud to have developed a highly effective solution that addresses the complex requirements of extracting and categorizing defense contract data in a limited amount of time. Our system not only streamlines the process but also enhances the accuracy and reliability of the extracted information through advanced validation techniques.
What we learned
Through the development of this project, we gained valuable insights into the intricacies of data extraction, categorization, and validation, particularly within the context of defense contracts. We deepened our understanding of the challenges involved in handling large volumes of heterogeneous data and honed our skills in applying advanced techniques to overcome them. Additionally, we learned the importance of robust validation mechanisms in ensuring the accuracy and reliability of extracted data, especially in critical domains such as defense.
What's next for DoDEx
Looking ahead, we aim to refine the prompt to increase accuracy, and we also plan to extend our solution to other domains like law, education, healthcare and more!
Built With
- generativeai
- guardrails
- openai
- provenanceembedding
- pycharm
- python
- rag
- vscode
