Inspiration
Smart contract audits are essential, but the supply of competent auditors is nowhere near enough to meet the demand from contracts needing audits. This creates a bottleneck that can delay product launches for weeks or months at a time. I wanted to see if AI could help devs "pre-audit" their Solidity smart contracts to spot and clean up bugs tied to known Solidity vulnerabilities.
What it does
PreFlight accepts pasted Solidity code or uploaded Solidity files as input and evaluates the code for vulnerabilities. It then generates a report that identifies each vulnerability, classifies it, and explains why it is a problem, so developers can remediate these items before sending the contract to professional auditors. I feel this will cut low-level bug finding and reporting out of the auditors' workload, free them up to focus on novel or complex threats, and ultimately result in a faster auditing pipeline for everyone.
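To make the "identify, classify, explain" shape of the report concrete, here is a minimal sketch of how a single finding and a plain-text report could be represented. This is not PreFlight's actual code: the `Finding` class, its field names, the severity labels, and `render_report` are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One reported issue. All field names here are hypothetical."""
    title: str        # short name of the issue
    category: str     # vulnerability class, e.g. "reentrancy"
    severity: str     # assumed labels: "high", "medium", "low"
    explanation: str  # why the issue is a problem


def render_report(findings):
    """Return a plain-text report with the most severe findings first."""
    if not findings:
        return "No known vulnerability patterns found."
    order = {"high": 0, "medium": 1, "low": 2}
    # Unknown severity labels sort last.
    ranked = sorted(findings, key=lambda f: order.get(f.severity, 3))
    lines = []
    for number, finding in enumerate(ranked, start=1):
        lines.append(
            f"{number}. [{finding.severity.upper()}] "
            f"{finding.title} ({finding.category})"
        )
        lines.append(f"   Why it matters: {finding.explanation}")
    return "\n".join(lines)


example_findings = [
    Finding(
        title="Missing zero-address check",
        category="input validation",
        severity="low",
        explanation="Funds or ownership could be sent to an unusable address.",
    ),
    Finding(
        title="External call before state update",
        category="reentrancy",
        severity="high",
        explanation="A malicious callee can re-enter and withdraw repeatedly.",
    ),
]

report = render_report(example_findings)
print(report)
```

Sorting by severity is just one possible design choice; the point is that each entry carries a classification and a human-readable reason, which is what makes the report actionable before a professional audit.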
How we built it
I spent an unhealthy amount of time combing GitHub repos and scooping up things like audit reports, OpenZeppelin contracts, Solidity documentation, and more. There is a full catalog included in the repo. Most of this work was a first-time adventure for me, so I leaned heavily on ChatGPT and the GPT-5 Thinking model for assistance. Ultimately, we came up with a set of seed examples and I was ready to run the baseline eval.
I used Ollama to test gpt-oss:20b on a 15-item test eval to see how it performed "out of the box" on my M1 laptop. It was slow, but it worked (findings in RESULTS.md). I was unable to train the gpt-oss:20b model (more on that below), but I decided to go ahead and validate the pipeline on a smaller model from Hugging Face. I ran a new baseline eval on the SmolLM2-1.7B model, fine-tuned it using teacher forcing, and then evaluated it again on a held-out validation split. The model saw a 39% decrease in validation perplexity, which felt like a pretty good improvement to someone new to the game.
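For readers who, like me, are new to the metric: perplexity is the exponential of the average negative log-probability the model assigns to each correct next token, where the model is always fed the true preceding tokens (teacher forcing). The sketch below shows that arithmetic and how a relative decrease is computed. The token probabilities are made-up illustrative numbers, not my eval data, and the function names are my own.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from natural-log probabilities of each correct next token.

    Under teacher forcing, each log-probability is conditioned on the
    ground-truth prefix rather than on the model's own earlier outputs.
    """
    if not token_log_probs:
        raise ValueError("need at least one token")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def relative_decrease(before, after):
    """Fractional drop from `before` to `after` (0.39 means 39%)."""
    return (before - after) / before


# Illustrative (made-up) numbers: the baseline model gives each correct
# token probability 0.25, the tuned model gives each one probability 0.5.
baseline_ppl = perplexity([math.log(0.25)] * 4)
tuned_ppl = perplexity([math.log(0.5)] * 4)
drop = relative_decrease(baseline_ppl, tuned_ppl)

print(f"baseline={baseline_ppl:.2f} tuned={tuned_ppl:.2f} drop={drop:.0%}")
```

Because perplexity is an exponential of the loss, a 39% decrease in perplexity corresponds to the mean per-token loss falling by -ln(0.61), or roughly 0.49 nats.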
I ultimately stood up a rough UI on top of the Hugging Face Space where I did the training so users can paste in or upload their code and try the analysis tools.
Challenges we ran into
I was super naive about what goes into training LLMs, but I have also been deeply curious about it for a long time. I was surprised that being able to run the gpt-oss:20b model on my M1 didn't mean I could train it there, and I nearly decided not to finish the project. Instead, I pivoted to validating the training pipeline itself, to demonstrate that training would lead to improved performance in this specific domain.
Accomplishments that we're proud of
Having something to submit at all. Before this project I had never trained an LLM, created a dataset, or used Hugging Face, and those are just a few of the things I managed to learn and achieve along the way. I'm really happy I stuck with it and I'm looking forward to dipping into ML and NLP more very soon!
What we learned
I expected training to be hard. I didn't realize that building the dataset is the real work.
What's next for PreFlight by FirstPass
I would absolutely love to partner with someone like the Ethereum Foundation, a DAO, or even an auditing organization to strengthen this tool, make it beautiful and easy to use, make it available to the community, and keep it maintained as vulnerabilities evolve. I would also like to explore expanding to Rust and other languages.