The idea for this project came from trying to demo a project written at a previous hackathon and realizing that, just a year or two later, it was already out of date and unusable.

Without continuous maintenance, software quickly falls behind. Expensive engineering resources are devoted to maintaining and updating legacy applications instead of developing new revenue-generating features.

What it does

Custodian uses a combination of machine learning and semantic analysis to keep code maintainable, secure, and efficient.

How we built it

The deep learning models are based on GPT-3 and the semantic analysis uses heuristic analysis of abstract syntax trees (ASTs).
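
As a rough illustration of what heuristic AST analysis can look like, the sketch below uses Python's standard `ast` module to flag functions with too many parameters or deeply nested blocks. The choice of Python as the analyzed language, the function name `find_smells`, and both thresholds are assumptions made for this example; the write-up does not say which languages or heuristics Custodian's analysis actually covers.

```python
import ast

MAX_PARAMS = 4  # hypothetical threshold, chosen for illustration
MAX_DEPTH = 3   # hypothetical threshold, chosen for illustration

# Statement types that open a new nested block.
# Note: an `elif` is parsed as an If inside `orelse`, so it counts as nesting here.
_NESTING = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def _depth(node: ast.AST, current: int = 0) -> int:
    """Return the deepest level of nested control-flow blocks under node."""
    deepest = current
    for child in ast.iter_child_nodes(node):
        step = 1 if isinstance(child, _NESTING) else 0
        deepest = max(deepest, _depth(child, current + step))
    return deepest


def find_smells(source: str) -> list[str]:
    """Parse source and report functions that trip either heuristic."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        n_params = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
        if n_params > MAX_PARAMS:
            findings.append(f"{node.name}: {n_params} parameters")
        depth = _depth(node)
        if depth > MAX_DEPTH:
            findings.append(f"{node.name}: nesting depth {depth}")
    return findings
```

Findings like these could then be passed to the language model as context for a suggested refactoring, though how the two stages are wired together is not described here.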

The tech stack combines a GitHub bot written in Node.js / Express / JavaScript / TypeScript with a web interface written in JavaScript / React / Chakra UI.

Challenges we ran into

  • Few-shot learning with GPT-3 on large inputs
  • GPT-3 sometimes gets stuck generating infinite whitespace
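
The two challenges above can be handled defensively on the client side. The sketch below shows one possible way to fit few-shot examples into a fixed prompt budget and to cut a completion off once it degenerates into a long run of whitespace. The names `build_prompt` and `trim_runaway_whitespace`, the `### Input` / `### Output` prompt layout, the character-based budget, and the `max_run` threshold are all assumptions for illustration, not Custodian's actual implementation; a real integration would count tokens rather than characters.

```python
import re


def build_prompt(examples: list[tuple[str, str]], query: str, max_chars: int) -> str:
    """Add few-shot examples, in order, until the budget would be exceeded."""
    tail = f"### Input\n{query}\n### Output\n"
    parts = []
    used = len(tail)  # the query itself is always kept whole
    for before, after in examples:
        shot = f"### Input\n{before}\n### Output\n{after}\n\n"
        if used + len(shot) > max_chars:
            break  # drop remaining examples rather than truncate the query
        parts.append(shot)
        used += len(shot)
    return "".join(parts) + tail


def trim_runaway_whitespace(completion: str, max_run: int = 20) -> str:
    """Cut the completion at the first whitespace run longer than max_run."""
    match = re.search(r"\s{%d,}" % (max_run + 1), completion)
    if match is None:
        return completion
    return completion[: match.start()]
```

Setting a stop sequence or a lower output length limit on the completion request is another common guard against runaway output.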

Accomplishments that we're proud of

Getting a semantic AST approach for refactoring to combine well with the GPT-3 deep learning model

What we learned

  • GitHub authentication/API
  • AST parsing
  • GPT-3 prompt engineering

What's next for Custodian

Continue to iterate on the model and work with interested customers.
