Inspiration

This project draws inspiration from the fact that LLMs cannot reason very well and therefore end up making mistakes. With this project, I aim to reduce the errors LLMs make when writing Python programs.

What it does

The project takes a user prompt describing what the user needs. This prompt is processed by the reasoning agent, which simplifies it by breaking it into sub-parts for easier understanding. The simplified prompt is used by the code generator and test generator agents to create code and test cases respectively. The simplified prompt, along with each test case, goes to the test case verifying agent, which checks the test case's validity. The valid test cases are then collected into a pytest file to see how many of them the generated code actually passes. The best code is then returned.

How we built it

There are two core ideas used in building this project.

Multi-Agent

The first is a multi-agent cycle where each agent (an LLM) has a specific role to play. This mirrors the human world, where specialization helps build better things but comes at a cost in speed, since communication becomes key.
The agents all use the Qwen2.5-72B model, since that is the only Qwen2.5 model available through the API in the Hugging Face playground.
Four agents were made for this task:

  1. Reasoner: Simplifies the task
  2. Code Generator: Codes the simplified tasks
  3. Test Case Generator: Creates test cases
  4. Test Case Verifier: Validates a test case

Verification

The second is to verify the generated code, to make sure what is passed back to the user actually works.
For this task, the pytest library in Python was used. The code is written to a Python file that imports pytest, and the file is then executed using Python's subprocess library.
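A self-contained sketch of this verification step is below. To avoid an extra dependency, it executes each test file directly with the Python interpreter and counts zero exit codes; the project itself writes a pytest file and runs pytest, which reports pass counts in the same spirit.

```python
import os
import subprocess
import sys
import tempfile

def count_passing(code: str, tests: list[str]) -> int:
    """Write the candidate code plus one test per run to a temp file and
    execute it in a subprocess; a zero exit code means the test passed.
    (Simplified stand-in: the project runs pytest instead.)"""
    passed = 0
    for test in tests:
        with tempfile.NamedTemporaryFile("w", suffix=".py",
                                         delete=False) as f:
            f.write(code + "\n" + test + "\n")
            path = f.name
        result = subprocess.run([sys.executable, path],
                                capture_output=True)
        os.unlink(path)
        if result.returncode == 0:
            passed += 1
    return passed

candidate = "def add(a, b):\n    return a + b"
print(count_passing(candidate,
                    ["assert add(1, 2) == 3",
                     "assert add(1, 2) == 4"]))  # 1
```

Running the tests in a subprocess also keeps a crashing candidate from taking down the main application.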

Result

The code with the best score is returned to the user.
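The selection step reduces to picking the candidate with the most passing tests. The candidate names and scores below are made up for illustration.

```python
# Hypothetical selection: each candidate's score is its number of
# passing tests; the highest-scoring code is returned to the user.
scores = {"candidate_a": 3, "candidate_b": 5, "candidate_c": 4}
best = max(scores, key=scores.get)
print(best)  # candidate_b
```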

Challenges we ran into

Design Pattern

Constantly taking data from one agent and passing it to the other agents, or to the code runner, requires a consistent design pattern to be in place.
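One way to keep those hand-offs consistent is to thread a single shared state object through every agent. The dataclass below is purely illustrative; the project's actual pattern is not shown in this writeup.

```python
# Hypothetical shared-state pattern: every agent reads from and writes
# to one structure, so the hand-off format never drifts between stages.
from dataclasses import dataclass, field

@dataclass
class PipelineState:
    user_prompt: str
    simplified: str = ""
    code: str = ""
    tests: list = field(default_factory=list)

state = PipelineState(user_prompt="Sort a list of numbers.")
state.simplified = "1. Read the list  2. Sort ascending  3. Return it"
state.tests.append("assert solution([2, 1]) == [1, 2]")
print(state.user_prompt)
```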

Additional Inputs Handling

Some parameters are not understood by the layman but may matter to seasoned experts; it took some time to figure out how to expose these as additional inputs in Gradio.

New Line Handling

As we write test cases for validation, they fail if a newline character is used, so the newline character had to be replaced with a special symbol that does not appear on the keyboard.
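A sketch of the workaround: swap newline characters for a sentinel before passing a test case between agents, and restore them before writing the file. The sentinel character used here ("¶") is an assumption, not necessarily the one the project uses.

```python
# Assumed sentinel: any character that never appears on a keyboard works.
SENTINEL = "¶"

def encode(test_case: str) -> str:
    # Make the test case safe to pass between agents.
    return test_case.replace("\n", SENTINEL)

def decode(encoded: str) -> str:
    # Restore real newlines before writing the pytest file.
    return encoded.replace(SENTINEL, "\n")

original = "x = [1,\n2]\nassert sum(x) == 3"
assert decode(encode(original)) == original
print(encode(original))
```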
Accomplishments that we're proud of

A multi-agent coding project that doesn't make noob errors.

A live project hosted on Hugging Face that can be used by anyone with an internet connection.

A project which can help people learn

It could introduce coding through natural language to those who don't know how to code, and it can help people who are stuck on a task by providing efficient solutions.

What we learned

How to make a multi-agent cycle
Handling a pytest file with just Python code; we didn't think it was possible
Adding additional inputs to Gradio

What's next for ProCODER

To improve its speed by removing API calls; this could be done by training an LLM ourselves, but that wasn't possible within the short time limit given.
Adding the ability for the user to provide some test cases which must be satisfied.

How to use it locally

git clone https://huggingface.co/spaces/eternalBlissard/ProCODER
cd ProCODER
pip install -r requirements.txt
python app.py

You also need to export your Hugging Face token:

export HFTOKEN=API_TOKEN

Built With

  • api
  • gradio
  • huggingface
  • multi-agent
  • pytest
  • python
  • qwen2.5
  • subprocess