Playful Code Review System

We’ve built a first version of a system that makes learning programming playful and engaging by turning code reviews into something closer to meme culture.

Given a Python code snippet, our system detects common errors, anti-patterns, and coding malpractices, and transforms them into contextual feedback paired with a meme and a generated message. The result is a “roast-meets-education” experience designed to make learning from mistakes more memorable and less intimidating.

The tool is currently focused on Python, but the architecture is language-extensible.


Idea

Instead of delivering just a grade, the system reimagines feedback as something social, funny, and shareable:

  • Professors can use it to give students structured yet entertaining feedback for smaller code submissions or exercises
  • Students can use it peer-to-peer to highlight issues in code in a lighthearted way
  • Feedback can be customized by tone (funny, educational, sarcastic, etc.)

How it works (high level)

1. Dataset creation (offline layer)

We built a curated dataset of coding malpractices by combining:

  • Ruff linter rules
  • Python anti-pattern repositories from GitHub
  • Bad-practice datasets and code examples

This process also involved web scraping and HTML parsing due to inconsistent formats across sources.
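
As a sketch of that parsing step, the standard library's `html.parser` can pull rule IDs and descriptions out of an HTML rules table. The table shape and the rule code below are illustrative, not the actual markup of any one source:

```python
from html.parser import HTMLParser

class RuleTableParser(HTMLParser):
    """Collects (rule_id, description) pairs from the <td> cells of a rules table."""
    def __init__(self):
        super().__init__()
        self._in_td = False
        self._cells = []
        self.rules = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False
        elif tag == "tr" and len(self._cells) >= 2:
            # A finished row with at least an ID and a description becomes one rule.
            self.rules.append((self._cells[0], self._cells[1]))
            self._cells = []
        elif tag == "tr":
            self._cells = []

    def handle_data(self, data):
        if self._in_td and data.strip():
            self._cells.append(data.strip())

# Illustrative input; real sources had inconsistent formats around this core shape.
page = "<table><tr><td>E501</td><td>line too long</td></tr></table>"
parser = RuleTableParser()
parser.feed(page)
print(parser.rules)  # [('E501', 'line too long')]
```

In practice each source needed its own small parser like this, since the surrounding markup varied.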

On top of that, we added custom “fun” rules to make the system more relatable (e.g., discouraging meaningless print statements like "here", overuse of emojis in formal code, or confusing variable reuse).
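
One such custom rule can be sketched as a small AST check. The word list and function name here are illustrative, not the production rule set:

```python
import ast

# Hypothetical list of throwaway debug words; the real rule set differs.
DEBUG_WORDS = {"here", "test", "asdf", "hi"}

def find_meaningless_prints(source: str) -> list[int]:
    """Return line numbers of print calls whose only argument is a throwaway debug word."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "print"
                and len(node.args) == 1
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)
                and node.args[0].value.strip().lower() in DEBUG_WORDS):
            hits.append(node.lineno)
    return hits

code = 'x = 1\nprint("here")\nprint("result:", x)\n'
print(find_meaningless_prints(code))  # [2]
```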


2. Two-layer detection system

We experimented with multiple approaches:

  • Static + rule-based analysis
    Using tools like Ruff and structured analyzers to detect standard issues.

  • LLM-based reasoning model
    An LLM prompt-tuned on our curated malpractice dataset. This approach proved more robust and flexible.

Overall, the LLM-based detector achieved stronger coverage and better generalization.
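
The two layers can be combined roughly as below. Both layer functions are placeholders standing in for Ruff and the prompt-tuned LLM; the rule codes and merge logic are illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Issue:
    rule: str
    line: int
    message: str

def rule_based_layer(source: str) -> list[Issue]:
    # Placeholder: the real system wraps Ruff and structured analyzers here.
    issues = []
    for i, line in enumerate(source.splitlines(), start=1):
        if "== True" in line:
            issues.append(Issue("E712", i, "avoid comparison to True"))
    return issues

def llm_layer(source: str) -> list[Issue]:
    # Placeholder: the real system queries the prompt-tuned LLM here.
    return [Issue("FUN001", 2, "meaningless print statement")]

def detect(source: str) -> list[Issue]:
    """Merge both layers, deduplicating by (rule, line)."""
    seen, merged = set(), []
    for issue in rule_based_layer(source) + llm_layer(source):
        key = (issue.rule, issue.line)
        if key not in seen:
            seen.add(key)
            merged.append(issue)
    return merged

sample = 'if flag == True:\n    print("here")\n'
for issue in detect(sample):
    print(issue.rule, issue.line, issue.message)
```

Deduplicating on (rule, line) lets the LLM layer add findings the static layer missed without double-reporting overlaps.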


3. Meme retrieval system (semantic matching)

Once issues are detected, we map them to memes using:

  • FAISS vector search index
  • Sentence Transformers (Hugging Face embeddings, specifically all-MiniLM-L6-v2)

Each meme is encoded as a vector, and the detected issue acts as a query. A KNN-style retrieval is performed to select the most appropriate meme for the situation.

Mathematically, retrieval can be expressed as:

$$ \text{meme}^* = \arg\min_{m \in M} \; d\big(f(\text{issue}), f(m)\big) $$

where:

  • f is the embedding function
  • d is a distance metric (cosine distance)
  • M is the meme dataset
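
In the running system this argmin is computed with FAISS over all-MiniLM-L6-v2 embeddings; the toy stand-in below shows the same nearest-neighbor selection with plain cosine distance and hand-made 2-D vectors:

```python
import math

def cosine_distance(u, v):
    """d(u, v) = 1 - cos(u, v), the distance used in the retrieval formula."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def retrieve_meme(issue_vec, meme_vecs):
    """meme* = argmin over m of d(f(issue), f(m)); returns the index of the nearest meme."""
    return min(range(len(meme_vecs)), key=lambda i: cosine_distance(issue_vec, meme_vecs[i]))

# Toy 2-D embeddings standing in for real all-MiniLM-L6-v2 vectors.
memes = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(retrieve_meme([0.9, 0.1], memes))  # 0
```

FAISS performs the same search, but over an index of thousands of high-dimensional vectors rather than a linear scan.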

4. Meme + humor generation layer

Finally, we feed:

  • the detected issue
  • the selected meme
  • user-selected tone (funny / educational / sarcastic / etc.)

into an LLM that generates a contextual joke or explanation.

A small randomness factor ensures variety and avoids repetitive outputs.
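
A minimal sketch of the prompt-assembly step, assuming hypothetical tone presets and a seeded random opener for variety; the real prompts and the LLM call itself are not shown:

```python
import random

# Hypothetical tone presets; the production prompts differ.
TONE_STYLES = {
    "funny": "Write a short, good-natured joke about the mistake.",
    "educational": "Explain the mistake clearly, with a light touch.",
    "sarcastic": "Deliver a dry, sarcastic one-liner about the mistake.",
}

def build_prompt(issue: str, meme_caption: str, tone: str, seed=None) -> str:
    """Assemble the LLM prompt; a randomly chosen opener keeps outputs from repeating."""
    rng = random.Random(seed)
    opener = rng.choice(["Oh no.", "Well, well.", "Classic."])  # small randomness factor
    return (
        f"{opener} {TONE_STYLES[tone]}\n"
        f"Detected issue: {issue}\n"
        f"Meme context: {meme_caption}"
    )

print(build_prompt("mutable default argument", "This is fine.", "sarcastic", seed=0))
```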


Why it matters

This project turns code feedback into something:

  • more engaging for learners
  • more expressive for teachers
  • more social and shareable for peers

Instead of just saying “this is wrong,” we say it in a way that sticks.
