About CodeGenius Notebooks

CodeGenius is an AI-powered platform designed to dramatically lower the barrier to entry for data analysis and machine learning. It transforms a user's raw data (CSV) and a simple goal described in plain English into a fully functional, well-structured Python notebook.

The Inspiration

The idea for CodeGenius was born from a simple observation: countless professionals and students possess valuable data but lack the coding expertise or time to unlock its potential. I saw:

  • Business analysts like "Dina," who understand their data but struggle with the syntax of libraries like scikit-learn.
  • Students like "Bima," who need practical examples of how data science workflows are structured in the real world.
  • Even seasoned data scientists like "Rian," who spend countless hours on the repetitive, boilerplate tasks of data loading, cleaning, and initial exploration for every new project.

My goal was to build a bridge over this technical gap—to create a tool that automates the mundane, educates the curious, and empowers the domain expert.

How It Was Built: The Tech Stack

I architected CodeGenius as a modern, full-stack web application with a clear separation of concerns:

  • Frontend: Built with Next.js (React) for a fast, responsive, and interactive user interface. The notebook display itself uses a lightweight code editor library to feel familiar to users of Jupyter or Colab.
  • Backend: Powered by Python and the FastAPI framework, chosen for its high performance and native async support, which is perfect for handling API calls to the AI model.
  • The AI Core: The "magic" is driven by a powerful Large Language Model (LLM) like OpenAI's GPT-4o or Anthropic's Claude 3. The key is not just the model, but the sophisticated prompt engineering behind it.

The workflow is as follows:

  1. A user uploads a CSV file and writes their objective.
  2. The Python backend analyzes the CSV's schema (column names, data types).
  3. A detailed, structured prompt is engineered. This prompt includes the system's role, the data schema, the user's goal, and strict instructions for the output format.
  4. The LLM processes this context and returns a structured JSON object containing an array of notebook cells (each with its type, 'code' or 'markdown', and its content).
  5. The Next.js frontend receives this JSON and renders it into a clean, readable, and interactive notebook.

Key Challenges & Solutions

  1. Ensuring Code Reliability: LLMs can be unpredictable and sometimes generate buggy code.

    • Solution: I mitigated this through rigorous prompt engineering. By providing the exact data schema and clear, step-by-step instructions, I guide the AI to produce more accurate and relevant code. I also structured the output to be modular (cell by cell), making it easier to debug or modify one part without breaking the whole notebook.
  2. Simplifying the User Experience: The backend process is complex, but the user-facing experience had to be effortless.

    • Solution: I designed the UI to have a single, clear user journey: Upload -> Describe -> Generate. There are no complex settings in the MVP. The goal was to make the power of a complex system accessible through a single click.
  3. Defining the MVP Scope: The temptation to add features like live code execution or advanced customization was huge.

    • Solution: I strictly adhered to the core value proposition for the MVP: generating a high-quality, downloadable .ipynb notebook. This ensured I could launch with a stable, useful product and validate the core idea before investing in more complex backend infrastructure like execution kernels.

What I Learned

This project was a profound learning experience in several areas:

  • The Power of Prompt Engineering: I learned that working with modern LLMs is less about programming and more about being an effective "teacher" or "director." The quality of the output is a direct reflection of the quality and clarity of the input prompt.
  • One Product, Multiple "Jobs-to-be-Done": I realized that CodeGenius serves different fundamental needs for each user persona. For the analyst, it's an empowerment tool. For the student, it's a learning tool. For the data scientist, it's an efficiency tool. Designing for all three was a fascinating UX challenge.
  • The Future is Generative and Accessible: Building this project solidified my belief that the next wave of software is about abstracting away technical complexity. The future lies in tools that allow users to focus on their "what" (their goal) while the AI handles the "how" (the implementation).

Built With

Share this project:

Updates