Project Story

Inspiration

As Data Science majors, we realized early on that we learned machine learning concepts much more effectively when we were genuinely interested in the data or project theme. While theoretical learning has its place, we found that applying algorithms to projects that aligned with our interests not only made learning more enjoyable but also deepened our understanding of complex concepts.

Most machine learning exercises use generic datasets that don’t resonate with learners’ individual passions. This lack of personalization can make it harder for students and professionals alike to stay engaged and motivated. That’s what inspired us to build the ML Notebook Generator, a tool that creates a personalized learning experience in which users choose both the algorithm and a project theme that interests them. By making the data and the project relevant, we aim to help others experience the same motivation and deeper learning that we did when working on meaningful projects.

How we built it

  • AI Model: We fine-tuned the Llama 3.1 8B Instruct model with LoRA using Unsloth. We loaded the model with 4-bit quantization to reduce memory usage while maintaining strong performance on data generation tasks.
  • Backend: We used Flask to handle user input and manage the entire notebook generation pipeline. SQLite stores user preferences and project themes.
  • Frontend: Built with ReactJS and styled with Tailwind CSS, the frontend provides a simple, intuitive interface for users to select algorithms and input project themes. The notebooks are then generated and made available for download as .ipynb files.
  • Notebook Assembly: We used the Python libraries nbformat and re to turn the model output into executable code cells and markdown explanations, assembling them into complete Jupyter notebooks.
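
The notebook assembly step can be sketched roughly as follows. This is a minimal illustration, not our actual pipeline: it assumes the model returns prose with fenced Python blocks, and it builds the nbformat v4 JSON by hand with the standard library so the sketch has no dependencies. The function names and the sample text are hypothetical.

```python
import json
import re

FENCE = "`" * 3  # triple backtick, built programmatically for readability

# Matches one fenced Python block and captures its body.
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def markdown_cell(text):
    """Return a markdown cell in nbformat v4 layout."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(text):
    """Return an unexecuted code cell in nbformat v4 layout."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": text,
    }


def split_into_cells(generated_text):
    """Turn prose into markdown cells and fenced blocks into code cells."""
    cells = []
    position = 0
    for match in CODE_BLOCK.finditer(generated_text):
        prose = generated_text[position:match.start()].strip()
        if prose:
            cells.append(markdown_cell(prose))
        cells.append(code_cell(match.group(1).strip()))
        position = match.end()
    trailing = generated_text[position:].strip()
    if trailing:
        cells.append(markdown_cell(trailing))
    return cells


def build_notebook(generated_text):
    """Wrap the cells in a minimal nbformat v4 notebook dictionary."""
    return {
        "cells": split_into_cells(generated_text),
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


# Hypothetical model output for a made-up theme.
sample = (
    "# Linear regression on coffee sales\n"
    + FENCE + "python\nprint('hello')\n" + FENCE + "\n"
    + "The cell above prints a greeting."
)
notebook = build_notebook(sample)
notebook_json = json.dumps(notebook, indent=1)  # contents of the .ipynb file
```

With nbformat installed, the hand-built dictionaries could be replaced by its v4 helpers (new_notebook, new_markdown_cell, new_code_cell), which also fill in defaults and allow validation.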

Challenges we ran into

  • Model Optimization and Scalability: With limited funds and limited access to GPUs, we needed to optimize our model for both performance and scalability. To do so, we fine-tuned a smaller model with 4-bit quantization, which reduced memory usage while still delivering high-quality results.

  • Data Relevance: Generating datasets that were not only realistic but also aligned with the user’s chosen project theme was complex. Ensuring that the synthetic data worked well with the selected algorithm while maintaining its practical value was a key hurdle we had to overcome.
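
To give a sense of why 4-bit quantization mattered under these constraints, here is a back-of-the-envelope estimate of weight memory. It is only a rough sketch: it assumes a nominal 8 billion parameters and counts the weights alone, ignoring activations, LoRA adapters, optimizer state, and quantization metadata, so real usage will be higher.

```python
# Rough weight-only memory estimate for an "8B" model (assumed parameter count).
PARAMETERS = 8e9


def weight_memory_gb(num_parameters, bits_per_weight):
    """Memory needed to hold the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * bits_per_weight / 8 / 1e9


fp16_gb = weight_memory_gb(PARAMETERS, 16)  # 16-bit weights
int4_gb = weight_memory_gb(PARAMETERS, 4)   # 4-bit weights

print(f"16-bit weights: {fp16_gb:.0f} GB")
print(f"4-bit weights:  {int4_gb:.0f} GB")
```

Under these assumptions the weights shrink from about 16 GB to about 4 GB, a fourfold reduction.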

What we learned

  • Efficient Model Optimization: We learned how to optimize large language models for both performance and scalability, despite our resource constraints. By using Unsloth and implementing LoRA and 4-bit quantization, we were able to deliver powerful AI capabilities with efficient resource usage. This balance between performance and efficiency was crucial to the project’s success.

  • Full-Stack Development: This project enhanced our skills in full-stack development. We successfully integrated a backend built on Flask, a SQLite database for efficient data management, and a frontend created using ReactJS and Tailwind CSS. Building a seamless, user-friendly interface while managing complex machine learning tasks, along with handling user data efficiently, gave us a comprehensive understanding of full-stack development.
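
As an illustration of the SQLite side of the stack, the sketch below stores each request's algorithm and theme using Python's built-in sqlite3 module. The table name, columns, and helper functions are assumptions made for this example, not the project's actual schema.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) requests table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS notebook_requests (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            algorithm TEXT NOT NULL,
            theme TEXT NOT NULL,
            created_at TEXT DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    return conn


def save_request(conn, algorithm, theme):
    """Insert one request and return its row id."""
    cursor = conn.execute(
        "INSERT INTO notebook_requests (algorithm, theme) VALUES (?, ?)",
        (algorithm, theme),  # parameterized to avoid SQL injection
    )
    conn.commit()
    return cursor.lastrowid


def recent_themes(conn, limit=5):
    """Return the most recently saved themes, newest first."""
    rows = conn.execute(
        "SELECT theme FROM notebook_requests ORDER BY id DESC LIMIT ?",
        (limit,),
    ).fetchall()
    return [row[0] for row in rows]


# Example usage with made-up values.
conn = open_store()
first_id = save_request(conn, "linear_regression", "coffee sales")
save_request(conn, "linear_regression", "basketball stats")
themes = recent_themes(conn)
```

A Flask route could call save_request with the values submitted from the frontend before starting notebook generation.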

During this project, we not only enhanced our technical expertise but also learned how to navigate the practical challenges of developing scalable, efficient, and user-focused machine learning applications.

What's next for ML Notebook Generator

Looking ahead, we have an ambitious roadmap for the ML Notebook Generator. Future developments will include:

  • Expanding Algorithm Options: We plan to add classification algorithms (e.g., Random Forests, Support Vector Machines), clustering techniques (e.g., K-Means), dimensionality reduction methods (e.g., PCA), and advanced regression models (e.g., Ridge, Lasso).
  • Neural Networks: We aim to introduce deep learning architectures to further broaden the scope of the tool.
  • User Feedback: Implementing a feedback mechanism will help us continuously improve the quality of the generated notebooks.
  • Integration with Learning Platforms: We’re also exploring partnerships or integrations with online learning platforms to make our tool accessible to a broader audience.