Inspiration

SoC SPOH DAAA Diploma Station showcase: Enhancing YOLO Object Detection with Generative AI

Live Object Detection with YOLO

At several past SP Open Houses, our Diploma in Applied AI & Analytics (DAAA) station showcased live object detection using the YOLO (You Only Look Once) model.

The showcase typically started with students describing what they had learned about image classification in ST1504 Deep Learning.

DAAA students excitedly sharing their Deep Learning CA1 project on image classification using a CNN.

While what they learned was classification on static images, real-world applications often require detecting multiple objects in real-time video streams.

Real-time object detection using a YOLO model, such as a self-driving car detecting pedestrians and traffic signs.

Next, students would share that their Deep Learning module served as a strong foundation for some advanced computer vision tasks. They would conclude with a live demonstration of a YOLO model.

Example of a live object detection demo using a YOLO model.

Source: https://stackabuse.com/real-time-object-detection-inference-in-python-with-yolov7

What it does

Inclusion of a new Generative AI with Large Language Models module in our DAAA curriculum

With the recent advancements in Generative AI and Large Language Models (LLMs), our DAAA curriculum has been updated to include a new module, Generative AI with Large Language Models. It will be rolled out as an elective initially and subsequently integrated into our core curriculum.

This opens up some intriguing ideas for further enhancing the YOLO object detection demo at the DAAA station with Generative AI capabilities.

Real-time captioning of detected objects using LLMs. Given a video (e.g. a sports match) as input, YOLO first detects the objects, and the video can then be passed to an LLM to automatically generate commentary. Think about the exhilarating possibilities! ☕ Our DAAA SPOH showcase may wow prospective students and parents while also demonstrating the cutting-edge skills our DAAA students acquire!
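The hand-off from detector to LLM could look roughly like the sketch below. It is only an illustration under stated assumptions: instead of passing the video itself, it summarises each frame's detections as text, and the `Detection` class, `build_commentary_prompt` function, and the 0.5 confidence threshold are all hypothetical names and values, not part of YOLO or of the actual project code. The resulting prompt would then be sent to whichever LLM is in use.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object in a frame (hypothetical structure, not a YOLO API type)."""
    label: str
    confidence: float


def build_commentary_prompt(detections, min_confidence=0.5):
    """Summarise one frame's detections as a text prompt for an LLM."""
    # Keep only detections the detector is reasonably sure about (assumed threshold).
    kept = [d for d in detections if d.confidence >= min_confidence]
    if not kept:
        return "No objects were detected in this frame. Describe the quiet scene in one sentence."
    # Count objects per label, e.g. {"person": 2, "sports ball": 1}.
    counts = Counter(d.label for d in kept)
    # Sort by label so the same frame always yields the same prompt.
    summary = ", ".join(f"{n} x {label}" for label, n in sorted(counts.items()))
    return f"Objects detected in this frame: {summary}. Write one sentence of live commentary."


# Made-up detections for a single frame of a sports match.
frame = [
    Detection("person", 0.91),
    Detection("person", 0.84),
    Detection("sports ball", 0.77),
    Detection("dog", 0.30),  # below the threshold, so it is dropped
]
print(build_commentary_prompt(frame))
```

Summarising detections as text keeps the prompt small, which matters when commentary has to keep up with a live stream.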

Potential application

A few years ago, Dr Wilson Qiu supervised an IdeaFarm project in which DAAA students developed a YOLO model for the security cameras in our DAAA AI and Analytical Colab to detect food being consumed inside.

DAAA students developed a YOLO model to detect food being consumed inside the DAAA AI and Analytical Colab, as part of an IdeaFarm project supervised by Dr Wilson Qiu.

How we built it

One possible application of enhancing YOLO with Generative AI is to generate real-time reports that notify staff of what food was brought in, with a timestamp and other details. This can help with monitoring and ensuring compliance with the DAAA AI and Analytical Colab's food policies through efficient automated reporting!
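As a rough sketch of the reporting idea, the snippet below turns a list of detections into a single timestamped report line. Everything named here is an assumption for illustration: the `FOOD_LABELS` set, the `food_report` function, the `location` text, the 0.6 threshold, and the sample date are hypothetical and not taken from the project. In the full system, a line like this could be handed to an LLM to compose a fuller notification.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical labels treated as food; a real deployment would use the
# classes the detection model was actually trained on.
FOOD_LABELS = {"sandwich", "pizza", "banana", "cup"}


@dataclass
class Detection:
    """One detected object (hypothetical structure, not a YOLO API type)."""
    label: str
    confidence: float


def food_report(detections, seen_at, location="lab camera", min_confidence=0.6):
    """Return a one-line timestamped report, or None if no food was detected."""
    # Collect distinct food labels that pass the (assumed) confidence threshold.
    foods = sorted(
        {d.label for d in detections
         if d.label in FOOD_LABELS and d.confidence >= min_confidence}
    )
    if not foods:
        return None
    stamp = seen_at.strftime("%Y-%m-%d %H:%M:%S")
    return f"[{stamp}] {location}: food detected ({', '.join(foods)})"


# Made-up detections and timestamp for illustration only.
detections = [
    Detection("person", 0.95),
    Detection("pizza", 0.88),
    Detection("banana", 0.41),  # too uncertain, so it is left out of the report
]
print(food_report(detections, datetime(2025, 1, 10, 14, 30, 0)))
```

Returning `None` when nothing qualifies lets the caller skip the notification entirely instead of sending empty reports.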

Challenges we ran into

Fine-tuning the model for the best performance, and setting up the environment.

Accomplishments that we're proud of

It works at least!

What we learned

YOLO, LLMs, and TTS (text-to-speech).

Application Link

https://yolo-llm-acg.streamlit.app/

User Name: SoC

Password: 2025
