Our writeup: https://docs.google.com/document/d/1kZUSSNWma_IBPRwifayx-wgmhBACKGF7YMtgWP3QozU/edit?usp=sharing If you have trouble opening it, please reach out to naicheng_he@brown.edu.
Inspiration
Our work is largely inspired by the paper "HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face" (https://arxiv.org/abs/2303.17580). The idea of a large language model acting as the controller of a larger system intrigued us. We envisioned an application in which natural language commands direct a car's movement, with one caveat: the system must also consider real-time camera input so that traffic rules are not violated.
What It Does
Our system interprets commands such as "move to the left", "move to the right", and "stop", and concurrently analyses the camera input to ensure adherence to traffic regulations. It then controls the vehicle autonomously and generates an output sentence based on the initial command.
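As a rough illustration of the behaviour described above, the sketch below checks a text command against the most recently detected traffic sign before choosing an action and a reply sentence. The command strings come from the text; the sign labels (`stop_sign`, `no_left_turn`, `no_right_turn`), the action names, the `decide` function, and the blocking rules are all assumptions made for illustration, not the project's actual logic.

```python
from typing import Optional, Tuple

# Hypothetical rule table: which detected sign forbids which action.
BLOCKED_BY_SIGN = {
    "stop_sign": {"move_left", "move_right"},
    "no_left_turn": {"move_left"},
    "no_right_turn": {"move_right"},
}

# Commands quoted in the text, mapped to hypothetical action names.
COMMANDS = {
    "move to the left": "move_left",
    "move to the right": "move_right",
    "stop": "stop",
}


def decide(command: str, sign: Optional[str]) -> Tuple[str, str]:
    """Return (action, reply) for a text command and the current sign, if any."""
    text = command.strip().lower()
    action = COMMANDS.get(text)
    if action is None:
        # Unknown commands fall back to the safe action.
        return "stop", f"Sorry, I did not understand '{command}'."
    if sign is not None and action in BLOCKED_BY_SIGN.get(sign, set()):
        readable_sign = sign.replace("_", " ")
        return "stop", f"I cannot {text} because of the {readable_sign}."
    return action, f"OK, executing: {text}."
```

For example, `decide("move to the left", "no_left_turn")` yields the `stop` action with an explanatory sentence, while the same command with no sign in view passes through unchanged. In the real system the reply would presumably come from the language model rather than from fixed templates.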
How We Built It
We built our system on the ROSMASTER car model. It consists of three parts: a Convolutional Neural Network (CNN) classifier for traffic sign recognition, a CNN for line following, and a Natural Language Processing (NLP) unit that handles prompts and image data. A central controller orchestrates all three.
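One way to picture this architecture is a controller that holds the three parts as interchangeable callables and combines their outputs once per camera frame. This is a minimal sketch under stated assumptions: the `Controller` class, its field names, the `step` method, and the returned dictionary are hypothetical, and simple lambdas stand in for the trained models so the example runs without the car.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple

Frame = Any  # placeholder for a camera image (e.g. a tensor or array)


@dataclass
class Controller:
    """Hypothetical central controller wiring the three parts together."""

    # CNN traffic sign classifier: frame -> sign label, or None if no sign.
    classify_sign: Callable[[Frame], Optional[str]]
    # CNN line follower: frame -> steering value.
    follow_line: Callable[[Frame], float]
    # NLP unit: (prompt, sign) -> (action, reply sentence).
    language_unit: Callable[[str, Optional[str]], Tuple[str, str]]

    def step(self, frame: Frame, prompt: str) -> dict:
        sign = self.classify_sign(frame)
        action, reply = self.language_unit(prompt, sign)
        # Only consult the line follower when the car is allowed to move.
        steering = 0.0 if action == "stop" else self.follow_line(frame)
        return {"sign": sign, "action": action, "steering": steering, "reply": reply}


def fake_language_unit(prompt: str, sign: Optional[str]) -> Tuple[str, str]:
    # Stand-in for the NLP unit: refuse to move when a stop sign is seen.
    if sign == "stop_sign":
        return "stop", "Stopping for the stop sign."
    return "forward", f"Carrying out: {prompt}."


controller = Controller(
    classify_sign=lambda frame: "stop_sign",  # stand-in classifier
    follow_line=lambda frame: 0.1,            # stand-in line follower
    language_unit=fake_language_unit,
)
result = controller.step(frame=None, prompt="move to the left")
```

Keeping the parts behind plain callables means each model can be swapped or tested on its own, which is one plausible reading of "orchestrated by a central controller"; the actual control loop on the car may be organized differently.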
Challenges We Faced
The journey was not without its obstacles. We grappled with the nuances of the Linux system, the idiosyncrasies of the PyTorch framework, and the complexities of bridging hardware and software elements.
Our Accomplishments
Our greatest achievement lies in the successful integration and functionality of all the system components. We take pride in our innovative approach of utilizing a large language model to govern a multi-modal system.
What We Learned
This project was a treasure trove of learning experiences. We honed our skills in PyTorch programming, gained valuable insights into multi-modal coordination, and navigated the complexities of programming within the Linux environment. Moreover, we bridged the gap between hardware and software elements, a vital skill in the evolving technological landscape.
What's Next for NLP Centralized System for Auto-Piloting
We firmly believe that an intelligent NLP controller has the potential to rival traditional reinforcement learning methods in auto-piloting. Next, we plan to refine our model so that it understands a broader set of traffic rules, processes input faster, and is more secure.