Inspiration
An educational robot that gives you an ecological tour of your surrounding ecosystem.
What it does
RoboRanger is a hand-held robot with a camera and earphones. Point it at an animal or a flower and it will tell you, with audio, specific facts about how it lives (diet, behavior, size, lifespan, distinctive traits), and tie those facts to the current location's ecosystem and season. It will also educate you about specific, concrete threats facing this species. Point it at another species, and it can tell you how that species interacts with what you've already seen so far in the tour. You can speak to it, interrupt it mid-lecture, ask it questions. It remembers what it has said, it remembers what you have told it.
How we built it
A camera, a custom vision model, and an AI agent.
At initialization, RoboRanger gets the current latitude and longitude, reverse geocodes them to get the name of the location, and records the current datetime (and thus the season). At the press of a button, it captures what you are looking at and classifies the species using a quantized vision model hosted locally on the Arduino UNO Q.
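Here is a minimal sketch of that session-context setup, assuming geopy for the reverse geocoding; get_gps_fix() is a hypothetical helper standing in for the actual GPS read:

```python
# Sketch of RoboRanger's session-context setup, assuming geopy.
from datetime import datetime
from geopy.geocoders import Nominatim

def season_for(month: int) -> str:
    # Northern-hemisphere meteorological seasons
    return {12: "winter", 1: "winter", 2: "winter",
            3: "spring", 4: "spring", 5: "spring",
            6: "summer", 7: "summer", 8: "summer",
            9: "fall", 10: "fall", 11: "fall"}[month]

def build_session_context():
    lat, lon = get_gps_fix()  # hypothetical helper: read from the device's GPS
    geocoder = Nominatim(user_agent="roboranger")
    place = geocoder.reverse((lat, lon))
    now = datetime.now()
    return {
        "location": place.address if place else f"{lat:.4f}, {lon:.4f}",
        "datetime": now.isoformat(),
        "season": season_for(now.month),
    }
```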
We trained our vision model on iNaturalist's dataset of pictures and species labels. We added a linear head to a pre-trained DINOv2 vision model: DINOv2 has already learned general visual features during its pretraining, so we only needed to train a linear head to specialize it for classifying species from images. We sped up training using NVIDIA's Brev cloud platform. One challenge was that the model classified generic scenery images as animals, probably because many iNaturalist animal pictures have natural scenery in the background. We solved this by training on an additional dataset of generic scenery: we took pictures of different places around La Jolla, combined them with a scenery image dataset from online, and labeled all of it as "scenery". After that, the model classifies correctly when there is no flora or fauna in front of it.
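A minimal sketch of that setup in PyTorch, assuming the torch.hub DINOv2 release; the class count and hyperparameters below are placeholders:

```python
# Sketch of the classifier: a frozen DINOv2 backbone + a trainable linear head.
import torch
import torch.nn as nn

backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False  # keep the pretrained features fixed

NUM_CLASSES = 101  # species classes + 1 "scenery" class (placeholder count)
head = nn.Linear(backbone.embed_dim, NUM_CLASSES)

optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    with torch.no_grad():
        features = backbone(images)  # (B, embed_dim) CLS embeddings
    logits = head(features)
    loss = criterion(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```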
The classification result is then fed into a "Ranger" AI agent powered by Claude Haiku 4.5. We gave it a system prompt to deliver a two-part tour: first general facts about the species (its diet, behavior, and interactions with the ecosystem), then the environmental problems threatening it. The agent takes the current location and season from the session context into account on every turn. It is a multi-turn agent with a memory layer (AKA a chatbot): it remembers which species you have seen so far in the tour (a hash table lookup) and connects its lectures on new species to species you have seen before. It is voiced using ElevenLabs.
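A sketch of the agent loop, assuming the official anthropic Python SDK; the model id and prompt wording below are placeholders, and the seen_species dict is the hash-table memory mentioned above:

```python
# Sketch of the Ranger agent's multi-turn lecture loop.
import anthropic

client = anthropic.Anthropic()
SYSTEM_PROMPT = (
    "You are RoboRanger, an ecological tour guide. Give a two-part lecture: "
    "first the species' diet, behavior, size, lifespan, and role in the local "
    "ecosystem; then the concrete environmental threats it faces."
)

seen_species = {}   # hash table: species -> context when first seen
history = []        # running multi-turn conversation

def lecture_on(species: str, context: dict) -> str:
    prior = ", ".join(seen_species) or "none yet"
    user_msg = (
        f"Species spotted: {species}. Location: {context['location']}. "
        f"Season: {context['season']}. Species seen earlier this tour: {prior}. "
        "Connect this lecture to the earlier sightings where relevant."
    )
    history.append({"role": "user", "content": user_msg})
    reply = client.messages.create(
        model="claude-haiku-4-5",  # placeholder model id
        max_tokens=1024,
        system=SYSTEM_PROMPT,
        messages=history,
    )
    text = reply.content[0].text
    history.append({"role": "assistant", "content": text})
    seen_species[species] = context
    return text
```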
Challenges we ran into
One big problem was fitting our vision model onto the Arduino UNO Q and making it fast enough during user sessions. We fixed this by quantizing the model (see the sketch below). The other big challenge was the scenery misclassification issue described above, which we solved with the additional scenery dataset.
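We haven't spelled out the exact quantization recipe here, but a minimal sketch, assuming PyTorch dynamic quantization of the Linear layers (where most of a ViT's parameters live), could look like this; names follow the training sketch above:

```python
# Sketch of the shrink-to-fit step: int8-quantize the Linear layers so the
# model is smaller and faster on the UNO Q's CPU.
import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
model = nn.Sequential(backbone, nn.Linear(backbone.embed_dim, 101))  # placeholder class count
model.eval()

quantized = quantize_dynamic(
    model,
    {nn.Linear},        # layer types to replace with int8 versions
    dtype=torch.qint8,
)
torch.save(quantized.state_dict(), "roboranger_classifier_int8.pt")
```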
Accomplishments that we're proud of
Getting a DINOv2-based vision model to run on an Arduino UNO Q.
What we learned
Firmware, hardware debugging, and the many pains of doing hardware.

