Identified Problem
It's estimated that bike theft in North America results in more than $500 million in losses each year. It's also estimated that a bike is stolen every 30 seconds and that theft affects nearly 2 million cyclists [1]. Bikes, unlike other forms of transportation, don't require a registration number, and as such are easy to resell in online marketplaces such as eBay, Craigslist, Kijiji, and Facebook [1].
The best tools to combat bike theft are proactive: locks, alarms, keeping your bike inside, and anything else that prevents the theft from happening. Once a bike is stolen, several reactive tools and services exist to help locate and recover it. These include hidden GPS trackers, law enforcement, and bike registries such as project529 and BikeIndex, where stolen bikes can be cross-listed with found bikes.
None of the current reactive solutions provides an efficient way to locate a stolen bike across the wide array of online marketplaces. As such, the process of finding a stolen bike relies heavily on manual search and online bike forums, where users can ask other users to "keep an eye out".
[1] https://project529.com/garage/org_faq/en/fighting%20bike%20theft/background-on-bike-theft/
Proposed Solution
We want to help fill the gap in stolen bike searchability by creating a bike theft assistant that helps a victim search for their bike across multiple online marketplaces using an image of the bike and textual metadata. If the bike is found, the assistant provides guidance on actions to take, such as alerting the online marketplace that the bike is stolen or contacting the lister.
This tool differs from current search solutions in that we use vector embeddings of images, along with textual metadata, to filter and find similar ads listed in online marketplaces based on vector similarity. Because the match depends on what the bike looks like rather than on the words a seller chose for the ad, we expect it to be a more efficient way to find a stolen bike than keyword search.
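To make the idea concrete, here is a minimal sketch of filtering by metadata and then ranking by vector similarity. It is an illustration only, not the app's actual code: the `Listing` class, the `find_similar` function, the example URLs, and the three-dimensional embeddings are all hypothetical, and a plain Python list stands in for the vector store.

```python
import math
from dataclasses import dataclass


@dataclass
class Listing:
    """A marketplace ad. All fields are hypothetical stand-ins."""
    url: str
    embedding: list   # image embedding of the ad photo (list of floats)
    metadata: dict    # textual metadata, e.g. {"brand": "trek"}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar(query_embedding, listings, filters, top_k=3):
    """Keep listings whose metadata matches every filter, then rank by similarity."""
    candidates = [
        listing for listing in listings
        if all(listing.metadata.get(key) == value for key, value in filters.items())
    ]
    scored = [
        (cosine_similarity(query_embedding, listing.embedding), listing)
        for listing in candidates
    ]
    # Sort on the score only, highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy data: real image embeddings would have hundreds of dimensions.
listings = [
    Listing("https://example.com/ad/1", [0.9, 0.1, 0.0], {"brand": "trek"}),
    Listing("https://example.com/ad/2", [0.0, 1.0, 0.0], {"brand": "trek"}),
    Listing("https://example.com/ad/3", [1.0, 0.0, 0.0], {"brand": "giant"}),
]
query = [1.0, 0.0, 0.0]  # embedding of the victim's photo (made up)
results = find_similar(query, listings, {"brand": "trek"}, top_k=2)
for score, listing in results:
    print(f"{score:.3f}  {listing.url}")
```

In this toy run the third ad is dropped by the brand filter even though its embedding matches the query exactly, which shows how the metadata narrows the candidates before similarity is scored.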
How we built it
The reference architecture diagram below shows the technologies used. For reference, we have also listed them here:
- Pinecone: used to store vector embeddings of pictures from ads on Bike Index
- LangChain: used to orchestrate the chat-like features of the app
- OpenAI: used for our primary LLMs
- Hugging Face: used for image embedding and image-to-text; we also deployed the app to Hugging Face Spaces
- Amazon S3: used to store the user-uploaded image
- Python: language of choice for the app
- Streamlit: used for the frontend
Challenges we ran into
- We were unable to create Hugging Face inference endpoints
- Maintaining context across LangChain agents and chains
- Deploying to Streamlit Cloud/EC2, where we ran into compute constraints
Accomplishments that we're proud of
- We built a 'working' product and we stuck to our initial scope and design
- We built a framework that can be easily extended, for example to include more data sources
What we learned
- LangChain can be incredibly powerful if you learn to use it properly
- We learned a lot about LangChain agents and when they should be used (i.e., tool selection and orchestration)
- We learned how to better leverage Hugging Face (i.e., hosting apps and building endpoints for models)
What's next for Find My Stolen Bike!
- VC funding ;)