Inspiration
Last-mile delivery is one of the biggest challenges in modern logistics. Even though it is the final step in the delivery process, it accounts for a large portion of shipping costs and has a major impact on delivery speed, customer satisfaction, and sustainability. With the rapid growth of e-commerce, companies must contend with traffic congestion, inefficient routes, rising fuel costs, and pressure to reduce carbon emissions.
We wanted to explore how data analytics and machine learning could help make delivery systems smarter and more efficient. Rather than viewing routing as only a transportation problem, we approached it from the perspective of operational efficiency, sustainability, and geographic accessibility. Our goal was to show how real-world logistics data can be transformed into actionable insights that improve delivery performance, reduce costs, and support greener delivery operations.
What it does
Our project analyzes and optimizes last-mile delivery routes using geospatial analytics, clustering algorithms, and route optimization techniques. Using Amazon delivery data, the system groups nearby delivery stops into efficient delivery zones with K-Means clustering. After clustering, Google OR-Tools is used to calculate optimized delivery routes within each zone.
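Both clustering and routing depend on distances between stops. As a minimal sketch of that step, assuming stops are given as (latitude, longitude) pairs in degrees and that straight-line great-circle distance is an acceptable stand-in for road distance, a distance matrix could be built as follows. The function names and sample coordinates are hypothetical and are not taken from the project's code or dataset.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def distance_matrix_m(stops):
    """Symmetric matrix of distances rounded to whole meters.

    Integer values are used because routing solvers such as OR-Tools
    expect integer arc costs.
    """
    return [[round(haversine_km(a, b) * 1000) for b in stops] for a in stops]


# Hypothetical stops (lat, lon); illustrative only.
stops = [(47.6062, -122.3321), (47.6205, -122.3493), (47.6097, -122.3331)]
matrix = distance_matrix_m(stops)
```

Great-circle distance underestimates real driving distance, so any savings computed from a matrix like this are best read as relative comparisons rather than exact mileage.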
The project also evaluates operational and environmental impacts by measuring distance reductions, estimating fuel savings, and calculating potential CO₂ emission reductions. In addition, geospatial visualizations help identify delivery coverage gaps and underserved areas.
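The chain from distance saved to fuel saved to CO₂ avoided can be sketched as simple arithmetic. The fuel consumption rate and emission factor below are illustrative assumptions, not values from the project; a real analysis would substitute figures for the actual vehicle fleet and fuel type. The function and class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ImpactEstimate:
    distance_saved_km: float
    distance_saved_pct: float
    fuel_saved_l: float
    co2_saved_kg: float


def estimate_impact(baseline_km, optimized_km,
                    fuel_l_per_100km=12.0,  # assumed van consumption
                    co2_kg_per_l=2.68):     # assumed factor, roughly diesel
    """Convert a route-distance reduction into fuel and CO2 estimates."""
    if baseline_km <= 0:
        raise ValueError("baseline_km must be positive")
    saved_km = baseline_km - optimized_km
    fuel_saved = saved_km * fuel_l_per_100km / 100.0
    return ImpactEstimate(
        distance_saved_km=saved_km,
        distance_saved_pct=100.0 * saved_km / baseline_km,
        fuel_saved_l=fuel_saved,
        co2_saved_kg=fuel_saved * co2_kg_per_l,
    )


# Hypothetical example: a 100 km baseline route shortened to 80 km.
example = estimate_impact(100.0, 80.0)
```

Because every step is linear, the estimate scales directly with the distance reduction, and uncertainty in the assumed consumption rate carries straight through to the CO₂ figure.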
Overall, the project demonstrates how data-driven logistics optimization can improve delivery efficiency, reduce operational costs, and support more sustainable last-mile delivery systems.
How we built it
We built the project using Python along with several data science, machine learning, and optimization libraries. We started by cleaning and preprocessing Amazon delivery datasets containing geographic coordinates and delivery stop information.
Technologies Used
- Pandas and NumPy for data cleaning and analysis
- Scikit-learn for K-Means clustering and delivery zone segmentation
- Google OR-Tools for route optimization and shortest-route calculations within each delivery zone
- GeoPandas, Folium, and Matplotlib for geospatial analysis and visualization
- Jupyter Notebook for experimentation, modeling, and analysis
The workflow first clusters delivery stops into geographically compact groups and then calculates optimized routes for each cluster to minimize travel distance and inefficiencies. We also incorporated emissions modeling to estimate the environmental impact of optimized routing.
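The two-stage "cluster first, route second" idea can be illustrated with the standard library alone. To be clear about the substitutions: this sketch uses a plain Lloyd's-style k-means in place of scikit-learn's KMeans and a greedy nearest-neighbor heuristic in place of the OR-Tools solver, so it shows the shape of the workflow rather than the project's implementation. It also assumes stops are already projected to planar (x, y) coordinates; all names and sample points are hypothetical.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm on (x, y) points; returns one label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return labels


def nearest_neighbor_route(points, start=0):
    """Greedy tour: always visit the closest unvisited stop next."""
    unvisited = set(range(len(points))) - {start}
    route = [start]
    while unvisited:
        last = points[route[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        route.append(nxt)
        unvisited.remove(nxt)
    return route


def cluster_then_route(points, k):
    """Return {cluster_id: [stop indices in visiting order]}."""
    labels = kmeans(points, k)
    routes = {}
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if idx:
            local = nearest_neighbor_route([points[i] for i in idx])
            routes[c] = [idx[j] for j in local]  # map back to global indices
    return routes


# Hypothetical stops forming two spatial groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
routes = cluster_then_route(points, k=2)
```

Splitting the stops first keeps each routing subproblem small, at the cost of never considering routes that cross zone boundaries.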
Challenges we ran into
One of the biggest challenges was working with real-world geographic data. Delivery locations were not always evenly distributed, which sometimes caused clustering imbalance where some delivery zones became overcrowded while others remained sparse. Tuning the clustering model to create realistic delivery territories required significant experimentation.
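One simple way to quantify this kind of imbalance, offered here as an assumption rather than the metric the project used, is to compare the largest cluster against the average cluster size. The helper name and the sample labels are hypothetical.

```python
from collections import Counter


def cluster_balance(labels):
    """Summarize how evenly stops are spread across clusters.

    An imbalance_ratio of 1.0 means the largest cluster is exactly
    average-sized; larger values mean at least one zone is overcrowded.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    sizes = Counter(labels)
    mean_size = len(labels) / len(sizes)
    return {
        "sizes": dict(sizes),
        "largest": max(sizes.values()),
        "smallest": min(sizes.values()),
        "imbalance_ratio": max(sizes.values()) / mean_size,
    }


# Hypothetical cluster labels for eight stops across three zones.
summary = cluster_balance([0, 0, 0, 0, 1, 1, 2, 2])
```

Tracking a number like this across different cluster counts gives a quick check on whether a tuning change actually produced more even territories.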
Another challenge was route optimization complexity. As the number of delivery stops increased, optimization calculations became more computationally expensive, so we had to balance run time against solution quality.
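A counting argument helps show why cost rises so steeply with stop count. For a round trip through n locations with symmetric distances, there are (n - 1)! / 2 distinct tours, so checking every one quickly becomes impractical. The snippet below only illustrates that growth; it is not a description of how OR-Tools searches, since heuristic solvers avoid enumerating every tour. The function name is hypothetical.

```python
import math


def distinct_tours(n_stops):
    """Number of distinct round trips through n_stops locations,
    assuming symmetric distances (direction and start point do not matter)."""
    if n_stops < 3:
        return 1
    return math.factorial(n_stops - 1) // 2


# Print the growth for a few route sizes.
for n in (5, 10, 15, 20):
    print(f"{n:>2} stops: {distinct_tours(n):,} possible tours")
```

This is also one reason clustering helps: several small routing problems are far cheaper to search than one large problem over all stops.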
We also faced challenges interpreting delivery equity and identifying underserved neighborhoods because inefficiencies are not always immediately visible in raw geographic data. Additional visualization and geospatial analysis were necessary to generate meaningful operational insights.
Accomplishments that we're proud of
We are proud that our project combines machine learning, optimization algorithms, and geospatial analytics into a single practical logistics solution. Instead of focusing only on route distance, we expanded the analysis to include sustainability and service accessibility, making the project more reflective of real-world logistics challenges.
We are also proud of successfully implementing OR-Tools route optimization with clustered delivery zones and creating visualizations that clearly demonstrate efficiency improvements. Estimating both operational savings and environmental impact added stronger real-world relevance to the project.
Most importantly, we built a solution that demonstrates how data-driven logistics optimization can support smarter, greener, and more efficient delivery systems.
What we learned
Through this project, we gained hands-on experience working with real-world logistics and geospatial datasets, which are often much messier and more complex than classroom datasets. We learned how clustering algorithms like K-Means can be applied to operational problems such as delivery zone segmentation.
We also learned how optimization tools like Google OR-Tools can significantly improve delivery efficiency when combined with geographic clustering. In addition, the project deepened our understanding of the relationship between logistics, sustainability, operational cost management, and customer experience.
Beyond technical skills, we learned the importance of balancing mathematical optimization with practical business considerations such as scalability, driver workload, and service fairness.
What's next for Last-Mile Route Optimization Analysis
In the future, we want to expand the project by incorporating real-time traffic conditions, weather data, and dynamic delivery requests to create adaptive route optimization instead of static route planning.
We also want to explore reinforcement learning and AI-based predictive routing models that can continuously improve delivery efficiency over time. Additional future improvements include integrating vehicle capacity constraints, delivery time windows, and electric vehicle routing considerations to make the model more realistic and environmentally focused.
Another goal is to develop an interactive dashboard where logistics operators can visualize routes, monitor emissions reductions, and identify high-inefficiency zones in real time.
Ultimately, we hope to evolve this project into a scalable smart logistics platform capable of supporting more efficient, sustainable, and equitable last-mile delivery systems.
Built With
- html
- javascript
- jupyter
- k-means
- machine-learning
- matplotlib
- numpy
- pandas
- python
- scikit-learn
- vercel
