A classic Travelling Sales Man (on grid instead of graph) problem with obstacles and cost map.
What it does
Given a list of targets (deal locations), plan a path reaching all the deals with minimum cost.
How I built it
Classic algorithms for solving the Travelling Sales Man problem can be classified as two categories: exact algorithms and heuristic algorithms. The former mainly uses mixed-type integer programming which can be quite costly. Considering the scale of the map and No. of target points, heuristic algorithms seem to be a better choice, moreover, studies show that heuristic algorithm results can be only 2–3% away from the optimal solution.
I tried integer programming first, wanting to find the exact optimal solution. Then realized even the branch and cut algorithm will be too slow to finish on this task.
Then I tried BFS/DFS, A-star and Dijkstra algorithm, the result shows that the speed is actually too slow, even after trying to optimize them. Tried Simulated Annealing and Ant Colony algorithm, and finally decide to use the later one, due to faster speed and slightly more optimal result.
Challenges I ran into
- This problem is actually very hard to solve, since we need to find the optimal order of traversal nodes (nearest neighbor algorithm, Christofides algorithm, heuristic search, iterative pair-wise exchange, etc), based on the weight (cost) between each pair of nodes. However, to find the weight itself is very costly, we need to find the shortest path between each pair of nodes (O(n^2)) using BFS, DFS, A* or Dijkstra algorithm, etc. Jointly solve both is actually impossible.
I found that, essentially this problem can be separated as two sub-problems: 1st, what order of visiting the nodes produce the optimal result; 2nd, between each pair of nodes, which is the costless path (not necessarily be the shortest becuase of the cost map) and the total cost of this path.
The 1st problem needs to use the result of the 2nd problem, however as we notice the 2nd problem need to run n(n+1)/2 pairs where each needs a BFS/A* algorithm..., if a pair of nodes is really far from each other, then each run would take a lot of time.
But if we solve the 1st problem first, get the optimal order, then the work load in 2nd problem is greatly reduced, since we only need to run BFS once to find the shortest path to the next point.
So we need some estimation of the weights (cost) between each pair of node. To do that, we need to make some assumptions on the noise generating process to produce the cost map. There are actually many sampling theory with theoretically proved bounds. The costs along the path between two nodes can be viewed as a continuous function, here I randomly sample several points along the path and get the sample costs, then use a segment constant function to approximate the cost function along the path. This way the cost between any two nodes can be approximated in O(1) time.
The nodes and edge with weights can be viewed as a graph. I turn the above result to a graph.
Then, the ant colony algorithm is running on the graph to find the (near-)optimal traversal order. Hyper-parameter tuning is used to find a good number of ants and best ants.
After we have the optimal order, the 2nd sub-problem is trivial. I tried most of the state-of-art BFS/DFS/A star and Dijkstra and find that the BFS produces the best result. The worst is DFS, which could end in bouncing up and down and move one step to the correct direction after each bouncing (say the start/target points are on the left/right side). A* needs a little tweaking as we have cost map and obstacles, and it requires to estimate the cost required to reach the goal from current point (If there are no costs and obstacles, cost will be the Eucleadian distance to the target).
- The original randomly generated points (deals location) can sometimes reside in a square area (surrounded by obstacles) or even reside "on the obstacle (shelf)" where the adjacent pixels on all four directions (left, right, up, down) are all shelf pixels, there is no way to reach from/to these points. I turned to hand-input some target points and demo my results, rather than using the random generation.
What's next for Walmart Store Visit Plan
During this task, I'm having some ideas about the 3rd task, to create a model to optimize the deals locations, but not have enough time to realize it.
- The result is a zoom-in area focusing on a few points, I was trying to show the whole map with the final path, but turns out the one-pixel-weight path is too thin, some path segments are adjacent to the shelf, and can not visually distinguish them.