Shortest Path Algorithms: How to Use Data to Navigate and Optimize

Author:Murphy | View: 25211 | Time: 2025-03-22 20:37:33

Image by [Unsplash](https://unsplash.com/)+ on Unsplash

Have you ever thought about how your GPS always seems to find the fastest route to your destination? No matter how many ways there are to get from Point-A to Point-B, your GPS sorts through them all and gives you just one route – how does it know which route is the best one? In the background, tons of calculations are being made using algorithms like Dijkstra's to find the shortest path between where you are and where you're trying to go. There are many different algorithms that can achieve this though, and I'd like to introduce you to a couple of them! In this article, I'll take a look at a popular shortest path algorithm along with a more advanced one and show how you can use them for your data projects or just for fun!

Dijkstra's Algorithm

I could write a little summary of how Dijkstra's works, but I would highly recommend watching this YouTube video by Spanning Tree first.

If you don't want to watch it, here's the gist:

You've got a collection of points and want to figure out the shortest path between any two of them. For example if you want to go from Point S to Point P the shortest and only path is 2 minutes. However if you want to get from Point P to Point W there are a couple of ways you can go there. If you go from P to X to V to W that journey will take you 16 minutes (Route 1), alternatively you can go from P to X to Y to W and that journey will only take you 13 minutes (Route 2). Dijkstra's Algorithm basically does all those calculations but on a much larger scale.

Image from Wikimedia Commons edited by Author

Here's what that would look like in python:

First, create a dictionary that stores each point, the points that that point can travel to, and the distance between those points. For the image above that would look like:

path = {
'S': {'P':2,'U':3},
'U': {'S':3,'X':1,'V':3},
'P': {'S':2,'X':4,'Q':5},
'X': {'P':4,'U':1,'Q':7,'Y':6,'V':8},
'Q': {'P':5,'X':7,'R':2},
'Y': {'X':6,'R':1,'W':3},
'V': {'U':3,'X':8,'W':4},
'W': {'V':4,'Y':3,'T':5},
'R': {'Q':2,'Y':3,'T':6},
'T': {'R':6,'W':5}
}

Then we would just need a function to run through each combination for this path.

def dijkstra_example(path, start_vertex, end_vertex):
    max_value = 10**6 #assume all edge weights are smaller than this value
    distances = {vertex: max_value for vertex in graph}
    distances[start_vertex] = 0

    #initialize a priority queue and add the start vertex with distance 0
    priority_queue = [(0, start_vertex)]

    while priority_queue:
        #get the vertex with the smallest distance
        current_distance, current_vertex = heapq.heappop(priority_queue)

        #if the current vertex is the end vertex, return the distance
        if current_vertex == end_vertex:
            return distances[end_vertex]

        #if the distance is greater than the recorded distance, skip it
        if current_distance > distances[current_vertex]:
            continue

        #explore neighbors of the current vertex
        for neighbor, weight in path[current_vertex].items():
            distance = current_distance + weight

            #if a shorter path is found
            if distance < distances[neighbor]:
                distances[neighbor] = distance
                heapq.heappush(priority_queue, (distance, neighbor))

    return 100  #return 100 if you end vertex isn't reachable

That function will output the shortest path between two start_vertex and end_vertex and you can use values from our path dictionary to test it! On a larger scale, like a dictionary with thousands of points, the ability to do this instantly with python is very useful.

In general, Dijkstra's is a solid go-to algorithm for determining the shortest path between two points. There are some other considerations to make though when picking an algorithm for shortest path problems. For example if you ever have to deal with negative weights. Negative weights might not make sense at first but if you think about it like when driving you can either do the shortest path or the path that costs the least amount of money (no tolls), a negative weight might represent the reduction of cost of a certain path. If you're working with negative weights for reductions or anything like that you should consider the Bellman-Ford Algorithm.

Bellman-Ford Algorithm + Edge Relaxation

This algorithm works similarly to Dijkstra's Algorithm but it can be more flexible for handling real-world problems where data might be messier. It operates on the principle of "edge relaxation" which basically means if a known path from one point to another can be improved by going through an adjacent point then that know shortest path is updated. Edge Relaxation exists for Djikstra's Algorithm as well but the main difference between the two is that Bellman-Ford can handle negative weights.

An example of how you could use the Bellman-Form Algorithm would be if you were working on a logistics/supply chain problem. Companies will often need to transport goods between warehouses, suppliers, and store fronts – each one of these routes have associated costs which can add up due to all kinds of factors (weights!). The Bellman-Ford Algorithm can help identify the most efficient route from a cost perspective and also find any inefficiencies or opportunities to save on costs.

A couple of things to consider:

Each location would be represented as a vertex.
In some cases, routes might have discounts that can reduce the transportation cost which would result in a negative edge weight.

The implementation of this algorithm would involve:

Initializing distances from the start location to other locations. At first, we set all distances to infinity to that all nodes appear to be unreachable at first. This is just for set up though and will update during edge relaxation.
Relaxing all edges for X-1 times where X is the number of locations.
Check for negative weights cycles.

To visualize a scenario like this, see the image below:

And to do something like this in Python, it would look like this:

Path = {
    'A': {'B': 5, 'C': 2},
    'B': {'C': 1, 'D': 3},
    'C': {'D': 8},
    'D': {'A': 7, 'B': -2}
}

def bellman_ford(path, start_vertex):
    #initialize distances from start_vertex to all other vertices as infinity
    max_value = float('inf')
    distances = {vertex: max_value for vertex in path}
    distances[start_vertex] = 0

    #relax all edges |x| - 1 times
    for x in range(len(path) - 1):
        for vertex in path:
            for neighbor, weight in graph[vertex].items():
                if distances[vertex] != max_value and distances[vertex] + weight < distances[neighbor]:
                    distances[neighbor] = distances[vertex] + weight

    #check for negative-weight cycles
    for vertex in path:
        for neighbor, weight in path[vertex].items():
            if distances[vertex] != max_value and distances[vertex] + weight < distances[neighbor]:
                raise ValueError("Graph contains a negative weight cycle")

    return distances

shortest_distances = bellman_ford(graph, 'A')
print(shortest_distances)  #output the shortest distances from 'A' to all other vertices

Now I bet you're wondering "what is the ValueError? I thought Bellman-Ford could handle negative weights??" and you would be right! The issue is not from negative wights, but negative weight cycles.

Negative Weights are individual edges that can be weighted with a negative value to represent the opposite of a positive weighted graph. Think positive weights cost more and negative weights give you a discount.
Negative Weight Cycles happen when a path that start and end at the same vertex have a negative sum of weights. Which basically means that every time you go from start to end you will continue to reduce the cost infinitely so the code will just continue looping forever to make the cost as small as possible. This should raise an error!

As you can see, this stuff can get pretty complicated, but if you ever run into an error like this it's probably a data input error that can easily be fixed.

Project Ideas

Now that you have an idea of how these algorithms work, I encourage you to use them to make some fun projects! Here are a couple of examples you can add to show off in your Data Science portfolio.

Optimal Delivery Route Planner

You're going to build an application to find the optimal route for delivering within a city, minimizing distance travelled. You can find data on OpenStreetMap and host the app on Streamlit. Identify delivery points on your map and label these nodes in your data. You can then use Djikstra's Algorithm to find the shortest point between each path – or you can identify a starting point and use the algorithm from there.

Flight Price Optimization Tool

You can build a tool to find the cheapest flights between cities, you can also increase the complexity by considering direct and flights with stops with the possibility of price changes (which can represent positive AND negative weights). Data can be found from sources like Skyscanner and Streamlit again is a great option for hosting the app. The Bellman-Ford Algorithm can be used to find the optimal route based on several different features – this could be a handy tool to use along with a great one to show in your portfolio!

Conclusion

I hope reading this article gave you a better idea of how shortest path algorithm works and inspired you to use one of them in your next project. There are tons of interesting use case for these algorithms and I enjoyed reading more about them for this article as well.

Thank you for reading!

Tags: Data Analysis Data Science GIS Mapping Python