Local Search with Simulated Annealing from Scratch

In some of my previous posts, I explained heuristics and how you can use them to find good-quality solutions for a mathematical optimization problem. In this post, I will provide generic Python code for local search together with simulated annealing. Besides the generic code, there are implementations for three classic example problems: the traveling salesman problem, the knapsack problem, and the Rastrigin function.
A short refresher: local search is a heuristic that tries to improve a given solution by looking at neighbors. If the objective value of a neighbor is better than the current objective value, the neighbor solution is accepted and the search continues. Simulated annealing also allows worse solutions to be accepted, which makes it possible to escape local minima.
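To make the difference concrete, here is a minimal sketch of the two acceptance rules (illustrative only; the full solver follows below):

import math
import random

def accept_local_search(curr_obj: float, new_obj: float) -> bool:
    # plain local search: only accept a neighbor that improves the objective
    return new_obj < curr_obj

def accept_simulated_annealing(curr_obj: float, new_obj: float, temperature: float) -> bool:
    # simulated annealing: always accept improvements, sometimes accept worse neighbors
    # (temperature is assumed to be positive here)
    if new_obj <= curr_obj:
        return True
    return random.random() < math.exp(-(new_obj - curr_obj) / temperature)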
Simulated Annealing Generic Code
The code works as follows: we are going to create four code files. The most important one is sasolver.py; this file contains the generic code for simulated annealing. The problems directory contains three example optimization problems that we can run to test the SA solver.
This is the folder structure:

sasolver.py
problems/
    knapsack.py
    rastrigin.py
    tsp.py
To solve a problem with simulated annealing, we start by creating a class that is quite generic:
import copy
import logging
import math
import numpy as np
import random
import time

from problems.knapsack import Knapsack
from problems.rastrigin import Rastrigin
from problems.tsp import TravelingSalesman


class SimulatedAnnealing():
    def __init__(self, problem):
        self.problem = problem

    def run_sa(self, max_iterations: int = 100000, update_iterations: int = 10000, time_limit: int = 60, cooling_schedule: str = 'lin'):
        start = time.time()
        best_solution = self.problem.baseline_solution()
        best_obj = self.problem.score_solution(best_solution)
        logging.info(f"First solution. Objective: {round(best_obj, 2)} Solution: {best_solution}")
        initial_temp = best_obj
        prev_solution = copy.deepcopy(best_solution)
        prev_obj = best_obj
        iteration = 0
        last_update = 0
        while time.time() - start < time_limit:
            iteration += 1
            last_update += 1
            accept = False
            curr_solution = self.problem.select_neighbor(copy.deepcopy(prev_solution))
            curr_obj = self.problem.score_solution(curr_solution)
            temperature = self._calculate_temperature(initial_temp, iteration, max_iterations, cooling_schedule)
            acceptance_value = self._acceptance_criterion(curr_obj, prev_obj, temperature)
            if (curr_obj <= prev_obj) or (temperature > 0 and random.random() < acceptance_value):
                accept = True
            if curr_obj < best_obj:
                best_solution = copy.deepcopy(curr_solution)
                best_obj = curr_obj
                prev_solution = copy.deepcopy(curr_solution)
                prev_obj = curr_obj
                last_update = 0
                logging.info(f"Better solution found. Objective: {round(best_obj, 2)} Solution: {curr_solution}")
            else:
                if accept:
                    prev_obj = curr_obj
                    prev_solution = copy.deepcopy(curr_solution)
                    last_update = 0
            if last_update >= update_iterations:
                break
        logging.info(f"Final solution: {best_solution} Objective: {round(best_obj, 2)}")
        return best_solution

    @staticmethod
    def _acceptance_criterion(obj_new, obj_curr, temperature, mod=1):
        """
        Determine the acceptance criterion (threshold for accepting a solution that is worse than the current one)
        """
        diff = obj_new - obj_curr
        try:
            acc = math.exp(-diff / temperature)
        except OverflowError:
            acc = -1
        return acc

    @staticmethod
    def _calculate_temperature(initial_temp: int, iteration: int, max_iterations: int, how: str = None) -> float:
        """
        Decrease the temperature to zero based on the total number of iterations.
        """
        if iteration >= max_iterations:
            return -1
        if how == "exp":
            cooling_rate = 0.95
            return initial_temp * (cooling_rate**iteration)
        elif how == "quadratic":
            cooling_rate = 0.01
            return initial_temp / (1 + cooling_rate * iteration**2)
        elif how == "log":
            cooling_rate = 1.44
            return initial_temp / (1 + cooling_rate * np.log(1 + iteration))
        elif how == "lin mult":
            cooling_rate = 0.1
            return initial_temp / (1 + cooling_rate * iteration)
        else:
            return initial_temp * (1 - iteration / max_iterations)


if __name__ == '__main__':
    problem = 'rastrigin'  # choose one of knapsack, tsp, rastrigin
    logging.basicConfig(filename=f'{problem}.log', encoding='utf-8', level=logging.INFO)
    if problem == 'tsp':
        problem = TravelingSalesman(n_locations=10, height=100, width=100)
        sa = SimulatedAnnealing(problem)
        final_solution = sa.run_sa()
        problem._plot_solution(final_solution, title='final')
    elif problem == 'knapsack':
        problem = Knapsack(knapsack_capacity=100, n_items=10)
        sa = SimulatedAnnealing(problem)
        final_solution = sa.run_sa()
    elif problem == 'rastrigin':
        problem = Rastrigin(n_dims=2)
        sa = SimulatedAnnealing(problem)
        final_solution = sa.run_sa()
This file is sasolver.py. It takes a problem as input, and then you can solve that problem with simulated annealing by calling run_sa(). There are different ways to handle cooling, implemented in _calculate_temperature. The acceptance value is calculated with the Metropolis acceptance criterion.
By modifying the problem = 'rastrigin' line (below if __name__ == '__main__':), it's possible to select another problem (replace rastrigin with tsp or knapsack).
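As a quick illustration of the criterion: the probability of accepting a solution that is worse by a fixed amount shrinks as the temperature drops. A small sketch (the numbers are only an example):

import math

diff = 10  # how much worse the neighbor solution is
for temperature in [100, 50, 10, 1]:
    print(temperature, round(math.exp(-diff / temperature), 3))
# prints 0.905, 0.819, 0.368 and 0.0: a high temperature accepts almost anything, a low one almost nothing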
We need three methods in each example problem to make this code work (a minimal sketch of this interface follows below):
- baseline_solution() creates the first solution (starting point) for a problem.
- score_solution(solution) calculates the objective value of a solution.
- select_neighbor(solution) applies a local move to a solution and returns the resulting neighbor.
We are going to implement these three methods for three problems: the traveling salesman problem, the knapsack problem, and the Rastrigin function.
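To show how small that interface is, here is a hypothetical toy problem (minimize the sum of squares of a list of numbers) that implements the three methods; it assumes the solver class above is saved as sasolver.py with the problems package in place:

import random
from sasolver import SimulatedAnnealing

class ToyProblem():
    def __init__(self, n_numbers: int = 5):
        self.name = 'toy problem'
        self.n_numbers = n_numbers

    def baseline_solution(self) -> list:
        # starting point: random numbers between -10 and 10
        return [random.uniform(-10, 10) for _ in range(self.n_numbers)]

    def score_solution(self, solution: list) -> float:
        # objective to minimize: sum of squares
        return sum(x ** 2 for x in solution)

    def select_neighbor(self, solution: list) -> list:
        # local move: nudge one of the numbers a little
        idx = random.randrange(len(solution))
        solution[idx] += random.uniform(-0.5, 0.5)
        return solution

sa = SimulatedAnnealing(ToyProblem())
best = sa.run_sa()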
Example 1. Traveling Salesman
The first problem we are going to look at is the traveling salesman problem. In this problem, there are locations that need to be visited. The goal is to minimize the distance traveled. Below you can see an example:

import matplotlib.pyplot as plt
import numpy as np
import random
from typing import List


class TravelingSalesman():
    def __init__(self, n_locations: int = 10, locations: List[tuple] = None, height: int = 100, width: int = 100, starting_point: int = 0):
        self.name = 'traveling salesman'
        self.starting_point = starting_point
        self.height = height
        self.width = width
        if locations is None:
            locations = self._create_sample_data(n_locations)
        self.locations = locations
        self.n_locations = len(locations)
        self.distances = self._create_distances()

    def baseline_solution(self) -> list:
        # route that follows the locations list
        # start and end in start location
        baseline = [self.starting_point] + [i for i in range(self.n_locations) if i != self.starting_point] + [self.starting_point]
        self._plot_solution(baseline, title='baseline')
        self._plot_solution(baseline, title='dots', only_dots=True)
        return baseline

    def score_solution(self, solution: list) -> float:
        # add all distances
        return sum([self.distances[node, solution[i+1]] for i, node in enumerate(solution[:-1])])

    def select_neighbor(self, solution: list) -> list:
        # swap two locations (don't swap start and end)
        indici = random.sample(range(1, self.n_locations), 2)
        idx1, idx2 = indici[0], indici[1]
        value1, value2 = solution[idx1], solution[idx2]
        solution[idx1] = value2
        solution[idx2] = value1
        return solution

    def _create_sample_data(self, n_locations: int) -> List[tuple]:
        return [(random.random() * self.height, random.random() * self.width) for _ in range(n_locations)]

    def _plot_solution(self, solution: list, title: str = 'tsp', only_dots: bool = False):
        plt.clf()
        plt.rcParams["figure.figsize"] = [5, 5]
        plt.rcParams["figure.autolayout"] = True
        for n, location_id1 in enumerate(solution[:-1]):
            location_id2 = solution[n+1]
            x_values = [self.locations[location_id1][0], self.locations[location_id2][0]]
            y_values = [self.locations[location_id1][1], self.locations[location_id2][1]]
            if not only_dots:
                plt.plot(x_values, y_values, 'bo', linestyle="-")
            else:
                plt.plot(x_values, y_values, 'bo')
            plt.text(x_values[0]-2, y_values[0]+2, str(location_id1))
        plt.savefig(f'{title}')

    def _create_distances(self) -> np.array:
        distances = np.zeros(shape=(self.n_locations, self.n_locations))
        for ni, i in enumerate(self.locations):
            for nj, j in enumerate(self.locations):
                distances[ni, nj] = self._distance(i[0], i[1], j[0], j[1])
        return distances

    @staticmethod
    def _distance(x1: float, y1: float, x2: float, y2: float) -> float:
        return np.sqrt((x2 - x1)**2 + (y2 - y1)**2)
In this problem, the baseline solution is created by visiting the locations in sequence (0 to 9). For the example, it gives us this route:

This doesn't look optimal, and it isn't. A local move is defined by swapping two locations. The score of the solution is the distance we need to travel. After running simulated annealing, this is the final solution:

That looks better!
For small problems, this works okay (though it's still not recommended). For larger ones, better algorithms are available, for example the Lin-Kernighan heuristic. A better starting solution also helps, e.g. one built with a greedy algorithm, as sketched below.
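A nearest-neighbor construction is one example of such a greedy start; a minimal sketch (a standalone function, not part of the class above):

import numpy as np

def greedy_baseline(distances: np.ndarray, starting_point: int = 0) -> list:
    # repeatedly travel to the closest location that has not been visited yet
    n_locations = distances.shape[0]
    route = [starting_point]
    unvisited = set(range(n_locations)) - {starting_point}
    while unvisited:
        last = route[-1]
        nearest = min(unvisited, key=lambda j: distances[last, j])
        route.append(nearest)
        unvisited.remove(nearest)
    route.append(starting_point)  # return to the start, like the baseline above
    return route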
Example 2. Knapsack
The knapsack problem is a classic one, but for those who don't know it, here is a short explanation.
Imagine you are in a cave full of beautiful treasures. Due to some unforeseen circumstances, the cave is collapsing. You have time to fill your knapsack with treasures, and then you need to run to safety. Of course, you want to take the items that together bring the most value. Which items should you take?

The data you need for this problem is the capacity of the knapsack, the capacity each item takes up, and the value of each item.
Below is the code that defines this problem:
import copy
import random
import numpy as np
from typing import List


class Knapsack():
    def __init__(self, knapsack_capacity: int, n_items: int = 20, item_values: list = None, item_capacities: list = None):
        self.name = 'knapsack'
        self.knapsack_capacity = knapsack_capacity
        if item_values is None and item_capacities is None:
            item_values, item_capacities = self._create_sample_data(n_items)
        self.item_values = item_values
        self.item_capacities = item_capacities
        self.n_items = len(item_values)

    def baseline_solution(self) -> list:
        # select random items until the knapsack is full
        capacity = 0
        solution = []
        while True:
            selected = random.choice([i for i in range(self.n_items) if i not in solution])
            if capacity + self.item_capacities[selected] > self.knapsack_capacity:
                break
            else:
                solution.append(selected)
                capacity += self.item_capacities[selected]
        return solution

    def score_solution(self, solution: list) -> int:
        # count the total value of this solution
        return -1 * sum([self.item_values[i] for i in solution])

    def select_neighbor(self, solution: list) -> list:
        # local move: remove / add / swap items
        solution_capacity = sum([self.item_capacities[i] for i in solution])
        possible_to_add = [i for i in range(self.n_items) if self.item_capacities[i] <= self.knapsack_capacity - solution_capacity and i not in solution]
        if len(solution) == 0:
            move = 'add'
        elif len(possible_to_add) > 0:
            move = np.random.choice(['remove', 'add', 'swap'], p=[0.1, 0.6, 0.3])
        else:
            move = np.random.choice(['remove', 'swap'], p=[0.4, 0.6])
        while True:
            if move == 'remove':
                solution.pop(random.randrange(len(solution)))
                return solution
            elif move == 'add':
                new_solution = copy.deepcopy(solution)
                new_item = random.choice(possible_to_add)
                new_solution.append(new_item)
                return new_solution
            elif move == 'swap':
                n = 0
                while n < 50:
                    new_solution = copy.deepcopy(solution)
                    in_item = random.choice([i for i in range(self.n_items) if i not in solution])
                    out_item = random.choice(range(len(solution)))
                    new_solution.pop(out_item)
                    new_solution.append(in_item)
                    n += 1
                    if self._is_feasible(new_solution):
                        return new_solution
                move = 'remove'

    def _create_sample_data(self, n_items: int) -> List[list]:
        item_values = random.sample(range(2, 1000), n_items)
        item_capacities = random.sample(range(1, self.knapsack_capacity), n_items)
        return item_values, item_capacities

    def _is_feasible(self, solution: list) -> bool:
        return sum([self.item_capacities[i] for i in solution]) <= self.knapsack_capacity
The baseline solution selects items at random until the knapsack is full. The solution score is the sum of the values of the items in the knapsack, multiplied by -1. This is necessary because the SA solver minimizes a given objective. There are three local moves possible: adding an item, removing an item, or swapping two items. This makes it possible to reach every possible solution in the solution space. If we swap an item, we need to check that the new solution is feasible.
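As an alternative to the random baseline, a greedy start based on value density (item value divided by item capacity) could also be used; a minimal sketch, not part of the class above:

def greedy_baseline(item_values: list, item_capacities: list, knapsack_capacity: int) -> list:
    # take items in order of value density while they still fit in the knapsack
    order = sorted(range(len(item_values)), key=lambda i: item_values[i] / item_capacities[i], reverse=True)
    solution, used_capacity = [], 0
    for i in order:
        if used_capacity + item_capacities[i] <= knapsack_capacity:
            solution.append(i)
            used_capacity += item_capacities[i]
    return solution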
In the next image you can see a sample run log. There are 10 items to choose from. The first line shows the item values, the second the capacities the items take, and the third the value densities (item value divided by item capacity). Then the solution process starts. The solution contains the index number(s) of the selected items. In the final solution, items 4, 5, and 8 are selected (counting starts at 0):

Example 3. Rastrigin Function
A function that is often used to test optimization algorithms is the Rastrigin function. In 3D it looks like this:

It has many local optima. The goal is to find the global minimum, which is at coordinate (0, 0). It is easier to see in a contour plot:

The landscape consists of many local optima with the highest ones in the four corners and the lowest ones in the center.
We can try to find the global minimum with simulated annealing. This problem is continuous instead of discrete, and we want to find the values for x and y that minimize the Rastrigin function.
The Rastrigin function is defined on an n-dimensional domain as:

f(x) = 10n + Σᵢ (xᵢ² - 10 cos(2π xᵢ)), with each xᵢ in the interval [-5.12, 5.12].
Let's try to find the optimum of the two-dimensional version: x and y span the domain and z is the function value, so the problem is exactly the one shown in the plots above.
import random
import numpy as np


class Rastrigin():
    def __init__(self, n_dims: int = 2):
        self.name = 'rastrigin'
        self.n_dims = n_dims

    def baseline_solution(self) -> list:
        solution = [random.uniform(-5.12, 5.12) for _ in range(self.n_dims)]
        return solution

    def score_solution(self, solution: list) -> float:
        score = self.n_dims * 10 + sum([(x**2 - 10*np.cos(2*np.pi*x)) for x in solution])
        return score

    def select_neighbor(self, solution: list, step_size: float = 0.1) -> list:
        perturbation = step_size * np.random.randn(self.n_dims)
        neighbor = solution + perturbation
        while not self._is_feasible(neighbor):
            perturbation = step_size * np.random.randn(self.n_dims)
            neighbor = solution + perturbation
        return neighbor

    def _is_feasible(self, solution: list) -> bool:
        # every coordinate needs to stay inside the domain [-5.12, 5.12]
        return all(-5.12 <= x <= 5.12 for x in solution)
For the baseline solution, we select random floats for x and y between -5.12 and 5.12. The score of the solution is equal to z (the outcome of the Rastrigin function). A neighbor is selected by taking a step in a random direction, with the step size set to 0.1. The feasibility check makes sure we stay inside the domain.
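A quick sanity check, assuming the Rastrigin class above is available: the score at the known optimum (all zeros) is exactly 0, while nearby integer coordinates are local minima with higher scores:

problem = Rastrigin(n_dims=2)
print(problem.score_solution([0.0, 0.0]))  # 0.0: the global minimum
print(problem.score_solution([1.0, 1.0]))  # approximately 2.0: a nearby local minimum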
A log of a run:

The final solution comes really close to the optimum.
But watch out: if you run the algorithm with more dimensions, it's not guaranteed that you will find the optimum:

As you can see, the final solution is a local optimum instead of the global one. It finds good coordinates for the first two variables, but the third one is equal to 0.985, which is far away from 0. It's important to verify the results you get. This specific example can be made to work by fine-tuning the SA parameters, but for more dimensions you should probably use another optimization technique that performs better.
Conclusion
In this post, we coded an implementation of simulated annealing. With three examples, you now have an idea of the different ways it can be used. To make it work for a new problem, you only need to implement three methods: baseline_solution, score_solution, and select_neighbor. Of course, this implementation is a basic one, and if you want to use it you need to tune the parameters of SA and make sure that the algorithm returns a feasible solution. A good initial solution and a careful choice of how neighbor solutions are selected can help a lot with final solution quality.
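For example, tuning could start from the arguments that run_sa already exposes; the values below are only an illustration:

problem = Rastrigin(n_dims=2)
sa = SimulatedAnnealing(problem)
final_solution = sa.run_sa(max_iterations=500000, update_iterations=50000, time_limit=120, cooling_schedule='exp')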
Thanks for reading, and see you next time!
Related
An Introduction to a Powerful Optimization Technique: Simulated Annealing
Mathematical Optimization Heuristics Every Data Scientist Should Know
Five ways to combine Mathematical Optimization and Machine Learning