Local Search with Simulated Annealing from Scratch

In some of my previous posts, I explained heuristics and how you can use them to find good-quality solutions for a mathematical optimization problem. In this post, I will provide generic Python code for local search together with simulated annealing. Besides the generic code, there are implementations for three classic example problems: the traveling salesman problem, the knapsack problem, and the Rastrigin function.
A short refresher: local search is a heuristic that tries to improve a given solution by looking at neighbors. If the objective value of a neighbor is better than the current objective value, the neighbor solution is accepted and the search continues. Simulated annealing also allows worse solutions to be accepted, which makes it possible to escape local minima.
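To make the difference concrete, here is a minimal sketch of the two acceptance rules (illustrative only; the full solver follows below):

import math
import random

def accept_local_search(curr_obj: float, new_obj: float) -> bool:
    # plain local search: only accept a neighbor that improves the objective
    return new_obj < curr_obj

def accept_simulated_annealing(curr_obj: float, new_obj: float, temperature: float) -> bool:
    # simulated annealing: always accept improvements, sometimes accept worse neighbors
    # (temperature is assumed to be positive here)
    if new_obj <= curr_obj:
        return True
    return random.random() < math.exp(-(new_obj - curr_obj) / temperature)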
Simulated Annealing Generic Code
The code works as follows: we are going to create four code files. The most important one is sasolver.py; this file contains the generic code for simulated annealing. The problems directory contains three example optimization problems that we can run to test the SA solver.
This is the folder structure:

sasolver.py
problems/
    knapsack.py
    rastrigin.py
    tsp.py
To solve a problem with simulated annealing, we start by creating a class that is quite generic:
import copy
import logging
import math
import numpy as np
import random
import time

from problems.knapsack import Knapsack
from problems.rastrigin import Rastrigin
from problems.tsp import TravelingSalesman


class SimulatedAnnealing():
    def __init__(self, problem):
        self.problem = problem

    def run_sa(self, max_iterations: int = 100000, update_iterations: int = 10000, time_limit: int = 60, cooling_schedule: str = 'lin'):
        start = time.time()
        best_solution = self.problem.baseline_solution()
        best_obj = self.problem.score_solution(best_solution)
        logging.info(f"First solution. Objective: {round(best_obj, 2)} Solution: {best_solution}")
        initial_temp = best_obj
        prev_solution = copy.deepcopy(best_solution)
        prev_obj = best_obj
        iteration = 0
        last_update = 0
        while time.time() - start < time_limit:
            iteration += 1
            last_update += 1
            accept = False
            curr_solution = self.problem.select_neighbor(copy.deepcopy(prev_solution))
            curr_obj = self.problem.score_solution(curr_solution)
            temperature = self._calculate_temperature(initial_temp, iteration, max_iterations, cooling_schedule)
            acceptance_value = self._acceptance_criterion(curr_obj, prev_obj, temperature)
            if (curr_obj <= prev_obj) or (temperature > 0 and random.random() < acceptance_value):
                accept = True
            if curr_obj < best_obj:
                best_solution = copy.deepcopy(curr_solution)
                best_obj = curr_obj
                prev_solution = copy.deepcopy(curr_solution)
                prev_obj = curr_obj
                last_update = 0
                logging.info(f"Better solution found. Objective: {round(best_obj, 2)} Solution: {curr_solution}")
            else:
                if accept:
                    prev_obj = curr_obj
                    prev_solution = copy.deepcopy(curr_solution)
                    last_update = 0
            if last_update >= update_iterations:
                break
        logging.info(f"Final solution: {best_solution} Objective: {round(best_obj, 2)}")
        return best_solution

    @staticmethod
    def _acceptance_criterion(obj_new, obj_curr, temperature, mod=1):
        """
        Determine the acceptance criterion (threshold for accepting a solution that is worse than the current one)
        """
        diff = obj_new - obj_curr
        try:
            acc = math.exp(-diff / temperature)
        except OverflowError:
            acc = -1
        return acc

    @staticmethod
    def _calculate_temperature(initial_temp: int, iteration: int, max_iterations: int, how: str = None) -> float:
        """
        Decrease the temperature to zero based on the total number of iterations.
        """
        if iteration >= max_iterations:
            return -1
        if how == "exp":
            cooling_rate = 0.95
            return initial_temp * (cooling_rate**iteration)
        elif how == "quadratic":
            cooling_rate = 0.01
            return initial_temp / (1 + cooling_rate * iteration**2)
        elif how == "log":
            cooling_rate = 1.44
            return initial_temp / (1 + cooling_rate * np.log(1 + iteration))
        elif how == "lin mult":
            cooling_rate = 0.1
            return initial_temp / (1 + cooling_rate * iteration)
        else:
            return initial_temp * (1 - iteration / max_iterations)


if __name__ == '__main__':
    problem = 'rastrigin'  # choose one of knapsack, tsp, rastrigin
    logging.basicConfig(filename=f'{problem}.log', encoding='utf-8', level=logging.INFO)
    if problem == 'tsp':
        problem = TravelingSalesman(n_locations=10, height=100, width=100)
        sa = SimulatedAnnealing(problem)
        final_solution = sa.run_sa()
        problem._plot_solution(final_solution, title='final')
    elif problem == 'knapsack':
        problem = Knapsack(knapsack_capacity=100, n_items=10)
        sa = SimulatedAnnealing(problem)
        final_solution = sa.run_sa()
    elif problem == 'rastrigin':
        problem = Rastrigin(n_dims=2)
        sa = SimulatedAnnealing(problem)
        final_solution = sa.run_sa()
This file is sasolver.py. It takes a problem as input, and then you can solve that problem with simulated annealing by calling run_sa(). There are different ways to handle cooling, implemented in _calculate_temperature. The acceptance value is calculated with the Metropolis acceptance criterion.
By modifying the problem = 'rastrigin' line (below if __name__ == '__main__':), it's possible to select another problem (replace rastrigin with tsp or knapsack).
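As a quick illustration of the criterion: the probability of accepting a solution that is worse by a fixed amount shrinks as the temperature drops. A small sketch (the numbers are only an example):

import math

diff = 10  # how much worse the neighbor solution is
for temperature in [100, 50, 10, 1]:
    print(temperature, round(math.exp(-diff / temperature), 3))
# prints 0.905, 0.819, 0.368 and 0.0: a high temperature accepts almost anything, a low one almost nothing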
We need three methods in each example problem to make this code work (a minimal sketch of this interface follows below):
- baseline_solution() creates the first solution (starting point) for a problem.
- score_solution(solution) calculates the objective value of a solution.
- select_neighbor(solution) applies a local move to a solution and returns the resulting neighbor.
We are going to implement these three methods for three problems: the traveling salesman problem, the knapsack problem, and the Rastrigin function.
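To show how small that interface is, here is a hypothetical toy problem (minimize the sum of squares of a list of numbers) that implements the three methods; it assumes the solver class above is saved as sasolver.py with the problems package in place:

import random
from sasolver import SimulatedAnnealing

class ToyProblem():
    def __init__(self, n_numbers: int = 5):
        self.name = 'toy problem'
        self.n_numbers = n_numbers

    def baseline_solution(self) -> list:
        # starting point: random numbers between -10 and 10
        return [random.uniform(-10, 10) for _ in range(self.n_numbers)]

    def score_solution(self, solution: list) -> float:
        # objective to minimize: sum of squares
        return sum(x ** 2 for x in solution)

    def select_neighbor(self, solution: list) -> list:
        # local move: nudge one of the numbers a little
        idx = random.randrange(len(solution))
        solution[idx] += random.uniform(-0.5, 0.5)
        return solution

sa = SimulatedAnnealing(ToyProblem())
best = sa.run_sa()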
Example 1. Traveling Salesman
The first problem we are going to look at is the traveling salesman problem. In this problem, there are locations that need to be visited. The goal is to minimize the distance traveled. Below you can see an example:

import matplotlib.pyplot as plt
import numpy as np
import random
from typing import List


class TravelingSalesman():
    def __init__(self, n_locations: int = 10, locations: List[tuple] = None, height: int = 100, width: int = 100, starting_point: int = 0):
        self.name = 'traveling salesman'
        self.starting_point = starting_point
        self.height = height
        self.width = width
        if locations is None:
            locations = self._create_sample_data(n_locations)
        self.locations = locations
        self.n_locations = len(locations)
        self.distances = self._create_distances()

    def baseline_solution(self) -> list:
        # route that follows the locations list
        # start and end in start location
        baseline = [self.starting_point] + [i for i in range(self.n_locations) if i != self.starting_point] + [self.starting_point]
        self._plot_solution(baseline, title='baseline')
        self._plot_solution(baseline, title='dots', only_dots=True)
        return baseline

    def score_solution(self, solution: list) -> float:
        # add all distances
        return sum([self.distances[node, solution[i+1]] for i, node in enumerate(solution[:-1])])

    def select_neighbor(self, solution: list) -> list:
        # swap two locations (don't swap start and end)
        indici = random.sample(range(1, self.n_locations), 2)
        idx1, idx2 = indici[0], indici[1]
        value1, value2 = solution[idx1], solution[idx2]
        solution[idx1] = value2
        solution[idx2] = value1
        return solution

    def _create_sample_data(self, n_locations: int) -> List[tuple]:
        return [(random.random() * self.height, random.random() * self.width) for _ in range(n_locations)]

    def _plot_solution(self, solution: list, title: str = 'tsp', only_dots: bool = False):
        plt.clf()
        plt.rcParams["figure.figsize"] = [5, 5]
        plt.rcParams["figure.autolayout"] = True
        for n, location_id1 in enumerate(solution[:-1]):
            location_id2 = solution[n+1]
            x_values = [self.locations[location_id1][0], self.locations[location_id2][0]]
            y_values = [self.locations[location_id1][1], self.locations[location_id2][1]]
            if not only_dots:
                plt.plot(x_values, y_values, 'bo', linestyle="-")
            else:
                plt.plot(x_values, y_values, 'bo')
            plt.text(x_values[0]-2, y_values[0]+2, str(location_id1))
        plt.savefig(f'{title}')

    def _create_distances(self) -> np.array:
        distances = np.zeros(shape=(self.n_locations, self.n_locations))
        for ni, i in enumerate(self.locations):
            for nj, j in enumerate(self.locations):
                distances[ni, nj] = self._distance(i[0], i[1], j[0], j[1])
        return distances

    @staticmethod
    def _distance(x1: float, y1: float, x2: float, y2: float) -> float:
        return np.sqrt((x2 - x1)**2 + (y2 - y1)**2)
In this problem, the baseline solution is created by visiting the locations in sequence (0 to 9). For the example, it gives us this route:

This doesn't look optimal, and it isn't. A local move is defined by swapping two locations. The score of the solution is the distance we need to travel. After running simulated annealing, this is the final solution:

That looks better!
For small problems, this works okay (though it's still not recommended). For larger ones, better algorithms are available, for example the Lin-Kernighan heuristic. A better starting solution also helps, e.g. one built with a greedy algorithm, as sketched below.
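A nearest-neighbor construction is one example of such a greedy start; a minimal sketch (a standalone function, not part of the class above):

import numpy as np

def greedy_baseline(distances: np.ndarray, starting_point: int = 0) -> list:
    # repeatedly travel to the closest location that has not been visited yet
    n_locations = distances.shape[0]
    route = [starting_point]
    unvisited = set(range(n_locations)) - {starting_point}
    while unvisited:
        last = route[-1]
        nearest = min(unvisited, key=lambda j: distances[last, j])
        route.append(nearest)
        unvisited.remove(nearest)
    route.append(starting_point)  # return to the start, like the baseline above
    return route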
Example 2. Knapsack
The knapsack problem is a classic one, but for those who don't know it, here is a short explanation.
Imagine you are in a cave full of beautiful treasures. Due to some unforeseen circumstances, the cave is collapsing. You have time to fill your knapsack with treasures, and then you need to run to safety. Of course, you want to take the items that together bring the most value. Which items should you take?

The data you need for this problem is the capacity of the knapsack, the capacity each item takes up, and the value of each item.
Below is the code that defines this problem:
import copy
import random
import numpy as np
from typing import List


class Knapsack():
    def __init__(self, knapsack_capacity: int, n_items: int = 20, item_values: list = None, item_capacities: list = None):
        self.name = 'knapsack'
        self.knapsack_capacity = knapsack_capacity
        if item_values is None and item_capacities is None:
            item_values, item_capacities = self._create_sample_data(n_items)
        self.item_values = item_values
        self.item_capacities = item_capacities
        self.n_items = len(item_values)

    def baseline_solution(self) -> list:
        # select random items until the knapsack is full
        capacity = 0
        solution = []
        while True:
            selected = random.choice([i for i in range(self.n_items) if i not in solution])
            if capacity + self.item_capacities[selected] > self.knapsack_capacity:
                break
            else:
                solution.append(selected)
                capacity += self.item_capacities[selected]
        return solution

    def score_solution(self, solution: list) -> int:
        # count the total value of this solution
        return -1 * sum([self.item_values[i] for i in solution])

    def select_neighbor(self, solution: list) -> list:
        # local move: remove / add / swap items
        solution_capacity = sum([self.item_capacities[i] for i in solution])
        possible_to_add = [i for i in range(self.n_items) if self.item_capacities[i] <= self.knapsack_capacity - solution_capacity and i not in solution]
        if len(solution) == 0:
            move = 'add'
        elif len(possible_to_add) > 0:
            move = np.random.choice(['remove', 'add', 'swap'], p=[0.1, 0.6, 0.3])
        else:
            move = np.random.choice(['remove', 'swap'], p=[0.4, 0.6])
        while True:
            if move == 'remove':
                solution.pop(random.randrange(len(solution)))
                return solution
            elif move == 'add':
                new_solution = copy.deepcopy(solution)
                new_item = random.choice(possible_to_add)
                new_solution.append(new_item)
                return new_solution
            elif move == 'swap':
                n = 0
                while n < 50:
                    new_solution = copy.deepcopy(solution)
                    in_item = random.choice([i for i in range(self.n_items) if i not in solution])
                    out_item = random.choice(range(len(solution)))
                    new_solution.pop(out_item)
                    new_solution.append(in_item)
                    n += 1
                    if self._is_feasible(new_solution):
                        return new_solution
                move = 'remove'

    def _create_sample_data(self, n_items: int) -> List[list]:
        item_values = random.sample(range(2, 1000), n_items)
        item_capacities = random.sample(range(1, self.knapsack_capacity), n_items)
        return item_values, item_capacities

    def _is_feasible(self, solution: list) -> bool:
        return sum([self.item_capacities[i] for i in solution]) <= self.knapsack_capacity
The baseline solution selects items at random until the knapsack is full. The solution score is the sum of the values of the items in the knapsack, multiplied by -1. This is necessary because the SA solver minimizes a given objective. There are three local moves possible: adding an item, removing an item, or swapping two items. This makes it possible to reach every possible solution in the solution space. If we swap an item, we need to check that the new solution is feasible.
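As an alternative to the random baseline, a greedy start based on value density (item value divided by item capacity) could also be used; a minimal sketch, not part of the class above:

def greedy_baseline(item_values: list, item_capacities: list, knapsack_capacity: int) -> list:
    # take items in order of value density while they still fit in the knapsack
    order = sorted(range(len(item_values)), key=lambda i: item_values[i] / item_capacities[i], reverse=True)
    solution, used_capacity = [], 0
    for i in order:
        if used_capacity + item_capacities[i] <= knapsack_capacity:
            solution.append(i)
            used_capacity += item_capacities[i]
    return solution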
In the next image you can see a sample run log. There are 10 items to choose from. The first line shows the item values, the second the capacities the items take, and the third the value densities (item value divided by item capacity). Then the solution process starts. The solution contains the index number(s) of the selected items. In the final solution, items 4, 5, and 8 are selected (counting starts at 0):

Example 3. Rastrigin Function
A function that is often used to test optimization algorithms is the Rastrigin function. In 3D it looks like this:

It has many local optima. The goal is to find the global minimum, which is at coordinate (0, 0). It is easier to see in a contour plot:

The landscape consists of many local optima with the highest ones in the four corners and the lowest ones in the center.
We can try to find the global minimum with simulated annealing. This problem is continuous instead of discrete, and we want to find the values for x and y that minimize the Rastrigin function.
The Rastrigin function is defined on an n-dimensional domain as:

f(x) = 10n + Σᵢ (xᵢ² - 10 cos(2π xᵢ)), with each xᵢ in the interval [-5.12, 5.12].
Let's try to find the optimum of the two-dimensional version: x and y span the domain and z is the function value, so the problem is exactly the one shown in the plots above.
import random
import numpy as np


class Rastrigin():
    def __init__(self, n_dims: int = 2):
        self.name = 'rastrigin'
        self.n_dims = n_dims

    def baseline_solution(self) -> list:
        solution = [random.uniform(-5.12, 5.12) for _ in range(self.n_dims)]
        return solution

    def score_solution(self, solution: list) -> float:
        score = self.n_dims * 10 + sum([(x**2 - 10*np.cos(2*np.pi*x)) for x in solution])
        return score

    def select_neighbor(self, solution: list, step_size: float = 0.1) -> list:
        perturbation = step_size * np.random.randn(self.n_dims)
        neighbor = solution + perturbation
        while not self._is_feasible(neighbor):
            perturbation = step_size * np.random.randn(self.n_dims)
            neighbor = solution + perturbation
        return neighbor

    def _is_feasible(self, solution: list) -> bool:
        # every coordinate needs to stay inside the domain [-5.12, 5.12]
        return all(-5.12 <= x <= 5.12 for x in solution)
For the baseline solution, we select random floats for x and y between -5.12 and 5.12. The score of the solution is equal to z (the outcome of the Rastrigin function). A neighbor is selected by taking a step in a random direction, with the step size set to 0.1. The feasibility check makes sure we stay inside the domain.
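A quick sanity check, assuming the Rastrigin class above is available: the score at the known optimum (all zeros) is exactly 0, while nearby integer coordinates are local minima with higher scores:

problem = Rastrigin(n_dims=2)
print(problem.score_solution([0.0, 0.0]))  # 0.0: the global minimum
print(problem.score_solution([1.0, 1.0]))  # approximately 2.0: a nearby local minimum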
A log of a run:

The final solution comes really close to the optimum.
But watch out: if you run the algorithm with more dimensions, it's not guaranteed that you will find the optimum:

As you can see, the final solution is a local optimum instead of the global one. It finds good coordinates for the first two variables, but the third one is equal to 0.985, which is far away from 0. It's important to verify the results you get. This specific example can be made to work by fine-tuning the SA parameters, but for more dimensions you should probably use another optimization technique that performs better.
Conclusion
In this post, we coded an implementation of simulated annealing. With three examples, you now have an idea of the different ways it can be used. To make it work for a new problem, you only need to implement three methods: baseline_solution, score_solution, and select_neighbor. Of course, this implementation is a basic one, and if you want to use it you need to tune the parameters of SA and make sure that the algorithm returns a feasible solution. A good initial solution and a careful choice of how neighbor solutions are selected can help a lot with final solution quality.
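For example, tuning could start from the arguments that run_sa already exposes; the values below are only an illustration:

problem = Rastrigin(n_dims=2)
sa = SimulatedAnnealing(problem)
final_solution = sa.run_sa(max_iterations=500000, update_iterations=50000, time_limit=120, cooling_schedule='exp')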
Thanks for reading, and see you next time!
Related
An Introduction to a Powerful Optimization Technique: Simulated Annealing
Mathematical Optimization Heuristics Every Data Scientist Should Know
Five ways to combine Mathematical Optimization and Machine Learning