Stop Manually Sorting Your List In Python If Performance Is Concerned

At least for myself, most of the time when I code using Python, I am dealing with collection types. That includes the native Python List, Set, and Dictionary, as well as some third-party collection types such as Numpy Array and Tensorflow tensors. The latter are usually much faster because they are implemented using CPython or whatever C-related extensions.
However, sometimes it is too overwhelming to use them. Especially when we are developing an application that is irrelevant to Data Science, these libraries might be bulky to be involved just for better performance.
In this article, I'll introduce a library called Sorted Containers, which provides the best practice when a sorted collection type is needed. I'll cover the two major types from this library: Sorted List and Sorted Set. I will give some real-world examples to make it easier to understand.
Before everything, make sure you've installed the library using pip
.
pip install sortedcontainers
1. Sorted List – Real-Time Leaderboards

1.1 Plain Python Code Implementation
As mentioned, I'll use some real use cases to demonstrate the library rather than repeat the documentation.
Suppose we are developing a process that maintains a list of users with their scores. Whatever the score is about, it could be a game or something else.
Let's implement this feature without the Sorted Container to see what it looks like. Before everything, we need to create an existing list with names and scores in tuples.
Python">leaderboard = [('Alice', 50), ('Bob', 40), ('Chris', 60)]
Now, let's write a function to add a new name and score.
def add_score_plain(leaderboard, player, score):
leaderboard.append((player, score))
leaderboard.sort(key=lambda x: x[1], reverse=True)
The first line just appends the tuple to the list. However, we need to make sure the list is sorted based on the scores. The score is inside the tuples, so the easiest way to sort the list is to use a lambda function. In the lambda function, we sort the list based on the second element, and it should be in descending order, so reverse=True
.
Now, we need to output the leaderboard in a pretty format. The function below will do it.
def print_leaderboard_plain(leaderboard):
print("Leaderboard:")
for rank, (player, score) in enumerate(leaderboard, start=1):
print(f"{rank}. {player}: {score} points")
The little trick here is using the enumerate
to get the rank number. However, the leaderboard should start from 1 rather than 0, so we can set start=1
.
Then, let's add a new name and score and test the output as follows.
# Test code
add_score_plain(leaderboard, 'David', 55)
print_leaderboard_plain(leaderboard)

1.2 Using Sorted List
Now, let's have a look at how to implement the same features using Sorted Container. Specifically, we need to use the SortedList
module in this library.
Before we start, we need to understand the characteristics of the SortedList
module. Let's call the instance of SortedList
a "sorted list" to be easier.
- The sorted list will automatically sort all its items, including the one that is just added. However, the order is ascending.
- The sorting criteria are intuitive. It can be numbers or ASCII code of the string text. When the item is a tuple, only the first element will be considered.
Based on the above characteristics, we need to design our leaderboard in the following way.
from sortedcontainers import SortedList
leaderboard = SortedList([(-50, 'Alice'), (-40, 'Bob'), (-60, 'Chris')])
The above code will import the SortedList
and use it to initialise the sorted list.
To use the sorted list's characteristics, we need to put the score on the left. So, it will be used as the criteria for sorting the list. Also, we intentionally put a negative sign on the scores, so they will be sorted in a "descending" order. Of course, to make sure the future items can be added correctly, the add_score_sorted()
method needs to be as follows.
def add_score_sorted(leaderboard, player, score):
leaderboard.add((-score, player))
Please note that we don't need to add another step to sort the list as we did in the plain Python version, as the sorted list will guarantee that the items are in a sorted state.
Then, the print_leaderboard_sorted()
method also needs to take the negative sign into account, so the output will be reasonable and readable.
def print_leaderboard_sorted(leaderboard):
print("Leaderboard:")
for rank, (score, player) in enumerate(leaderboard, start=1):
print(f"{rank}. {player}: {-score} points")
The test code will be exactly the same as the plain Python version. Here is the output, which is also identical. No problem.

1.3 Evaluation of the Benefits
You may ask, what are the exact benefits of doing this? Indeed, the code of the sorted list version is one line shorter than the plain version, but it's not convincible as a sole benefit. Also, the trade-off in the sorted list version, such as the negative scores kind of reduced the readability.
In fact, one of the other major benefits is the performance. Let me show that.
Let's import the time
module in Python natively. Then, we can add some parameters for the performance tests.
import time
from sortedcontainers import SortedList
# Performance test parameters
random_scores = [(f"Player{i}", i % 10000) for i in range(10000)]
In the above code, we generated 10,000 random scores that will be inserted into the leaderboard list later.
Now, let's put the plain version and the sorted list version together, and capture their elapsed time when adding a new item.
# Plain Python List test
plain_leaderboard = [('Alice', 50), ('Bob', 40), ('Charlie', 60)]
start_time = time.time()
for player, score in random_scores:
add_score_plain(plain_leaderboard, player, score)
plain_time = time.time() - start_time
# SortedList test
sorted_leaderboard = SortedList([(-50, 'Alice'), (-40, 'Bob'), (-60, 'Charlie')])
start_time = time.time()
for player, score in random_scores:
add_score_sorted(sorted_leaderboard, player, score)
sorted_time = time.time() - start_time
The above code initialises the plain list and sorted list separately, then uses a for loop to add the items to the list.
Now, let's output the elapsed time as follows.
print(f"Plain list time: {plain_time:.4f} seconds")
print(f"SortedList time: {sorted_time:.4f} seconds")

It shows that the performance of a sorted list is 100x better than the plain code. In fact, we only tested ten thousand random scores. If there are more elements to be sorted, the difference in the performance will be even higher.
2. Sorted Set – Event Log Management System

Now, let's look at another example and leverage this example to see another important feature of the Sorted Container library. That is the Sorted Set.
Suppose we are developing an Event Log Management System. We have many timestamps to manage. Sorted Set is kind of similar to the Python Set. For example, it can contain unique values as well. However, the major difference is that it is sorted.
2.1 It's like a Python Set but not quite
Let's initialise a sorted set with some timestamps.
from sortedcontainers import SortedSet
event_timestamps = SortedSet([
"2024-08-19 10:00:00",
"2024-08-19 11:00:00",
"2024-08-19 09:30:00"
])
print(event_timestamps)

Please note that the output shows that the timestamps are sorted automatically, although we didn't initialise it in a sorted way.
Now, let's try to add two new timestamps.
event_timestamps.add("2024-08-19 09:30:00") # Duplicate, will be ignored
event_timestamps.add("2024-08-19 10:30:00") # New event, will be added
print(event_timestamps)

It can be seen that the duplicated timestamp was ignored, while the new timestamp was added as expected. Of course, the new timestamp is also in the right order.
2.2 Usage of the Sorted Set
We have already initialised a sorted set and demonstrated its characteristics. Now, let's have a look at some other convenient features that we can use.
Suppose we want to the next event timestamp after a given timestamp. Rather than compare the timestamp with every single element, we can use the bisect_right()
method from the sorted list. Here is the implementation of the method to get the next event.
def get_next_event(event_timestamps, current_time):
index = event_timestamps.bisect_right(current_time)
if index < len(event_timestamps):
return event_timestamps[index]
else:
return None
print("Next event after 10:25 is", get_next_event(event_timestamps, "2024-08-19 10:25:00"))
In the above code, we try to get the index of the target event timestamp that we are looking for. If the index is valid, we return the timestamp. Otherwise, return none.
Please note that the time stamp 2024–08–19 10:25:00
does not exist in the sorted set. The sorted set can handle such a scenario and give us the expected result without any issues.

Similarly, there is another bisect_left()
method to get the item before the given value.
Apart from that, the sorted set also allows us to get a range of items between two boundaries. That is achieved by the irange()
method.
events = event_timestamps.irange("2024-08-19 09:55:00", "2024-08-19 10:30:00")
print("Events between 09:30 and 11:00 are", list(events))
In the above code, we give the start time and end time as arguments. The irange()
method will return us a Python Iterator of the events we need. This is a very clever design because an iterator is a "lazy" data structure that will not consume much resource before we try to populate items from it.
In this demonstration, we can convert it into a list so that we can verify the results easily.

Summary

In this article, I have introduced a library called Sorted Containers. It is one of the most popular Python modules that handles data structures effectively. It offers way better performance than the Python native solution in some scenarios where we use Python List or other built-in collection types. Also, it unlocks some new ways to manage and interact with the items in a collection-type object. I hope this may help you during software/application development.