Best Practices for Python Development

Author:Murphy  |  View: 25407  |  Time: 2025-03-23 19:52:32

The goal of this post is to share best practices for Python development, in particular how to set up, use, and manage a GitHub repository that adheres to professional industry standards. We will discuss useful tools to keep your code clean and bug-free, show how to set up the repository and include these tools in automated CI (continuous integration) checks, and finally put it all together in a sample project. Note that I am not claiming this list is complete or the only possible way to do these things. However, I want to share my professional experience from working as a software engineer, and can confirm that many large software companies follow a similar pattern.

With that said, let's dive straight into the meaty part – I hope you find this useful! You can find full code for this post here and follow along as we go.

Used Tools

In this section we'll present tools used in this article.

poetry

poetry is a neat tool for managing Python dependencies and virtual environments. It makes it easy to pin the Python version a project requires, and to manage dependencies in one central place. Of all the ways of doing this, I recommend poetry. I wrote a lengthier introduction about this in another post, but will summarise the gist here.

The core of poetry's dependency management is the pyproject.toml file. For our project, it starts like this:

[tool.poetry]
name = "Sample Python Project"
version = "0.1.0"
description = "Sample Python repository"
authors = ["hermanmichaels <[email protected]>"]

[tool.poetry.dependencies]
python = "3.10"
matplotlib = "3.5.1"
mypy = "0.910"
numpy = "1.22.3"
pytest = "7.1.2"
black = "22.3.0"
flake8 = "4.0.1"
isort = "^5.10.1"

We can see a header defining certain project properties, followed by a section listing the required dependencies.

As a "user", we just have to execute poetry install in our terminal, and poetry will automatically create a Python environment with all dependencies installed. We can then enter this via poetry shell.

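After adding a new dependency, developers run poetry update. This generates or updates the poetry.lock file, which you can picture as an exact snapshot of the dependencies specified above: a plain-text file recording the resolved version of every package, including transitive ones. Note that poetry update also bumps existing dependencies to the newest versions their constraints allow. The lock file needs to be added to the repository, too, because the installation process described above actually uses it.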
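To make the idea of pinned versions a bit more concrete, here is a minimal sketch that compares a set of pinned versions with what is actually installed in the current environment. The helper find_mismatches and the pins dictionary are hypothetical and only serve as an illustration; poetry resolves and installs the versions for you, and this is not how poetry works internally.

```python
from importlib import metadata
from typing import Callable, Dict, Optional, Tuple


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], str] = metadata.version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (pinned, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for name, pinned in pins.items():
        try:
            installed: Optional[str] = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Hypothetical pins, copied by hand from the pyproject.toml above.
pins = {"numpy": "1.22.3", "pytest": "7.1.2"}
for name, (pinned, installed) in find_mismatches(pins).items():
    print(f"{name}: pinned {pinned}, installed {installed}")
```

Run inside the environment created by poetry install, this should print nothing, since every installed version matches its pin.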

isort

PEP 8, the style guide for Python, also defines how to order imports. The recommendation is to create the following groups:

  1. Standard library imports (e.g. os, sys)
  2. Related third party imports (e.g. numpy)
  3. Local, project-specific imports (e.g. different files of the same project)

Inside these groups, imports are usually sorted alphabetically. PEP 8 itself only prescribes the grouping, but alphabetical ordering is a widespread convention.

isort is a tool that spares us from remembering and doing this ourselves. Conveniently, isort and most of the tools presented in the next sections can be configured in the same pyproject.toml file that poetry uses. For our use case, we set the following:

[tool.isort]
profile = "black"
py_version = 310
multi_line_output = 3

In addition to the Python version we tell isort that we will be working with the formatter black (see next section), and define how imports which are too long for a single line will be re-formatted.

black

black is a code formatter for Python. Running it formats the code according to certain conventions. By having all developers use it, we enforce a specific, uniform style on our code. Think of indentation, the number of blank lines after functions, etc.

Settings also live in the pyproject.toml file, and we simply set:

[tool.black]
line-length = 80
target_version = ["py310"]

That is, a maximum line length of 80, and the target Python version.

flake8

flake8 is a code linter. Code linters and code formatters are closely related; however, linters only check whether the code adheres to specific styles and guidelines, and do not reformat it. flake8 does several things, one of which is checking against the previously mentioned PEP 8 standard.
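As a small illustration, the following snippet runs fine but contains two issues a linter would typically report. F401 and E711 are real flake8 error codes; the function is_missing is a made-up example, and the messages in the comments are paraphrased.

```python
import os  # F401: 'os' imported but unused


def is_missing(value: object) -> bool:
    # E711: comparison to None should use 'is', not '=='
    return value == None


print(is_missing(None))  # True
```

Note that the code is still valid Python and behaves as expected here; the linter only points out style problems and likely mistakes, it does not change anything.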

mypy

mypy is a static type checker for Python. As you (most likely) know, Python is a dynamically typed language, meaning the types of variables are only determined and checked at runtime (as opposed to e.g. C++). This flexibility we all love also comes with drawbacks, such as a higher probability of making mistakes, without a compiler or similar to act as a first line of defence. Thus, in recent years much effort has gone into making type checking in Python stricter. mypy is such a type checker: it analyses your code without running it, and checks whether you are using variables consistently with their types. Much of this works automatically through type inference, but you can also make types explicit by annotating them (which is recommended anyway for function parameters and return types, for visibility).

We can annotate function arguments and return types as follows:

def foo(x: int, y: int) -> int:
    return x + y

mypy would then complain if we tried calling the function with wrong arguments, such as:

foo("a", "b")
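To see why a static check is useful here, consider what happens without it. The sketch below repeats the function from above; since Python does not enforce type hints at runtime, the wrong call silently succeeds. Running mypy on this file would instead report an error along the lines of: Argument 1 to "foo" has incompatible type "str"; expected "int".

```python
def foo(x: int, y: int) -> int:
    return x + y


# Type hints are not enforced at runtime, so this call runs without error
# and concatenates the two strings instead of adding two numbers.
result = foo("a", "b")
print(result)  # ab
```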

We manage mypy settings in a separate mypy.ini file. This is mostly needed because some external dependencies cannot be type-checked, and we need to exclude them from being checked (although we can fix some).

pytest

Unit testing is essential for any somewhat professional software project, and recommended for all. We will be using pytest, which is preferred by many Python developers. I wrote a lengthier introduction in another post, with some follow-ups, so I'd like to refer you there (or of course to any other great tutorial out there!) if you're not familiar with it.

Unit testing helps us catch bugs, and thus keep the quality of written code at a high level.
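For readers who have never seen a pytest test, here is a minimal sketch. The function clamp is a made-up example; the only pytest conventions assumed are the default discovery rules, i.e. test functions start with test_ and live in files named test_*.py or *_test.py, and checks are written as plain assert statements.

```python
def clamp(value: int, low: int, high: int) -> int:
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(value, high))


def test_clamp_inside_range() -> None:
    assert clamp(5, 0, 10) == 5


def test_clamp_outside_range() -> None:
    assert clamp(-3, 0, 10) == 0
    assert clamp(42, 0, 10) == 10
```

Running python -m pytest in the project folder collects both test functions and reports any failing assertion together with the values involved.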

GitHub Actions

GitHub Actions allows us to automate certain steps in the repository, all in the spirit of continuous integration. With it, we can create workflows that run on certain events, such as pull requests (PRs). The workflow we will use here is a combination of the tools introduced above: for every opened PR it will run formatting, linting, type checking and unit tests, and we expect all of this to pass before merging, thus protecting our main branch from any unclean or faulty code!

Also for this topic I would like to refer to a previous post of mine for an introduction.

Configuring the Repository

This post will not be an introduction to version control systems or setting up GitHub repositories from scratch. Instead, some basic knowledge is expected, and I would refer you to any other tutorial out there, such as the official GitHub one. Here we will only talk about the GitHub settings that basically any professional software repository will have.

Broadly speaking, there is only one: protecting the main branch. We don't want anybody to push to it without checks, and in particular require two things: approval from other developers, and passing of the CI tests we established. To do so, go to your repository and select "Settings", then "Branches":

Screenshot by author

Then add a branch protection rule for your main branch, and enable:

  • Require a pull request before merging
  • Require approval (you can then select the number of necessary approvals)
  • Require status checks to pass before merging

Putting it All Together

This concludes the introduction of all the topics we need. Now we will put it all together, set up a sample repository and show a workflow every developer should follow.

Sample Project

Our sample project will have a folder utils, containing math_utils.py and a related unit test file (math_utils_test.py). In math_utils we will be re-implementing an exponentiation function for demonstrative purposes:

import numpy.typing as npt


def exponentiate(base: int, exponent: npt.NDArray) -> npt.NDArray:
    # Element-wise power: NumPy broadcasts the scalar base over the array.
    return base**exponent

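Thus, exponentiate(2, np.array([1, 2, 3])) will return the array [2, 4, 8]. Note that the exponent has to be a NumPy array: a plain Python list does not support the ** operator.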
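A short usage sketch, assuming numpy is installed: the function is repeated so the snippet is self-contained, and np.asarray is used to turn a plain list into an array before passing it in.

```python
import numpy as np
import numpy.typing as npt


def exponentiate(base: int, exponent: npt.NDArray) -> npt.NDArray:
    return base**exponent


exponents = np.asarray([1, 2, 3])  # converts a plain list to an array
print(exponentiate(2, exponents))  # [2 4 8]
```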

We test the correctness of the function in the test file:

import numpy as np
import numpy.typing as npt
import pytest

from utils.math_utils import exponentiate

@pytest.mark.parametrize(
    "base, exponent, expected",
    [
        (2, np.zeros(3), np.ones(3)),
        (2, np.linspace(1, 4, 4), np.asarray([2, 4, 8, 16])),
    ],
)
def test_exponentiate(
    base: int, exponent: npt.NDArray, expected: npt.NDArray
) -> None:
    assert np.allclose(exponentiate(base, exponent), expected)

In our main file (main.py), we will use this to generate the first 10 powers of 2, and plot this with matplotlib:

import matplotlib.pyplot as plt
import numpy as np

from utils.math_utils import exponentiate

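def main() -> None:
    # Exponents 0..9, i.e. the first 10 powers of 2 (np.linspace(0, 10, 10)
    # would instead yield 10 non-integer points between 0 and 10).
    x = np.arange(10)
    y = exponentiate(2, x)
    plt.plot(x, y, "ro")
    plt.savefig("plot.png")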

if __name__ == "__main__":
    main()

The pyproject.toml file for this project looks as follows:

[tool.poetry]
name = "Sample Python Project"
version = "0.1.0"
description = "Sample Python repository"
authors = ["hermanmichaels <[email protected]>"]

[tool.poetry.dependencies]
python = "3.10"
matplotlib = "3.5.1"
mypy = "0.910"
numpy = "1.22.3"
pytest = "7.1.2"
black = "22.3.0"
flake8 = "4.0.1"
isort = "^5.10.1"

[tool.poetry.dev-dependencies]

[tool.black]
line-length = 80
target_version = ["py310"]

[tool.isort]
profile = "black"
py_version = 310
multi_line_output = 3

In addition, to prevent errors, we exclude matplotlib from mypy checking by creating the following mypy.ini file:

[mypy]
python_version = 3.10

[mypy-matplotlib.*]
ignore_missing_imports = True
ignore_errors = True

GitHub Workflow

We then define the following GitHub Actions workflow:

name: Sample CI Check

on:
  pull_request:
    branches: [main]
  push:
    branches: [main]

permissions:
  contents: read

jobs:
  build:
    runs-on: ubuntu-20.04

    steps:
      - uses: actions/checkout@v3

      - name: Set up Python 3.10.0
        uses: actions/setup-python@v3
        with:
          python-version: "3.10.0"

      - name: Install poetry dependencies
        run: |
          curl -sSL https://install.python-poetry.org | python3 -
          poetry install

      - name: Check import order with isort
        run: poetry run python -m isort --check-only --diff .

      - name: Check formatting with black
        run: poetry run python -m black --check --diff .

      - name: Lint with flake8
        run: poetry run python -m flake8 .

      - name: Check types with mypy
        run: poetry run python -m mypy .

      - name: Run unit tests
        run: poetry run python -m pytest

Thus, this workflow is run for every PR targeting main, and for every push to main (which includes merged PRs).

It has the following steps:

  • It checks out the repository.
  • It installs Python 3.10.
  • It installs poetry, and installs our dependencies.
  • It then runs all our installed checks (note poetry run X is equivalent to entering the poetry environment via poetry shell and then executing X). In particular, these are: check import order with isort, check formatting with black, lint with flake8, check types with mypy, and run pytest. On CI, isort and black run in check mode (--check-only and --check), so they fail the build on unformatted code instead of silently rewriting files in a throwaway checkout.

Local Developer Workflow

Now we describe the workflow every developer should follow regularly, and especially before raising a PR (sorry about the overloading of "workflow": in the section above it denotes the GitHub concept of grouping steps in a workflow, whereas here it simply describes a list of steps to be executed by the developer).

In essence, we don't want to rely on CI to find all of our errors, but push PRs that are as "clean" as possible: this simply means running all CI steps ourselves locally before pushing. This is achieved via:

  • run isort to sort the imports: isort .
  • run black to format the code: black .
  • run flake8 to check the code: python -m flake8
  • run mypy for type checking: mypy . (note this might take quite a while the first time)
  • run all unit tests: python -m pytest
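To avoid typing these commands one by one, the steps above can be wrapped in a small helper script. The following is a minimal sketch, not part of the sample project: the names CHECKS, run_command and run_checks are made up, and it assumes the script is started from within the poetry environment so that all tools are on the path.

```python
import subprocess
import sys
from typing import Callable, List, Sequence

# The same commands as in the list above, in the same order.
CHECKS: List[List[str]] = [
    ["isort", "."],
    ["black", "."],
    [sys.executable, "-m", "flake8"],
    ["mypy", "."],
    [sys.executable, "-m", "pytest"],
]


def run_command(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(command).returncode


def run_checks(
    checks: Sequence[Sequence[str]] = CHECKS,
    runner: Callable[[Sequence[str]], int] = run_command,
) -> int:
    """Run checks in order; stop at the first failure and return its code."""
    for command in checks:
        print("Running:", " ".join(command))
        exit_code = runner(command)
        if exit_code != 0:
            print("Failed:", " ".join(command))
            return exit_code
    return 0
```

Calling sys.exit(run_checks()) at the bottom of such a script makes its exit code reflect the first failing step. The runner parameter exists only so the logic can be tried out without actually invoking the tools.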

Conclusion

In this post we described useful tools to help manage, organize, and keep Python code in good shape and up to professional standards. We then showed how to set up a Git repository for versioning and sharing this code, and in particular how to use the previously introduced tools in CI: i.e. running certain checks to prevent any unclean or faulty commits to the main branch. Finally, we showed how developers can run all these tools locally first, to minimize the risk of CI failing.

I hope this post will be useful for your future private and professional projects. Let me know if there are any awesome tools you or your company are using, or you feel I missed something. Thanks for reading!

