Packaging

Questions

  • How to organize Python projects larger than one script?

  • What is a good file and folder structure for Python projects?

  • How can you make your Python functions most usable by your collaborators?

  • How to prepare your code to make a Python package?

  • How to publish your Python package?

Objectives

  • Learn to identify the components of a Python package

  • Learn to create a Python package

  • Learn to publish a Python package

Organizing Python projects

Python projects often start as a single script or Jupyter notebook, but they can outgrow a single file.

In the Scripts episode we have also learned how to import functions and objects from other Python files (modules). Now we will take it a step further.

Recommendations:

To have a concrete but still simple example, we will create a project consisting of 3 functions, each in its own file. We can then imagine that each file would contain many more functions. To make it more interesting, one of these functions will depend on an external library: scipy.

These are the 3 files:

adding.py
def add(x, y):
    return x + y
subtracting.py
def subtract(x, y):
    return x - y
integrating.py
from scipy import integrate


def integral(function, lower_limit, upper_limit):
    return integrate.quad(function, lower_limit, upper_limit)

We will add a fourth file:

__init__.py
"""
Example calculator package.
"""

from .adding import add
from .subtracting import subtract
from .integrating import integral

__version__ = "0.1.0"

This __init__.py file will be the interface of our package/library. It also holds the package docstring and the version string. Note how it imports functions from the various modules using relative imports (with the dot).
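To see what the relative imports in __init__.py achieve, here is a minimal, self-contained sketch. It writes a scipy-free copy of the package into a temporary folder, imports it, and calls the functions through the package interface. The package name demo_calculator and the helper build_demo_package are assumptions made for this illustration (the name is chosen so that it cannot clash with a real calculator package).

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package name, used only for this demonstration.
PACKAGE = "demo_calculator"

# Simplified copies of the files shown above (integrating.py is left out
# so that the sketch does not need scipy).
FILES = {
    "adding.py": "def add(x, y):\n    return x + y\n",
    "subtracting.py": "def subtract(x, y):\n    return x - y\n",
    "__init__.py": (
        '"""\nExample calculator package.\n"""\n\n'
        "from .adding import add\n"
        "from .subtracting import subtract\n\n"
        '__version__ = "0.1.0"\n'
    ),
}


def build_demo_package(root: Path) -> None:
    """Write the demo package files into root/PACKAGE."""
    package_dir = root / PACKAGE
    package_dir.mkdir()
    for filename, source in FILES.items():
        (package_dir / filename).write_text(source)


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    # Putting the folder on sys.path plays the role of "being in the project folder".
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        demo = importlib.import_module(PACKAGE)
    finally:
        sys.path.remove(tmp)

# Users only see the package interface, not the individual modules.
print(demo.add(2, 3))          # 5
print(demo.subtract(2, 3))     # -1
print(demo.__version__)        # 0.1.0
print(demo.__doc__.strip())    # Example calculator package.
```

Because __init__.py re-exports the functions, callers do not need to know which module a function lives in, so the internal file layout can change later without breaking their imports.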

This is how we will arrange the files in the project folder/repository:

project-folder
├── calculator
│   ├── adding.py
│   ├── __init__.py
│   ├── integrating.py
│   └── subtracting.py
├── LICENSE
└── README.md

Now we are ready to test the package. For this we need to be in the “root” folder, which we have called project-folder. We also need to have scipy available in our environment:

from calculator import add, subtract, integral

print("2 + 3 =", add(2, 3))
print("2 - 3 =", subtract(2, 3))
integral_x_squared, error = integral(lambda x: x * x, 0.0, 1.0)
print(f"{integral_x_squared = }")

The package is not yet pip-installable, though. We will make this possible in the next section.

Testing a local pip install

To make our example package pip-installable we need to add one more file:

project-folder
├── calculator
│   ├── adding.py
│   ├── __init__.py
│   ├── integrating.py
│   └── subtracting.py
├── LICENSE
├── README.md
└── pyproject.toml

This is how pyproject.toml looks:

pyproject.toml
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "calculator-myname"
description = "A small example package"
version = "0.1.0"
readme = "README.md"
authors = [
    { name = "Firstname Lastname", email = "firstname.lastname@example.org" }
]
dependencies = [
    "scipy"
]

Note how our package requires scipy and we decided to not pin the version here (see Version pinning for package creators).

Now we have all the building blocks to test a local pip install. This is a good test before trying to upload a package to PyPI or test-PyPI (see PyPI (The Python Package Index) and conda ecosystem).

Note

Sometimes you need to rely on unreleased, development versions as dependencies and this is also possible. For example, to use the latest xarray you could add:

dependencies = [
     "scipy",
     "xarray @ https://github.com/pydata/xarray/archive/main.zip"
]

See also

Exercise 1

Packaging-1

To test a local pip install:

  • Create a new folder outside of our example project

  • Create a new virtual environment (Dependency management)

  • Install the example package from the project folder into the new environment:

    pip install --editable /path/to/project-folder/
    
  • Test the local installation:

from calculator import add, subtract, integral

print("2 + 3 =", add(2, 3))
print("2 - 3 =", subtract(2, 3))
integral_x_squared, error = integral(lambda x: x * x, 0.0, 1.0)
print(f"{integral_x_squared = }")
  • Make a change in the subtract function above such that it always returns a float: return float(x - y).

  • Open a new Python console and test the following lines. Compare it with the previous output.

from calculator import subtract

print("2 - 3 =", subtract(2, 3))
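
To confirm that pip registered the package in the new environment, you can also query the installed metadata. This is a minimal sketch using the standard-library importlib.metadata module. The helper name installed_version is hypothetical, and the snippet prints a message instead of failing when the package is not installed.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


# "calculator-myname" is the name from pyproject.toml (adapt myname).
found = installed_version("calculator-myname")
if found is None:
    print("calculator-myname is not installed in this environment")
else:
    print(f"calculator-myname {found} is installed")
```

Note that the lookup uses the distribution name from pyproject.toml (calculator-myname), while the import statement uses the folder name (calculator). The two do not have to match.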

Sharing packages via PyPI

Demo

Most people will watch and observe this, due to the speed with which we will move.

Once we are able to pip-install the example package locally, we are ready for upload.

We practice by uploading to test-PyPI, not the real PyPI, so that if we mess things up, nothing bad happens.

We need two more things:

  • We will do this using Twine so you need to pip install that, too.

  • You need an account on test-PyPI

Let’s try it out. First we create the distribution package (this uses the build package, which you may need to install first with pip install build):

$ python3 -m build
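
By default, the build command writes a source distribution (a .tar.gz file) and a wheel (a .whl file) into a dist folder, and these are the files that dist/* refers to in the upload step. The following sketch shows one way to list what is in such a folder. The helper classify_distributions and the file names in the demonstration are hypothetical placeholders, not real build output.

```python
import tempfile
from pathlib import Path


def classify_distributions(dist_dir: Path) -> dict:
    """Group files in a dist/ folder into source distributions and wheels."""
    found = {"sdist": [], "wheel": []}
    for path in sorted(dist_dir.iterdir()):
        if path.name.endswith(".tar.gz"):
            found["sdist"].append(path.name)
        elif path.suffix == ".whl":
            found["wheel"].append(path.name)
    return found


# Demonstration with placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    dist = Path(tmp)
    (dist / "example-0.1.0.tar.gz").touch()
    (dist / "example-0.1.0-py3-none-any.whl").touch()
    result = classify_distributions(dist)

print(result)
```

In the project folder you would call classify_distributions(Path("dist")) after the build has finished.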

We need twine:

$ pip install twine

And use twine to upload the distribution files to test-PyPI:

$ twine upload -r testpypi dist/*

Uploading distributions to https://test.pypi.org/legacy/
Enter your API token:

Note

To generate an API token, proceed to the Create API token page in test-PyPI. You will be prompted for your password.

  1. Under Token name write something memorable. It should remind you of the purpose or the name of the computer, such that when you are done using it, you can safely delete it.

  2. Under Scope select Entire account (all projects).

  3. Click on Create token.

  4. Click on Copy token once a long string which starts with pypi- is generated.

Paste that token back into the terminal where twine upload ... is running and press ENTER.

Once this is done, create yet another virtual environment and try to install from test-PyPI (adapt myname).

 $ python3 -m venv venv-calculator
 $ source venv-calculator/bin/activate
 $ which python
 $ python3 -m pip install \
     -i https://test.pypi.org/simple/ \
     --extra-index-url https://pypi.org/simple/ \
     calculator-myname
 $ deactivate

Tools that simplify sharing via PyPI

The solution that we have used to create the example package (using setuptools and twine) is not the only approach. There are many ways to achieve this, and we have avoided going into too many details and comparisons to keep things simple. If you web-search this, you will also see that the recent trend is towards using pyproject.toml as a more general alternative to the previous setup.py.

There are at least two tools which try to make the packaging and PyPI interaction easier:

If you upload packages to PyPI or test PyPI often you can create an API token and save it in the .pypirc file.
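As an illustration of what such a file can contain, the sketch below renders a .pypirc entry for test-PyPI with the standard-library configparser module. It only prints the text and does not write to your home directory. The helper name render_pypirc and the placeholder token are assumptions. The repository URL is the one twine reports above, and __token__ is the username used together with API tokens.

```python
import configparser
import io


def render_pypirc(token: str) -> str:
    """Return the text of a .pypirc file with a test-PyPI entry."""
    # interpolation=None keeps any "%" characters in the token untouched.
    config = configparser.ConfigParser(interpolation=None)
    config["distutils"] = {"index-servers": "testpypi"}
    config["testpypi"] = {
        "repository": "https://test.pypi.org/legacy/",
        "username": "__token__",
        "password": token,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Placeholder value: replace it with the token copied from test-PyPI.
pypirc_text = render_pypirc("pypi-REPLACE-WITH-YOUR-TOKEN")
print(pypirc_text)
```

Since the file contains a secret, it is a good idea to make it readable only by you and to never commit it to a repository.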

Building a conda package and sharing it

Prerequisites

To generate a conda build recipe you need the package grayskull, and to build the package you need conda-build. You may install these with Anaconda Navigator or from the command line:

$ conda install -n base grayskull conda-build

The simplest way to create a conda package for your Python script is to first publish it on PyPI following the steps explained above.

Building a python package with grayskull and conda-build

Once built, the conda package can be installed locally. For this example, we will use runtest. runtest is a numerically tolerant end-to-end test library for research software.

  1. Generate the recipe by executing (grayskull or conda grayskull):

    $ conda grayskull pypi runtest
    

    The command above will create a new folder called runtest containing a file meta.yaml, the conda recipe for building the runtest package.

  2. View the contents of meta.yaml and check the requirements:

    requirements:
      host:
        - python
        - flit-core >=2,<4
        - pip
      run:
        - python
    

    In the requirements above, we specified what is required for the host and for running the package.

    Remark

    For pure python recipes, this is all you need for building a python package with conda. If your package needs to be built (for instance, compiled), you would need additional files, e.g. build.sh (to build on Linux/macOS) and bld.bat (to build on Windows systems). You can also add test scripts for testing your package. See the documentation.

  3. Build your package with conda

    Your package is now ready to be built with conda:

    $ conda build runtest
    

    Conda package location

    Look at the messages produced while building. The location of the local conda package is given (search for anaconda upload):

    /home/username/miniforge3/conda-bld/noarch/runtest-2.3.4-py_0.tar.bz2
    

    The prefix /home/username/miniforge3/ may be different on your machine, depending on your operating system (Linux, macOS or Windows). The sub-folder is named noarch since it is a pure-python package and the recipe indicates the same.

    If the package contained compiled code then the sub-folder would have been named win-64 or linux-64. It could then be converted to other platforms using conda convert.

  4. Check within new environment

    It is not necessary to create a new conda environment to install it but, as explained in the previous episode, it is good practice to have isolated environments.

    $ conda create -n local-runtest --use-local runtest
    

    We can then check that runtest has been successfully installed in the local-runtest conda environment. Open a new terminal with the local-runtest environment, either from the command line:

    $ conda activate local-runtest
    

    or via Anaconda Navigator (Open Terminal). Then import runtest and check its version:

    import runtest
    print(runtest.__version__)
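
The package path reported in step 3 follows the conda naming convention name-version-build inside a platform sub-folder. As a small illustration, the sketch below takes such a path apart. The helper name describe_conda_package is hypothetical, and the example path is the one shown above.

```python
from pathlib import PurePosixPath


def describe_conda_package(path: str) -> dict:
    """Split a conda package path into platform sub-folder, name, version and build."""
    package = PurePosixPath(path)
    stem = package.name
    # Strip the archive extension (.tar.bz2 or the newer .conda format).
    for extension in (".tar.bz2", ".conda"):
        if stem.endswith(extension):
            stem = stem[: -len(extension)]
            break
    # Split from the right, since only the package name may contain hyphens.
    name, version, build = stem.rsplit("-", 2)
    return {
        "subdir": package.parent.name,
        "name": name,
        "version": version,
        "build": build,
    }


info = describe_conda_package(
    "/home/username/miniforge3/conda-bld/noarch/runtest-2.3.4-py_0.tar.bz2"
)
print(info)
# {'subdir': 'noarch', 'name': 'runtest', 'version': '2.3.4', 'build': 'py_0'}
```

A sub-folder of noarch tells you that the same file can be installed on any platform.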
    

Building a conda package from scratch

It is possible to build a conda package from scratch without using conda grayskull. We recommend checking the conda-build documentation for more information.

To be able to share and install your local conda package anywhere (on other platforms), you would need to upload it to a conda channel (see below).

Publishing a python package

  • Upload your package to conda-forge: conda-forge is a conda channel. It contains a community-led collection of recipes, build infrastructure and distributions for the conda package manager. Anyone can publish conda packages to conda-forge if certain guidelines are respected.

  • Upload your package to bioconda: bioconda is a very popular channel for the conda package manager specializing in bioinformatics software. As for conda-forge, you need to follow their guidelines when building conda recipes.

You can also create your own conda channel for publishing your packages.

Keypoints

  • It is worth it to organize your code for publishing, even if only you are using it.

  • PyPI is a place for Python packages

  • conda is similar but is not limited to Python