Packaging

Questions

  • How to organize Python projects larger than one script?

  • What is a good file and folder structure for Python projects?

  • How can you make your Python functions most usable by your collaborators?

  • How to prepare your code to make a Python package?

  • How to publish your Python package?

Objectives

  • Learn to identify the components of a Python package

  • Learn to create a Python package

  • Learn to publish a Python package

Organizing Python projects

Python projects often start as a single script or Jupyter notebook, but they can outgrow a single file.

In the Scripts episode we learned how to import functions and objects from other Python files (modules). Now we will take it a step further.

Recommendations:

To have a concrete but still simple example, we will create a project consisting of 3 functions, each in its own file. We can then imagine that each file would contain many more functions. To make it more interesting, one of these functions will depend on an external library: scipy.

These are the 3 files:

adding.py
def add(x, y):
    return x + y
subtracting.py
def subtract(x, y):
    return x - y
integrating.py
from scipy import integrate


def integral(function, lower_limit, upper_limit):
    return integrate.quad(function, lower_limit, upper_limit)

We will add a fourth file:

__init__.py
"""
Example calculator package.
"""

from .adding import add
from .subtracting import subtract
from .integrating import integral

__version__ = "0.1.0"

This __init__.py file will be the interface of our package/library. It also holds the package docstring and the version string. Note how it imports functions from the various modules using relative imports (with the dot).
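
To see how __init__.py acts as the interface, here is a minimal sketch that writes a scipy-free copy of the package into a temporary folder and imports it. The package name calculator_demo and the helper write_package are made up for this illustration, and the integral function is left out so that the sketch runs without scipy.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package name, chosen to avoid clashing with a real "calculator".
PACKAGE = "calculator_demo"

# File contents mirror the example above, minus the scipy-dependent module.
FILES = {
    "adding.py": "def add(x, y):\n    return x + y\n",
    "subtracting.py": "def subtract(x, y):\n    return x - y\n",
    "__init__.py": (
        '"""Example calculator package (scipy-free demo)."""\n'
        "from .adding import add\n"
        "from .subtracting import subtract\n"
        '__version__ = "0.1.0"\n'
    ),
}


def write_package(root: Path) -> None:
    """Write the demo package files into root/PACKAGE."""
    package_dir = root / PACKAGE
    package_dir.mkdir()
    for filename, source in FILES.items():
        (package_dir / filename).write_text(source)


with tempfile.TemporaryDirectory() as tmp:
    write_package(Path(tmp))
    sys.path.insert(0, tmp)  # stands in for "being in the project root"
    try:
        importlib.invalidate_caches()
        demo = importlib.import_module(PACKAGE)
        # Both names are reachable from the package itself because
        # __init__.py re-exports them with relative imports.
        total = demo.add(2, 3)
        difference = demo.subtract(2, 3)
        version = demo.__version__
    finally:
        sys.path.remove(tmp)

print(total, difference, version)  # prints: 5 -1 0.1.0
```

Users of the package only need to know the names exported by __init__.py, not which module each function lives in.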

This is how we will arrange the files in the project folder/repository:

project-folder
├── calculator
│   ├── adding.py
│   ├── __init__.py
│   ├── integrating.py
│   └── subtracting.py
├── LICENSE
└── README.md

Now we are ready to test the package. For this we need to be in the “root” folder, which we have called project-folder. We also need to have scipy available in our environment:

from calculator import add, subtract, integral

print(add(2, 3))
print(subtract(2, 3))
print(integral(lambda x: x * x, 0.0, 1.0))  # a tuple: (integral estimate, error estimate)

The package is not yet pip-installable, though. We will make this possible in the next section.

Testing a local pip install

To make our example package pip-installable we need to add one more file:

project-folder
├── calculator
│   ├── adding.py
│   ├── __init__.py
│   ├── integrating.py
│   └── subtracting.py
├── LICENSE
├── README.md
└── pyproject.toml

This is how pyproject.toml looks:

pyproject.toml
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "calculator-myname"
description = "A small example package"
version = "0.1.0"
readme = "README.md"
authors = [
    { name = "Firstname Lastname", email = "firstname.lastname@example.org" }
]
dependencies = [
    "scipy"
]

Note how our package requires scipy and that we decided not to pin the version here (see Version pinning for package creators).

Now we have all the building blocks to test a local pip install. This is a good test before trying to upload a package to PyPI or test-PyPI (see PyPI (The Python Package Index) and (Ana)conda).

Exercises 1

Packaging-1

To test a local pip install:

  • Create a new folder outside of our example project

  • Create a new virtual environment (Dependency management)

  • Install the example package from the project folder into the new environment: $ pip install /path/to/project-folder/

  • Test the local installation:

from calculator import add, subtract, integral

print(add(2, 3))
print(subtract(2, 3))
print(integral(lambda x: x * x, 0.0, 1.0))
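
Besides importing the package, you can ask the environment what it has installed. The sketch below uses the standard-library importlib.metadata module. Note that it expects the distribution name from pyproject.toml (calculator-myname), not the import name (calculator). The helper names installed_version and declared_requirements are made up for this illustration.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


def declared_requirements(distribution_name):
    """Return the dependency strings recorded at install time (may be empty)."""
    try:
        return metadata.requires(distribution_name) or []
    except metadata.PackageNotFoundError:
        return []


# After a successful local install this should report the version from
# pyproject.toml; in an environment without the package it prints None.
print(installed_version("calculator-myname"))
print(declared_requirements("calculator-myname"))
```

If the first line prints None inside the new environment, the pip install step did not succeed there.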

Sharing packages via PyPI

Once we are able to pip-install the example package locally, we are ready for upload.

We practice by uploading to test-PyPI, not the real PyPI, so that if we mess things up, nothing bad happens.

We need two more things:

  • We will upload using Twine, so you need to pip install it, too.

  • You need an account on test-PyPI.

Let’s try it out. First we create the distribution package (this requires the build package, which you can install with pip install build):

$ python3 -m build
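
The build step writes its results into a dist folder: normally a source distribution (.tar.gz) and a wheel (.whl). As a small sketch, the function below sorts the files in such a folder by kind. The function name classify_distributions and the file names in the demo are hypothetical; the exact names produced on your machine depend on your project name and tool versions.

```python
import tempfile
from pathlib import Path


def classify_distributions(dist_dir):
    """Group the files in a dist folder into sdists, wheels and anything else."""
    found = {"sdist": [], "wheel": [], "other": []}
    for path in sorted(Path(dist_dir).iterdir()):
        if path.name.endswith(".tar.gz"):
            found["sdist"].append(path.name)
        elif path.suffix == ".whl":
            found["wheel"].append(path.name)
        else:
            found["other"].append(path.name)
    return found


# Demo on a temporary folder with empty placeholder files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    for name in (
        "calculator_myname-0.1.0.tar.gz",
        "calculator_myname-0.1.0-py3-none-any.whl",
    ):
        (Path(tmp) / name).touch()
    result = classify_distributions(tmp)

print(result)
```

Running classify_distributions("dist") in the project folder after the build is a quick check that both files exist before uploading.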

We need twine:

$ pip install twine

And use twine to upload the distribution files to test-PyPI:

$ twine upload -r testpypi dist/*

Uploading distributions to https://test.pypi.org/legacy/
Enter your username:
Enter your password:

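Once this is done, create yet another virtual environment and try to install from test-PyPI (adapt “myname”; if pip cannot find scipy on test-PyPI, add --extra-index-url https://pypi.org/simple/ so that dependencies come from the real PyPI):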

$ pip install -i https://test.pypi.org/simple/ calculator-myname

Tools that simplify sharing via PyPI

The solution that we have used to create the example package (using setuptools and twine) is not the only approach. There are many ways to achieve this, and we have avoided detailed comparisons to keep things simple. If you search the web, you will also see that the trend is towards using pyproject.toml as a more general alternative to the older setup.py.

There are at least two tools which try to make the packaging and PyPI interaction easier:

Building a conda package and sharing it

Demo

Most people will watch and observe this, due to the speed at which we will move.

Prerequisites

To create a conda package, the conda-build package is required. You may install it with Anaconda Navigator or from the command line:

$ conda install conda-build

The simplest way to create a conda package for your Python script is to first publish it on PyPI, following the steps explained above.

Building a python package with conda skeleton pypi

Once built, the conda package can be installed locally. For this example, we will use runtest, a numerically tolerant end-to-end test library for research software.

  1. Create pypi skeleton:

    $ conda skeleton pypi runtest
    

    The command above will create a new folder called runtest containing a file meta.yaml, the conda recipe for runtest.

  2. Edit meta.yaml and update requirements:

    requirements:
      host:
        - pip
        - python
        - flit
      run:
        - python
        - flit
    

    In the requirements above, we specified what is required for the host and for running the package.

    Remark

    For pure Python recipes, this is all you need for building a Python package with conda. If your package needs a build step (for instance, compilation), you would need additional files, e.g. build.sh (to build on Linux/Mac-OSX) and bld.bat (to build on Windows systems). You can also add test scripts for testing your package. See documentation.

  3. Build your package with conda

    Your package is now ready to be built with conda:

    $ conda-build runtest
    

    Conda package location

    Look at the messages produced while building. The location of the local conda package is given (search for anaconda upload):

    ~/anaconda3/conda-bld/win-64/runtest-2.2.1-py38_0.tar.bz2
    

    The prefix ~/anaconda3/ may be different on your machine and depending on your operating system (Linux, Mac-OSX or Windows) the sub-folder win-64 differs too (for instance linux-64 on Linux machines).

    The conda package we have created is specific to your platform (here win-64). It can be converted to other platforms using conda convert.

  4. Check within new environment

    It is not necessary to create a new conda environment to install it but, as explained in the previous episode, it is good practice to have isolated environments.

    $ conda create -n local-runtest --use-local runtest
    

    We can then check that runtest has been successfully installed in the local-runtest conda environment. Open a new terminal with the local-runtest environment, either from the command line:

    $ conda activate local-runtest
    

    or via Anaconda Navigator (Open Terminal). Then import runtest and check its version:

    import runtest
    print(runtest.__version__)
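
The package file name reported by conda-build in step 3 follows conda's name-version-build pattern, so the version you see after importing can be compared with the one in the file name. The sketch below splits such a file name into its three parts; the function name parse_conda_filename is made up for this illustration.

```python
def parse_conda_filename(path):
    """Split a conda package file name into (name, version, build string)."""
    # Keep only the last path component, accepting both / and \ separators.
    filename = path.replace("\\", "/").rsplit("/", 1)[-1]
    for extension in (".tar.bz2", ".conda"):
        if filename.endswith(extension):
            stem = filename[: -len(extension)]
            break
    else:
        raise ValueError(f"not a conda package file name: {filename!r}")
    # Version and build string never contain "-", so split from the right;
    # the package name itself may contain hyphens.
    name, version, build = stem.rsplit("-", 2)
    return name, version, build


parts = parse_conda_filename(
    "~/anaconda3/conda-bld/win-64/runtest-2.2.1-py38_0.tar.bz2"
)
print(parts)  # prints: ('runtest', '2.2.1', 'py38_0')
```

Here the build string py38_0 indicates a build for Python 3.8, which is one reason the package is specific to the environment it was built for.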
    

Building a conda package from scratch

It is possible to build a conda package from scratch without using conda skeleton. We recommend checking the conda-build documentation for more information.

To be able to share and install your local conda package anywhere (on other platforms), you would need to upload it to a conda channel (see below).

Publishing a python package

  • Upload your package to Anaconda.org: see instructions here. Please note that you will have to create an account on Anaconda.

  • Upload your package to conda-forge: conda-forge is a conda channel that contains a community-led collection of recipes, build infrastructure and distributions for the conda package manager. Anyone can publish conda packages to conda-forge if certain guidelines are respected.

  • Upload your package to bioconda: bioconda is a very popular channel for the conda package manager specializing in bioinformatics software. As for conda-forge, you need to follow their guidelines when building conda recipes.

You can also create your own conda channel for publishing your packages.

Keypoints

  • It is worth it to organize your code for publishing, even if only you are using it.

  • PyPI is a place for Python packages

  • conda is similar but is not limited to Python