Creating a Python library
The following document provides several step to create a python library. This guide aims at showing the good practices in order to publish a python library compliying with nowadays standard.
Name and general information
The library is designed for research purpose and contain all the necessary function in order to implement the Xl-Sindy framework.
Library Name
The library is called py-xl-sindy. While conveying a clear and concise name, this library can be imported through :
1 |
|
Docs
One key component of a successful library is the need for a clear documentation. The documentation fo py-xl-sindy will be generated using sphinx hosted into the GithubPages of the repository. Using this framework enables a widely available documentation for any future user.
Spreading
The package will be released through PyPi. This means that a cautious releasing plan need to be elaborated. The classical main
release
develop
git branches plan will be used, in order to keep a clear track of any work done on the library.
Summary of git strategy
In the following part, an overview of good practice for managing git repository will be given.
The three main branch for a git repository are the following : (four are usually used in bigger project)
main
: Stable, production-ready branch.develop
: Branch where active development happens, contains code not yet released.Feature branches (feature/*
): Short-lived branches for specific features or fixes.- Release branches (
release/*
): Created just before a release for final testing and version adjustments, merged back into both main and develop.
This behavior integrates with GitHub Actions to automate updates to the documentation and deployment to PyPi.
Git action
- When work is done under
develop
:
1 |
|
The commit can be created like usual and developper can test the library in editable mode :
1 |
|
Editable mode is a really usufull feature, it symlink the library with global package from the python environment (or virtual environment). Any change to the library code is immediately reflected in the Python environment. Tips for Editable mode
- At the end of the work on the develop branch, it is time to create the
release/vX.Y.Z
branch :
1 |
|
Work can be finalized (edit the pyproject.toml
or setup.py
version ). It is also the time where test code can be executed (unfortunately at the time of the tutorial py-xl-sindy doesn’t have test code)
1 |
|
Finally developper can merge the code into the main
branch :
1 |
|
Three subsequent tasks need to be done :
- Tag the
main
branch in order to trigger github action into releasing to PyPi :
1 |
|
- Merge again
release/vX.Y.Z
intodevelop
:
1 |
|
- Ddelete
release/vX.Y.Z
:
1 |
|
This paragraph mention the X.Y.Z paradigm, it is nice to recall the bases of this type of versionning.
Github action
Github actions are scripts that launches automatically after an event (commit, tag, …). Usually they are used in repository in order to trigger automatically the release behavior : Compilation of binary, update of the documentation, Announce on social network,…
The following GitHub action directives are used in this library:
- Export a library through Pypi.
- Export the static web pages in GitHub Pages thanks to Sphinx.
During the development of the script for releasing Sphinx and the library on Pypi, GitHub action unveiled a really profond complexity. To not make this tutorial bigger the section about GitHub action has been written in this other article : Make a Pypi and Sphinx Publication GitHub Action.
Repositery folder structure
This library will follow the standard folder structure for a Python library.
1 |
|
Requirement
This library has been designed in order to be released on Pypi and becomming accessible to anyone. In this sense requirement file and project definition file need to be correctly managed.
Listed from the different .py
script, the library will need the following dependencies:
- Necessary :
sympy
: for symbolic expression managementnumpy
: for any matrix and math manipulationscipy
: for RK45 method and interpolation methodsklearn
: for linear regression model Lasso
- Optional :
matplotlib
: for rendering of exemple (maybe not necessary)
An optional requirement can be elaborated for the Optional package. However at the time of the writing of this guide, it has been decided to include the Optional package in order to get the full set of feature provided by the library.
Pyproject.toml requirement
So the pyproject.toml
file list everything we need for development and production purpose, in addition with project description.
At the time of this article the pyproject.toml
dependencies listing lis the following :
1 |
|
Any developper of the library can freely call the following command in order to install in editable mode the library :
1 |
|
Library DocString
When creating a library, it is really important to document every function. One the easiest way is to write “DocString” directly on each function. Multiple “Docstring” flavour are available, for this library the Google-docstring type has been choosen due to its ease of use.
The Google-docstring use markdown and incorporate all the feature needed for writing a clear documentation of python function.
In addition with DocString, developer can add Typing information in order to give to the user the input and output required by the function. Typing is not initally required by python (unlike C flavor code) but adding this information streamline the utilisation of a library.
Google docstring exemple
here follows an simple exemple of google docstring applied on a class. Everything is quite explanatory so no further detail will be given.
1 |
|
Sphinx the doc creator
This section detail the main step of creating a python library documentation using the Sphinx package.
First step is to create the base of the Sphinx documentation. After having installed the package one’s need to run :
1 |
|
Sphinx will prompt a short-tutorial in order to get the specification of the project.
The next step after this tutorial is to modify conf.py
in order to account for :
- working directory : add this
1 |
|
- autodoc extension name : insert this
1 |
|
- (optionnal) theme change : replace by this (to install theme you need to install optionnal python package)
1 |
|
After these configuration step, the autodoc file can be generated automatically using the apidoc command :
1 |
|
The autodoc files will parse every docstring from the python file during the generation of the Sphinx documentation.
The different tags do the following :
--no-toc
: Enable little library to skip the parent Table of Content--separate
: Separate every module in their own file (better readability)--no-headings
: Doesn’t provide any heading, title can be given directly by the doc string at the beginning of our file #Google-docstring-exemple However break the tree at the module level.--force
: Force overwrite every data made by the same command (modify every .rst file excluding index.rst)
After generating the autodoc file for every python module, the whole website can be generate through the make command.
1 |
|
Library testing
For testing a library during all the development it is common to use a second package emitter for test purpose.
It has been decided to use TestPyPi in order to test the library. The deyploment to TestPyPi is automatic for every new version push to the main
branch of the repository thanks to GitHub action workflow .
Install from a test package provider
The install is a little bit different from a normal pip command because we need to specify where our test package provider is located :
1 |
|
the -i
flag specifies the address of the test package provider and the --extra-index-url
specifies the fallback for dependencies.
Library formatting
A wide range of tool are available to format a library. The most common one is called black
and enable automatic formatting. On the py-xl-library
, I have decided to add black
in the dev dependencies
in order to check the package each time before release.
to automatically format a file it is really easy. only one command is necessary :
1 |
|
Actual work done
After a pause of three months from this section of the project, it was necessary to get used again with the codebase. This process highlighted the importance of precision and rigor in type annotations and docstring documentation to ensure clarity and maintainability.
Revisiting the code began with addressing the most fundamental functions. While these functions appear simple, they posed significant challenges in formatting due to their role in establishing the foundational formalism for more complex functionality.
Additionally, initial attempts at Git management revealed its complexity. Managing the first commit, in particular, required considerable effort and posed unexpected challenges.
Code work
Let’s list some task done on the code :
- Rename every function following PEP8 formatting.
- Added docstring to every function.
- Added clear typing to every function.
- Clear explanation in the exemple.
- Added
README.md
and same inside the Sphinx documentation. - Clear unused function.