
A collection of Spark data processing functions for use at Statistics Norway.

pip install ssb-spark-tools==0.1.12


SSB Spark Tools

A collection of Spark data processing functions for use at Statistics Norway (SSB).


The SSB Spark Tools library is a collection of data processing functions for use in data processing at Statistics Norway.


pip install ssb-spark-tools
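
After installing, one way to confirm which version ended up in the active environment is to query the package metadata. The following is a minimal sketch using only the Python standard library; the helper name installed_version is hypothetical and is not part of ssb-spark-tools.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "ssb-spark-tools") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is an illustrative helper, not an ssb-spark-tools API.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    print(installed_version())
```

Returning None instead of raising keeps the check usable in setup scripts that only need to decide whether to run pip install.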

Development setup

This repo uses Poetry for dependency management and publishing to PyPI. Install Poetry as described on the Poetry install page.

poetry install                 # Install required tools for build/dev
poetry run pytest              # Run tests
poetry build                   # Build dist
poetry publish                 # Publish to PyPI


Run tests for all Python distributions using GitHub Actions, see


Prerequisites: You will need to register accounts on PyPI and TestPyPI.

Before releasing:

  • Make sure you're working on a "new" version number.
  • Make sure to update release notes.
  • Make sure the GitHub repo has a secret named PYPI_API_TOKEN that contains the PyPI access token.
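
The first check in the list above can be sketched as a small script. This is an illustration under stated assumptions, not part of the repo: the helper names are hypothetical, versions are assumed to be purely numeric and dot-separated (such as 0.1.12), and the list of already published versions is supplied by the caller, for example copied from the PyPI release history.

```python
from typing import Iterable, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as '0.1.12' into (0, 1, 12).

    Assumption: no pre-release or local suffixes (e.g. '1.0rc1') are used.
    """
    return tuple(int(part) for part in text.strip().split("."))


def is_new_version(candidate: str, published: Iterable[str]) -> bool:
    """Return True if candidate is strictly greater than every published version."""
    candidate_key = parse_version(candidate)
    return all(candidate_key > parse_version(v) for v in published)
```

For example, is_new_version("0.1.13", ["0.0.1", "0.1.12"]) is true, while "0.1.9" is rejected against the same list because the components are compared as numbers rather than as strings.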

To release and publish a new version to PyPI:

  • Create a new release in the GitHub repo.
  • The Upload Python Package GitHub Action will then start and publish the new version to PyPI.


poetry publish

For a dress rehearsal, you can do a test release to the TestPyPI index. TestPyPI is very useful, as you can try all the steps of publishing a package without any consequences if you mess up. Read more about TestPyPI here.

You should see the new release appearing here (it might take a couple of minutes for the index to update).

Release History

  • 0.0.1
    • Initial version with the functions in use at initiation


Statistics Norway –

Distributed under the MIT license. See LICENSE for more information.


  1. Fork it (
  2. Create your feature branch (git checkout -b feature/fooBar)
  3. Commit your changes (git commit -am 'Add some fooBar')
  4. Push to the branch (git push origin feature/fooBar)
  5. Create a new Pull Request