Boxine - bx_py_utils

Various Python utility functions

Quickstart

pip install bx_py_utils

Existing stuff

This is only a brief list of the existing utilities. Please take a look at the sources and tests for more detailed information.

bx_py_utils.anonymize

anonymize() - Anonymize the given string with special handling for email addresses and the possibility to truncate the output.
anonymize_dict() - Returns a new dict with anonymized values for keys containing one of the given keywords.

bx_py_utils.auto_doc

FnmatchExclude() - Helper for the auto doc exclude_func that excludes files via fnmatch patterns.
assert_readme() - Check and update README file with generate_modules_doc()
assert_readme_block() - Check and update README file: Assert that "text_block" is present between the markers.
generate_modules_doc() - Generate a list of function/class information via pdoc.
get_code_location() - Return start and end line number for an object via inspect.

bx_py_utils.aws.client_side_cert_manager

ClientSideCertManager() - Helper to manage client-side TLS certificate via AWS Secrets Manager by

bx_py_utils.aws.secret_manager

SecretsManager() - Access AWS Secrets Manager values

bx_py_utils.compat

removeprefix() - Backport of removeprefix from PEP-616 (Python 3.9+)
removesuffix() - Backport of removesuffix from PEP-616 (Python 3.9+)
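
For readers on interpreters older than Python 3.9, the PEP 616 semantics behind these two helpers can be sketched with plain slicing. This is an illustration only: the `_sketch` names and the argument order are assumptions, not the library's code.

```python
def removeprefix_sketch(text: str, prefix: str) -> str:
    """Return text without the leading prefix, if it is present."""
    if prefix and text.startswith(prefix):
        return text[len(prefix):]
    return text


def removesuffix_sketch(text: str, suffix: str) -> str:
    """Return text without the trailing suffix, if it is present."""
    # The emptiness check matters: text[:-0] would return an empty string.
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text
```

Unlike `str.lstrip()` and `str.rstrip()`, which remove any of the given characters, both functions remove the given substring at most once.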

bx_py_utils.dict_utils

compare_dict_values() - Compare two dictionaries if values of the same keys are present and equal.
dict_get() - nested dict get()
dict_list2markdown() - Convert a list of dictionaries into a markdown table.
pluck() - Extract values from a dict, if they are present
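
As a rough illustration of what a nested dict lookup involves, here is a minimal sketch. The name `nested_get_sketch`, its signature and its fallback behaviour are hypothetical and may differ from dict_get(); check the sources for the real interface.

```python
from typing import Any

_MISSING = object()  # sentinel, so that stored None values are not mistaken for "missing"


def nested_get_sketch(data: dict, *keys: str, default: Any = None) -> Any:
    """Walk down nested dicts key by key and return default if the path breaks."""
    current: Any = data
    for key in keys:
        if not isinstance(current, dict):
            # The path continues, but the current value cannot be indexed by key.
            return default
        current = current.get(key, _MISSING)
        if current is _MISSING:
            return default
    return current
```

For example, `nested_get_sketch({'a': {'b': 1}}, 'a', 'b')` returns `1`, and a path that does not exist returns the default.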

bx_py_utils.doc_write

Doc-Write, see: https://github.com/boxine/bx_py_utils/blob/master/bx_py_utils/doc_write/README.md

bx_py_utils.environ

OverrideEnviron() - Context manager to change 'os.environ' temporarily.
cgroup_memory_usage() - Returns the memory usage of the cgroup the Python interpreter is running in.
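
The idea of changing 'os.environ' temporarily can be sketched with the standard library alone. The function below is a hypothetical stand-in for illustration; the keyword-argument interface is an assumption and not necessarily how OverrideEnviron() is called.

```python
import os
from contextlib import contextmanager


@contextmanager
def override_environ_sketch(**overrides: str):
    """Set environment variables inside the block and restore the old state afterwards."""
    # Remember the previous values (None means "was not set").
    saved = {key: os.environ.get(key) for key in overrides}
    os.environ.update(overrides)
    try:
        yield
    finally:
        for key, old_value in saved.items():
            if old_value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old_value
```

The `try`/`finally` ensures that the old values are restored even if the code inside the block raises.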

bx_py_utils.error_handling

print_exc_plus() - Print traceback information with a listing of all the local variables in each frame.

bx_py_utils.file_utils

EmptyFileError() - Will be raised from get_and_assert_file_size() if a 0-bytes file was found.
FileError() - Base error class for all 'file_utils' exceptions.
FileHasher() - Context manager to generate different hashes from file content while processing a file.
FileSizeError() - File size is not the same as the expected size.
NamedTemporaryFile2() - Generates a temp file with the given filename without any random name sequence.
OverlongFilenameError() - cut_filename() error: The file name cannot be shortened, because the stem is too short.
TempFileHasher() - File like context manager that combines NamedTemporaryFile2 and FileHasher.
cut_filename() - Shorten the file name (and keep the last suffix). Raise OverlongFilenameError if it can't fit.
get_and_assert_file_size() - Check file size of given file object. Raise EmptyFileError for empty files or return size
safe_filename() - Makes an arbitrary input suitable to be used as a filename.

bx_py_utils.filename_matcher

filename_matcher() - Enhanced fnmatch that accepts a list of patterns.
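
Matching a file name against several fnmatch patterns can be sketched as follows. `matches_any_sketch` is a hypothetical name, and the use of the case-sensitive `fnmatch.fnmatchcase()` is an assumption made to keep the result platform independent; filename_matcher() itself may behave differently.

```python
import fnmatch
from typing import Iterable


def matches_any_sketch(file_name: str, patterns: Iterable[str]) -> bool:
    """Return True if file_name matches at least one of the given patterns."""
    return any(fnmatch.fnmatchcase(file_name, pattern) for pattern in patterns)
```

With an empty pattern list the function returns False, because `any()` of an empty sequence is False.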

bx_py_utils.graphql_introspection

introspection_query() - Generate GraphQL introspection query with variable nested depth.

bx_py_utils.hash_utils

collect_hashes() - Get all hash values from a dictionary. Use hashlib.algorithms_available for key names.
compare_hashes() - Compare hashes from two dictionaries. Return DictCompareResult with the results.
url_safe_encode() - Encode bytes into a URL safe string.
url_safe_hash() - Generate a URL safe hash with max_size from given string/bytes.

bx_py_utils.html_utils

ElementsNotFoundError() - Happens if requested HTML elements cannot be found
InvalidHtml() - XMLSyntaxError with better error messages: used in validate_html()
get_html_elements() - Returns the selected HTML elements as string
pretty_format_html() - Pretty format given HTML document via BeautifulSoup (Needs 'beautifulsoup4' package)
validate_html() - Validate an HTML document via XMLParser (Needs 'lxml' package)

bx_py_utils.humanize.pformat

pformat() - Format given object: Try JSON first and fall back to pformat()

bx_py_utils.humanize.time

human_timedelta() - Converts a time duration into a friendly text representation.

bx_py_utils.import_utils

import_string() - Import a dotted module path and return the attribute/class designated by the last name in the path.

bx_py_utils.iteration

chunk_iterable() - Returns a generator that yields slices of iterable of the given chunk_size.
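
A generator of this kind can be sketched with `itertools.islice`. This is a minimal illustration under the assumption that each chunk is returned as a list; the type of the slices that chunk_iterable() actually yields is not stated here.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar('T')


def chunk_iterable_sketch(iterable: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks with at most chunk_size items each."""
    if chunk_size < 1:
        raise ValueError('chunk_size must be at least 1')
    iterator = iter(iterable)
    while True:
        # islice consumes up to chunk_size items from the shared iterator.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk
```

Because only one chunk is held in memory at a time, the approach also works for large or unbounded iterables.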

bx_py_utils.path

ChangeCurrentWorkDir() - Context manager to change the "CWD" to another directory.
MockCurrentWorkDir() - Context Manager to move the "CWD" to a temp directory.
assert_is_dir() - Check if given path is a directory
assert_is_file() - Check if given path is a file

bx_py_utils.processify

processify() - Decorator to run a function as a process.

bx_py_utils.pyproject_toml

get_pyproject_config() - Get a config section from "pyproject.toml". The path can optionally be specified.

bx_py_utils.rison

rison_dumps() - Encode as RISON, a URL-safe encoding format.

bx_py_utils.stack_info

FrameNotFound() - Base class for lookup errors.
last_frame_outside_path() - Returns the stack frame that is the direct successor of given "file_path".

bx_py_utils.string_utils

compare_sentences() - Calculates the Levenshtein distance between text1 and text2. With filter functionality.
ensure_lf() - Convert line endings to Unix style.
get_words() - Extract words from a text. With filter functionality.
is_uuid() - Returns True if text is a valid UUID (https://www.rfc-editor.org/rfc/rfc9562#name-uuid-format).
levenshtein_distance() - Calculates the Levenshtein distance between two strings.
startswith_prefixes() - >>> startswith_prefixes('foobar', prefixes=('foo','bar'))
truncate() - Truncates the given string to the given length
uuid_from_text() - Generate a UUID instance from the given text in a deterministic way via SHA224 hash.
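
For readers unfamiliar with the Levenshtein distance used by two of the functions above, here is a sketch of the classic dynamic-programming algorithm. It is a textbook version with the hypothetical name `levenshtein_sketch`, not necessarily the implementation used in this package.

```python
def levenshtein_sketch(a: str, b: str) -> int:
    """Return the minimal number of single-character edits that turn a into b."""
    if len(a) < len(b):
        # Keep the shorter string in b, so the rows stay as small as possible.
        a, b = b, a
    # previous[j] is the distance between the processed part of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            replace_cost = previous[j - 1] + (char_a != char_b)
            current.append(min(insert_cost, delete_cost, replace_cost))
        previous = current
    return previous[-1]
```

Only two rows of the distance matrix are kept, so memory use grows with the length of the shorter string.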

bx_py_utils.test_utils.assertion

assert_equal() - Check if the two objects are the same. Display a nice diff, using pformat()
assert_text_equal() - Check if the two text strings are the same. Display an error message with a diff.
pformat_ndiff() - Generate a ndiff from two objects, using pformat()
pformat_unified_diff() - Generate a unified diff from two objects, using pformat()
text_ndiff() - Generate a ndiff between two text strings.
text_unified_diff() - Generate a unified diff between two text strings.

bx_py_utils.test_utils.context_managers

MassContextManager() - A context manager / decorator that enters/exits a list of mocks.
MassContextManagerExceptions() - Common base class for all non-exit exceptions.

bx_py_utils.test_utils.datetime

parse_dt() - Helper to easily generate a datetime instance from a string.

bx_py_utils.test_utils.deny_requests

DenyAnyRealRequestContextManager() - Context manager that denies any request via socket/urllib3. Will raise DenyCallError.
deny_any_real_request() - Deny any request via socket/urllib3. Useful for tests, because they should mock all requests.

bx_py_utils.test_utils.filesystem_utils

FileWatcher() - Helper to record which new files have been created.

bx_py_utils.test_utils.log_utils

NoLogs() - Context manager to suppress all logger outputs
RaiseLogUsage() - A log handler that raises an error on every log output.

bx_py_utils.test_utils.mock_aws_secret_manager

SecretsManagerMock() - Mock for bx_py_utils.aws.secret_manager.SecretsManager()

bx_py_utils.test_utils.mock_boto3session

MockedBoto3Session() - Mock for boto3.session.Session()

bx_py_utils.test_utils.mock_uuid

MockUUIDGenerator() - Helper to mock uuid.uuid4() with reproducible results (e.g. for snapshot tests)
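
The general technique of making uuid.uuid4() reproducible in a test can be sketched with `unittest.mock`. The class `SequentialUUIDsSketch` is hypothetical and only illustrates the idea; the values that MockUUIDGenerator() produces and the way it is wired into a test may differ.

```python
import uuid
from unittest import mock


class SequentialUUIDsSketch:
    """Callable that returns UUIDs counting up from 1, so results are predictable."""

    def __init__(self) -> None:
        self.counter = 0

    def __call__(self) -> uuid.UUID:
        self.counter += 1
        return uuid.UUID(int=self.counter)


def make_two_ids() -> list:
    """Stand-in for code under test that calls uuid.uuid4()."""
    return [str(uuid.uuid4()), str(uuid.uuid4())]


# Replace uuid.uuid4 only inside the with block.
with mock.patch.object(uuid, 'uuid4', SequentialUUIDsSketch()):
    mocked_ids = make_two_ids()
```

With stable IDs like these, output that contains generated UUIDs can be compared against a stored snapshot.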

bx_py_utils.test_utils.mocks3

A simple mock for Boto3's S3 modules.

PseudoS3Client() - Simulates a boto3 S3 client object in tests

bx_py_utils.test_utils.redirect

RedirectOut() - Redirect stdout + stderr into a buffer (optionally stripping the output)
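
Capturing stdout and stderr in a test can be sketched with `contextlib` from the standard library. The helper `run_captured_sketch` is hypothetical, and writing both streams into one shared buffer is an assumption; RedirectOut() may expose the captured output differently.

```python
import contextlib
import io
import sys
from typing import Callable


def run_captured_sketch(func: Callable[[], None], strip: bool = True) -> str:
    """Call func() and return everything it wrote to stdout and stderr."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer), contextlib.redirect_stderr(buffer):
        func()
    output = buffer.getvalue()
    return output.strip() if strip else output


def noisy() -> None:
    """Example function that writes to both streams."""
    print('to stdout')
    print('to stderr', file=sys.stderr)
```

Here `run_captured_sketch(noisy)` returns both lines in the order they were written, without the trailing newline.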

bx_py_utils.test_utils.requests_mock_assertion

assert_json_requests_mock() - Check the requests mock history. In this case all requests must be JSON.
assert_json_requests_mock_snapshot() - Check requests mock history via snapshot. Accepts only JSON requests.
assert_requests_mock() - Check the requests mock history. Accept mixed "text" and "JSON".
assert_requests_mock_snapshot() - Check requests mock history via snapshot. Accept mixed "text" and "JSON".

bx_py_utils.test_utils.snapshot

Assert complex output via auto updated snapshot files with nice diff error messages.

SnapshotChanged() - Assertion failed.
assert_binary_snapshot() - Assert binary data via snapshot file
assert_html_snapshot() - Assert "html" string via snapshot file with validate and pretty format
assert_py_snapshot() - Assert complex python objects via PrettyPrinter() snapshot file.
assert_snapshot() - Assert given data serialized to JSON snapshot file.
assert_text_snapshot() - Assert "text" string via snapshot file
get_snapshot_file() - Generate a file path, using stack information to fill in path components that were not provided.

bx_py_utils.test_utils.time

MockTimeMonotonicGenerator() - Helper to mock time.monotonic() in tests.

bx_py_utils.test_utils.unittest_utils

BaseDocTests() - Helper to include all doctests in unittests, without changing the unittest setup. Just add a normal TestCase.
assert_no_flat_tests_functions() - Check whether there are normal test functions (which will not be executed by normal unittests)

bx_py_utils.test_utils.xlsx

FreezeXlsxTimes() - Context manager / decorator intended to freeze timestamps of xlsx files creation by e.g.: openpyxl.
generate_xlsx_md_snapshot() - Generate a markdown snapshot of an XLSX: Display ZIP info + Sheets content as Markdown.
xlsx2dict() - Convert an XLSX file content into a dictionary: Every sheet is a key, and the value is a list of dictionaries.
xlsx2markdown() - Convert all Sheets of an XLSX into markdown tables.

bx_py_utils.test_utils.zip_file_utils

FreezeZipFileDatetime() - Context manager / decorator to freeze the modification time of files written to a zip file.
zip_info() - Generates information similar to unzip -v: Yields ZipFileInfo for each file in the zip file.
zip_info_markdown() - Generates a markdown representation of the zip file content. Similar to unzip -v output.
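
The kind of per-file information that unzip -v prints is available from the standard `zipfile` module. The sketch below uses the hypothetical helper names `list_zip_sketch` and `build_demo_zip` and returns plain tuples; the fields of ZipFileInfo are not listed here and may differ.

```python
import io
import zipfile


def list_zip_sketch(zip_bytes: bytes) -> list:
    """Return (file name, uncompressed size, compressed size) for each zip entry."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zip_file:
        return [
            (info.filename, info.file_size, info.compress_size)
            for info in zip_file.infolist()
        ]


def build_demo_zip() -> bytes:
    """Create a small in-memory zip file with one entry, for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, 'w') as zip_file:
        zip_file.writestr('hello.txt', b'hello')
    return buffer.getvalue()
```

Calling `list_zip_sketch(build_demo_zip())` returns one tuple for `hello.txt` with an uncompressed size of 5 bytes.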

bx_py_utils.text_tools

cutout() - Mark a point in a long text by line no + column with context lines around.

Notes about snapshot

A quick hint about snapshots: if your project has many snapshots and a code change requires updating many of them, you can run the tests without a changed snapshot raising an error by setting RAISE_SNAPSHOT_ERRORS=0 in your environment.

e.g.:

RAISE_SNAPSHOT_ERRORS=0 python3 -m unittest

Renew all snapshot files with:

make update-test-snapshot-files
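
To make the effect of RAISE_SNAPSHOT_ERRORS easier to picture, here is a minimal sketch of an auto-updating text snapshot check. All names in it (`SnapshotChangedSketch`, `assert_text_snapshot_sketch`, `demo`) are hypothetical, and the details are assumptions: that a missing or outdated snapshot file is rewritten first, and that only the literal value "0" disables the error. The real functions in bx_py_utils.test_utils.snapshot may behave differently.

```python
import os
import tempfile
from pathlib import Path
from unittest import mock


class SnapshotChangedSketch(AssertionError):
    """Hypothetical stand-in for a "snapshot changed" assertion error."""


def raise_snapshot_errors() -> bool:
    # Assumption: errors are raised unless the variable is set to exactly "0".
    return os.environ.get('RAISE_SNAPSHOT_ERRORS', '1') != '0'


def assert_text_snapshot_sketch(snapshot_file: Path, got: str) -> None:
    """Compare got with the stored snapshot and update the file on a mismatch."""
    if snapshot_file.is_file() and snapshot_file.read_text(encoding='utf-8') == got:
        return
    # Missing or outdated snapshot: write the new content so the next run passes.
    snapshot_file.write_text(got, encoding='utf-8')
    if raise_snapshot_errors():
        raise SnapshotChangedSketch(f'Snapshot {snapshot_file} was updated')


def demo() -> tuple:
    """Run the sketch once with errors disabled and once with errors enabled."""
    with tempfile.TemporaryDirectory() as temp_dir:
        snapshot_file = Path(temp_dir) / 'example.snapshot.txt'
        with mock.patch.dict(os.environ, {'RAISE_SNAPSHOT_ERRORS': '0'}):
            assert_text_snapshot_sketch(snapshot_file, 'first')  # written silently
        with mock.patch.dict(os.environ, {'RAISE_SNAPSHOT_ERRORS': '1'}):
            try:
                assert_text_snapshot_sketch(snapshot_file, 'second')
            except SnapshotChangedSketch:
                raised = True
            else:
                raised = False
        return raised, snapshot_file.read_text(encoding='utf-8')
```

In this sketch the snapshot file is updated in both modes; the environment variable only decides whether the test run is interrupted by an error.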

Backwards-incompatible changes

v36 -> v37 - Outsourcing Django stuff

We split bx_py_utils and moved all Django-related utilities into a separate project:

https://github.com/boxine/bx_django_utils

As a result, bx_py_utils is easier to use in non-Django projects, because Django will not be installed as a dependency of "bx_py_utils".

Developing

To start developing, just run make install to create a .venv and install all needed packages. The only requirements are python3-venv and python3-pip (uv will be installed via pip into the .venv).