Python 2/3 unicode CSV compatibility layer


Keywords
unicode, csv, reader, writer
License
MIT
Install
pip install csv23==0.3.4

csv23

csv23 provides the unicode-based API of the Python 3 csv module for Python 2 and 3. Code that should run under both versions of Python can use it to hide the bytes vs. text difference between 2 and 3 and stick to the newer unicode-based interface.

It uses utf-8 as default encoding everywhere.

Improvements

csv23 works around the following bugs in the stdlib csv module:

bpo-12178
broken round-trip with escapechar if your data contains a literal escape character (fixed in Python 3.10)
bpo-31590
broken round-trip with escapechar and embedded newlines under Python 2 (fixed in Python 3.4 but not backported): csv23 produces a warning in this case
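
To see whether a given interpreter is affected by the first issue, a round-trip check along the following lines may help. It is a minimal sketch using only the stdlib csv module; the helper name roundtrips() is hypothetical and not part of csv23, and the result depends on the Python version in use.

```python
import csv
import io


def roundtrips(row, **fmtparams):
    """Write one row with the stdlib csv module, read it back, and compare.

    Hypothetical helper for illustration only, not part of csv23.
    """
    buf = io.StringIO(newline='')
    csv.writer(buf, **fmtparams).writerow(row)
    buf.seek(0)
    return next(csv.reader(buf, **fmtparams)) == row


# The first field contains a literal escape character (a backslash).
# Whether this prints True or False depends on the interpreter version.
ok = roundtrips(['spam\\eggs', 'ham'], escapechar='\\', quoting=csv.QUOTE_NONE)
print(ok)
```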

Extras

The package also provides some convenience functionality such as the open_csv() context manager for opening a CSV file in the right mode and returning a csv.reader or csv.writer:

>>> import csv23

>>> with csv23.open_csv('spam.csv') as reader:  # doctest: +SKIP
...     for row in reader:
...         print(', '.join(row))
Spam!, Spam!, Spam!
Spam!, Lovely Spam!, Lovely Spam!
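
For comparison, "the right mode" on Python 3 roughly corresponds to opening the file in text mode with an explicit encoding and newline=''. The following is a stdlib-only sketch of that idea, not the implementation of open_csv(); the temporary file and the sample rows are made up for illustration.

```python
import csv
import io
import os
import tempfile

rows = [['Spam!', 'Spam!', 'Spam!'],
        ['Spam!', 'Lovely Spam!', 'Lovely Spam!']]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'spam.csv')

    # Text mode, explicit encoding, and newline='' so that the csv module
    # handles line endings itself.
    with io.open(path, 'w', encoding='utf-8', newline='') as f:
        csv.writer(f).writerows(rows)

    with io.open(path, 'r', encoding='utf-8', newline='') as f:
        lines = [', '.join(row) for row in csv.reader(f)]

for line in lines:
    print(line)
```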

Python 3 Extras

The read_csv() and write_csv() functions (available on Python 3 only) are most useful if you want (or need) to open a file-like object in the calling code, e.g. when reading from or writing directly to a binary stream such as a ZIP file controlled by the caller (emulated with an io.BytesIO below):

>>> import io
>>> buf = io.BytesIO()

>>> import zipfile
>>> with zipfile.ZipFile(buf, 'w') as z, z.open('spam.csv', 'w') as f:
...     csv23.write_csv(f, [[1, None]], header=['spam', 'eggs'])
<zipfile...>

>>> buf.seek(0)
0

>>> with zipfile.ZipFile(buf) as z, z.open('spam.csv') as f:
...     csv23.read_csv(f, as_list=True)
[['spam', 'eggs'], ['1', '']]

csv23 internally wraps the byte stream in an io.TextIOWrapper with the given encoding and newline='' (see csv module docs).
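
The wrapping described above can be sketched with the stdlib alone. The helpers write_rows() and read_rows() below are hypothetical names for illustration; they show the general technique under the stated assumptions (utf-8, newline=''), not the actual csv23 code.

```python
import csv
import io


def write_rows(binary_stream, rows, encoding='utf-8'):
    """Write rows as CSV text into a binary stream (hypothetical helper)."""
    # newline='' disables newline translation so that the csv module
    # controls the line endings.
    text = io.TextIOWrapper(binary_stream, encoding=encoding, newline='')
    csv.writer(text).writerows(rows)
    text.flush()
    # Detach so that the wrapper does not close the caller's stream.
    text.detach()
    return binary_stream


def read_rows(binary_stream, encoding='utf-8'):
    """Read all CSV rows from a binary stream (hypothetical helper)."""
    text = io.TextIOWrapper(binary_stream, encoding=encoding, newline='')
    rows = list(csv.reader(text))
    text.detach()
    return rows


buf = io.BytesIO()
write_rows(buf, [['spam', 'eggs'], ['1', 'sp\u00e4m']])
raw = buf.getvalue()  # encoded bytes with '\r\n' line endings

buf.seek(0)
rows = read_rows(buf)
```

Detaching the wrapper leaves the underlying stream open, which matters when the caller controls its lifetime, as in the ZIP file example.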

The write_csv() function also supports updating objects that have an .update(<bytes>) method, such as hashlib.new() instances. This makes it possible to calculate a checksum over the binary CSV output produced from the given rows without writing it to disk (note that the object is returned):

>>> import hashlib

>>> csv23.write_csv(hashlib.new('sha256'), [[1, None]], header=['spam', 'eggs']).hexdigest()
'aed6871f9ca7c047eb55a569e8337af03fee508521b5ddfe7ad0ad1e1139980a'
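
The underlying idea can be sketched with the stdlib: serialize each row to text, encode it, and pass the bytes to .update(). The function hash_csv() is a hypothetical illustration, not how csv23 implements this, and it assumes the default csv dialect and utf-8.

```python
import csv
import hashlib
import io


def hash_csv(rows, hash_obj, encoding='utf-8'):
    """Feed the encoded CSV form of rows to hash_obj.update() (hypothetical)."""
    for row in rows:
        line = io.StringIO(newline='')
        csv.writer(line).writerow(row)
        hash_obj.update(line.getvalue().encode(encoding))
    # Return the object so that calls can be chained, e.g. .hexdigest().
    return hash_obj


digest = hash_csv([['spam', 'eggs'], [1, None]], hashlib.sha256()).hexdigest()

# Same digest as hashing the complete CSV output in one go.
expected = hashlib.sha256(b'spam,eggs\r\n1,\r\n').hexdigest()
```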

Both functions have an optional autocompress argument: set it to True to transparently compress (or decompress) if the file argument is a path that ends in '.bz2', '.gz', or '.xz'.
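
Suffix-based selection of this kind could look roughly like the following stdlib-only sketch. The names OPENERS and open_maybe_compressed() are hypothetical, and the exact matching rules used by csv23 may differ.

```python
import bz2
import gzip
import lzma
import os
import tempfile

# Map file suffixes to the matching stdlib open function (assumed mapping).
OPENERS = {'.bz2': bz2.open, '.gz': gzip.open, '.xz': lzma.open}


def open_maybe_compressed(path, mode='rb'):
    """Open path in binary mode, (de)compressing based on its suffix."""
    suffix = os.path.splitext(path)[1]
    return OPENERS.get(suffix, open)(path, mode)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'spam.csv.gz')

    with open_maybe_compressed(path, 'wb') as f:
        f.write(b'spam,eggs\r\n')

    with open_maybe_compressed(path) as f:
        data = f.read()  # transparently decompressed

    with open(path, 'rb') as f:
        magic = f.read(2)  # the file on disk starts with the gzip magic bytes
```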

Installation

This package runs under Python 2.7 and 3.7+. Use pip to install:

$ pip install csv23

License

This package is distributed under the MIT license.