zstdarchiver

The Z-Archiver. A fast archiving module.


Keywords: zip, archiving
License: MIT
Install: pip install zstdarchiver==0.2.1

Documentation

The ZStd-Archiver project provides fast file archiving with a good compression ratio.

Background

At the company I work for, we handle a lot of big archives in our daily work. 7z is a great open source tool, but it gets slow with big archives, especially on Windows. Uncompressing and searching in archives are particularly slow, and searching is slowest when the archives are on a network share and compressed with LZMA2.

So I've started a new project with the following goals:
  • Fast in all actions, like compress, uncompress, search, add/remove/update single files
  • Fast when navigating in archive
  • Parallelize actions as much as possible for best resource usage and speed
  • Meta data section for custom data (e.g. for very fast common search actions)
  • Delta archives with linked baseline archives (for saving space)
  • Good compression ratio
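
The parallelization and meta data goals above can be illustrated with a small sketch. It is not the zstdarchiver implementation: it uses zlib from the Python standard library as a stand-in for Zstandard so that it runs without extra packages, and the function names compress_file, compress_tree and read_member as well as the index layout are assumptions made for this example.

```python
import os
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_file(path):
    """Read one file and return (path, original size, compressed bytes)."""
    with open(path, "rb") as f:
        data = f.read()
    # zlib is only a stand-in here; the project itself is built on Zstandard.
    return path, len(data), zlib.compress(data, 6)


def compress_tree(root, workers=4):
    """Compress every file below root in parallel and build a small index."""
    paths = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            paths.append(os.path.join(folder, name))
    paths.sort()

    # Hypothetical meta data: relative path -> (offset, size, compressed size)
    index = {}
    blob = bytearray()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so offsets are deterministic.
        for path, size, packed in pool.map(compress_file, paths):
            index[os.path.relpath(path, root)] = (len(blob), size, len(packed))
            blob += packed
    return index, bytes(blob)


def read_member(index, blob, name):
    """Uncompress a single file without touching the rest of the blob."""
    offset, _size, packed_size = index[name]
    return zlib.decompress(blob[offset:offset + packed_size])
```

Because each file is compressed independently and the index records its offset, a single file can be located and uncompressed without reading the others, which is the kind of access pattern the goals list asks for.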

Status

The project is currently at a very early stage of development.

Currently implemented:
  • Python module for (un-)compressing single directories
  • Plugins for meta data and encryption
  • Command line tool with the following features:
    • Compress single directory
    • Uncompress archive
    • Show archive information
    • Show meta data

Installation

Dependencies:

The zstdarchiver module has the following dependencies:
  • msgpack
  • zstandard
  • fastthreadpool
  • cryptography (for encryption plugin)
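
Before building, it can help to check which of these packages are already available. The following is a minimal sketch using only the standard library; it assumes that each import name matches the package name listed above, and the helper name missing_modules is made up for this example.

```python
import importlib.util

# Assumption: the import names are identical to the package names above.
REQUIRED = ["msgpack", "zstandard", "fastthreadpool"]
OPTIONAL = ["cryptography"]  # only needed for the encryption plugin


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing required:", missing_modules(REQUIRED))
    print("missing optional:", missing_modules(OPTIONAL))
```

Anything reported as missing can then be installed with pip before continuing.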

First build a wheel in the module folder and install it:

python setup.py bdist_wheel
pip install dist/...

Optionally, you can add the cython keyword after bdist_wheel to compile the zstdarchiver module with Cython. Currently, however, the performance improvement is very small.

Now you can use the command line tool zarc.py in the apps folder, or compile it into a single executable with:

pyinstaller zarc.spec

Command line tool

The command line tool is called zarc.py and can be found in the apps folder.

Usage:

zarc.py [-h] [--version] [-i INPUT] [-o OUTPUT] [-m META [META ...]]
[-p PASSWORD] [{a,add,e,extract,i,info}]
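
To make the usage line easier to read, here is a sketch of an argparse parser that accepts the same options and actions. It is a reconstruction from the usage text only, not the actual source of zarc.py; the dest names and the version string are assumptions.

```python
import argparse


def build_parser():
    """Parser reconstructed from the usage line; not the real zarc.py code."""
    parser = argparse.ArgumentParser(prog="zarc.py")
    # The real version string is not known here, so a placeholder is used.
    parser.add_argument("--version", action="version",
                        version="%(prog)s (version not specified)")
    parser.add_argument("-i", dest="input")      # shown as INPUT in the usage
    parser.add_argument("-o", dest="output")     # shown as OUTPUT
    parser.add_argument("-m", dest="meta", nargs="+")  # one or more META values
    parser.add_argument("-p", dest="password")   # shown as PASSWORD
    # The action is an optional positional argument with a fixed set of choices.
    parser.add_argument("action", nargs="?",
                        choices=["a", "add", "e", "extract", "i", "info"])
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["a", "-i", "src", "-o", "out.zar"])
    print(args.action, args.input, args.output)
```

Each action has a short and a long form (a/add, e/extract, i/info), and -m takes one or more values.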

Compress a folder:

zarc a -i <Path to folder> -o <Archive name>.zar

Uncompress an archive:

zarc e -i <Archive name>.zar -o <Path to base folder>

Show archive information:

zarc i -i <Archive name>.zar

Show meta data:

zarc m -i <Archive name>.zar
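
The calls above can also be scripted. The following sketch builds the argument list for one call and runs it with subprocess; it uses only the actions and the -i/-o options shown above. The function names and the default script path apps/zarc.py are assumptions for this example.

```python
import subprocess
import sys

# The four single-letter actions used in the examples above.
ACTIONS = ("a", "e", "i", "m")


def zarc_command(action, input_path, output_path=None, script="apps/zarc.py"):
    """Build the argument list for one zarc call (script path is assumed)."""
    if action not in ACTIONS:
        raise ValueError("unsupported action: %r" % (action,))
    cmd = [sys.executable, script, action, "-i", str(input_path)]
    if output_path is not None:
        cmd += ["-o", str(output_path)]
    return cmd


def run_zarc(action, input_path, output_path=None, script="apps/zarc.py"):
    """Run zarc and raise CalledProcessError if it exits with an error."""
    cmd = zarc_command(action, input_path, output_path, script)
    return subprocess.run(cmd, check=True)
```

For example, run_zarc("a", "my_folder", "my_folder.zar") would correspond to the compress call shown above.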

Benchmarks

Custom compiled source tree: 4.85GB, 34275 files in 4452 folders on Windows 10, laptop with Core i7-4810MQ @ 2.8GHz and MTF SSD

Compressor    Compress time [s]  Uncompress time [s]  Size [MB]
7z-ZIP-L3                    36                   95        624
7z-ZIP-L5                   146                   95        552
7z-LZMA2-L3                  51                   85        352
7z-LZMA2-L5                 304                   84        283
7z-LZ4-L6                    25                    -        781
7z-LZ4-L12                   97                    -        774
7z-ZStd-L11                  66                  105        387
7z-ZStd-L13                 120                    -        381
ZAR                          37                   20        386

Firefox sources (gecko-dev-master): 1.74GB, 280831 files in 21396 folders on Windows 10, laptop with Core i7-4810MQ @ 2.8GHz and MTF SSD

Compressor    Compress time [s]  Uncompress time [s]  Size [MB]
7z-ZIP-L3                    71                  815        653
7z-ZIP-L5                   101                  812        628
7z-LZMA2-L3                 118                  826        391
7z-LZMA2-L5                 242                  829        328
ZAR                          48                  155        379

Custom compiled source tree: 4.85GB, 34275 files in 4452 folders on Windows 10, desktop with Core i7-7700 @ 3.6GHz and Samsung SSD

Compressor    Compress time [s]  Uncompress time [s]  Size [MB]
7z-ZIP-L3                    21                   53        624
7z-ZIP-L5                    84                   52        552
7z-LZMA2-L3                  31                   45        352
7z-LZMA2-L5                 187                   45        283
ZAR                          24                   13        386

Firefox sources (gecko-dev-master): 1.74GB, 280831 files in 21396 folders on Windows 10, desktop with Core i7-7700 @ 3.6GHz and Samsung SSD

Compressor    Compress time [s]  Uncompress time [s]  Size [MB]
7z-ZIP-L3                    30                  407        653
7z-ZIP-L5                    52                  413        628
7z-LZMA2-L3                  87                  413        391
7z-LZMA2-L5                 165                  395        328
ZAR                          27                   65        379

Custom compiled source tree: 4.85GB, 34275 files in 4452 folders on Windows 10, desktop with Xeon E5-1620 @ 3.5GHz and SanDisk SSD

Compressor    Compress time [s]  Uncompress time [s]  Size [MB]
7z-ZIP-L3                    31                  108        624
7z-ZIP-L5                   103                  107        552
7z-LZMA2-L3                  40                   96        352
7z-LZMA2-L5                 224                  108        283
ZAR                          29                   69        386

Firefox sources (gecko-dev-master): 1.74GB, 280831 files in 21396 folders on Windows 10, desktop with Xeon E5-1620 @ 3.5GHz and SanDisk SSD

Compressor    Compress time [s]  Uncompress time [s]  Size [MB]
7z-ZIP-L3                    58                  889        653
7z-ZIP-L5                    80                  885        628
7z-LZMA2-L3                 112                  897        391
7z-LZMA2-L5                 200                  897        328
ZAR                          38                  148        379

Firefox sources (gecko-dev-master): 1.74GB, 280831 files in 21396 folders on Linux with Ryzen 5 2400G and SSD

Compressor    Compress time [s]  Uncompress time [s]  Size [MB]
7z-ZIP-L3                    14                   50        658
7z-ZIP-L5                    39                   50        632
7z-LZMA2-L3                  85                   57        394
7z-LZMA2-L5                 199                   55        332
TAR GZ                       47                   13        486
ZIP                          51                   20        654
ZAR                          32                   13        389

linux-5.2 compiled source tree: 2.85GB, 165891 files in 15222 folders on Linux with Ryzen 5 2400G and SSD

Compressor    Compress time [s]  Uncompress time [s]  Size [MB]
7z-ZIP-L3                    17                   41        792
7z-ZIP-L5                    70                   40        749
7z-LZMA2-L3                  56                   48        420
7z-LZMA2-L5                 221                   44        315
TAR GZ                       81                   17        707
ZIP                          79                   23        786
ZAR                          33                    8        426
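
To compare the results more directly, the numbers can be expressed relative to ZAR. The following sketch does this for the linux-5.2 table; the values are copied from that table, and the helper name relative_to is made up for this example.

```python
# (compress time [s], uncompress time [s], size [MB]) from the linux-5.2 table
LINUX_5_2 = {
    "7z-ZIP-L3": (17, 41, 792),
    "7z-ZIP-L5": (70, 40, 749),
    "7z-LZMA2-L3": (56, 48, 420),
    "7z-LZMA2-L5": (221, 44, 315),
    "TAR GZ": (81, 17, 707),
    "ZIP": (79, 23, 786),
    "ZAR": (33, 8, 426),
}


def relative_to(results, baseline="ZAR"):
    """Divide every value by the baseline's value in the same column."""
    base_c, base_u, base_s = results[baseline]
    return {
        name: (c / base_c, u / base_u, s / base_s)
        for name, (c, u, s) in results.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, (c, u, s) in relative_to(LINUX_5_2).items():
        print(f"{name:12s} compress x{c:.2f}  uncompress x{u:.2f}  size x{s:.2f}")
```

A factor above 1 means the compressor is slower or produces a larger archive than ZAR. For example, 7z-LZMA2-L3 needs 48 s to uncompress compared with 8 s for ZAR, a factor of 6, while its archive is slightly smaller (420 MB compared with 426 MB).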