signature-based file format identification




Siegfried is a signature-based file format identification tool, implementing:

  • the National Archives UK's PRONOM file format signatures
  • freedesktop.org's MIME-info file format signatures.





Command line

sf file.ext
sf DIR


sf -csv file.ext | DIR                     // Output CSV rather than YAML
sf -json file.ext | DIR                    // Output JSON rather than YAML
sf -droid file.ext | DIR                   // Output DROID CSV rather than YAML
sf -                                       // Read list of files piped to stdin
sf -nr DIR                                 // Don't scan subdirectories
sf -z file.zip | DIR                       // Decompress and scan zip, tar, gzip, warc, arc
sf -hash md5 file.ext | DIR                // Calculate md5, sha1, sha256, sha512, or crc hash
sf -sig custom.sig file.ext                // Use a custom signature file
sf -home c:\junk -sig custom.sig file.ext  // Use a custom home directory
sf -serve hostname:port                    // Server mode
sf -version                                // Display version information
sf -throttle 10ms DIR                      // Pause for duration (e.g. 1s) between file scans
sf -log [comma-sep opts] file.ext | DIR    // Log errors etc. to stderr (default) or stdout
sf -log e,w file.ext | DIR                 // Log errors and warnings to stderr
sf -log u,o file.ext | DIR                 // Log unknowns to stdout
sf -log d,s file.ext | DIR                 // Log debugging and slow messages to stderr
sf -log p,t DIR > results.yaml             // Log progress and time while redirecting results
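
When sf is driven from a script, it can help to assemble the argument list programmatically. The following is a minimal sketch in Python, not part of siegfried itself: the helper names `build_sf_command` and `run_sf` are hypothetical, and only flags listed above are used. It assumes `sf` is on the system path and that flags must precede the file or directory argument.

```python
import subprocess

# Output flags and hash names taken from the usage listing above.
OUTPUT_FLAGS = {"csv", "json", "droid"}
HASH_NAMES = {"md5", "sha1", "sha256", "sha512", "crc"}


def build_sf_command(target, output="yaml", hash_name=None,
                     recurse=True, decompress=False, sig=None):
    """Return an argv list for sf (hypothetical helper, not a siegfried API)."""
    cmd = ["sf"]
    if output in OUTPUT_FLAGS:
        cmd.append("-" + output)          # -csv, -json or -droid
    elif output != "yaml":                # YAML is the default, so no flag
        raise ValueError("unknown output format: " + output)
    if not recurse:
        cmd.append("-nr")                 # don't scan subdirectories
    if decompress:
        cmd.append("-z")                  # scan inside zip, tar, gzip, warc, arc
    if hash_name is not None:
        if hash_name not in HASH_NAMES:
            raise ValueError("unknown hash: " + hash_name)
        cmd += ["-hash", hash_name]
    if sig is not None:
        cmd += ["-sig", sig]              # custom signature file
    cmd.append(target)                    # file or directory goes last
    return cmd


def run_sf(target, **options):
    """Run sf and return whatever it wrote to stdout (requires sf installed)."""
    result = subprocess.run(build_sf_command(target, **options),
                            capture_output=True, text=True)
    return result.stdout
```

For example, `build_sf_command("DIR", output="json", hash_name="md5", recurse=False)` builds the equivalent of `sf -json -nr -hash md5 DIR`.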



Signature files

By default, siegfried uses the latest PRONOM signatures without buffer limits (i.e. it may do full file scans). To use MIME-info signatures, or to add buffer limits or other customisations, use the roy tool to build your own signature file.


Install

With Go installed:

go get

sf -update

Or, without Go installed:


Download a pre-built binary from the releases page. Unzip to a location in your system path. Then run:

sf -update

Mac Homebrew (or Linuxbrew):

brew install mistydemeo/digipres/siegfried

Ubuntu/Debian (64 bit):

wget -qO - | sudo apt-key add -
echo "deb wheezy main" | sudo tee -a /etc/apt/sources.list
sudo apt-get update && sudo apt-get install siegfried

Recent Changes

Version 1.5.0 (14/3/2016)

  • feature: implement MIME-info signatures (and the Apache Tika variant)
  • feature: implement XML matcher
  • feature: file name matcher now supports glob patterns as well as file extensions
  • default signature file now "default.sig" (was "pronom.sig")
  • changes to YAML and JSON output: "ns" (for namespace) replaces "id", and "id" replaces "puid"
  • changes to CSV output: multi-identifiers now displayed in extra columns, not extra rows
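
Scripts that read YAML or JSON results from earlier versions may need to account for the key renames in 1.5.0. The sketch below shows one way to convert a pre-1.5.0 match record to the new key names. It is an illustration only: `upgrade_match` is a hypothetical helper, and it assumes each match has already been parsed into a flat Python dict.

```python
def upgrade_match(old):
    """Rename pre-1.5.0 keys: old "id" (namespace) -> "ns", old "puid" -> "id".

    Hypothetical helper; assumes `old` is a flat dict for a single match.
    """
    new = {}
    for key, value in old.items():
        if key == "id":
            new["ns"] = value      # the namespace moved from "id" to "ns"
        elif key == "puid":
            new["id"] = value      # the identifier moved from "puid" to "id"
        else:
            new[key] = value       # all other keys pass through unchanged
    return new
```

The renames are done in a single pass into a fresh dict, so the old "id" value is never overwritten by the old "puid" value.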

Version 1.4.5 (6/2/2016)

Version 1.4.4 (9/1/2016)

  • fix: speed regression introduced by the TIFF mis-identification patch in the last release
  • code quality: refactor textmatcher package
  • code quality: refactor siegreader package
  • code quality: documentation

Version 1.4.3 (19/12/2015)

Version 1.4.2 (27/11/2015)

Version 1.4.1 (6/11/2015)

  • -log replaces -debug, -slow, -unknown and -known flags (see usage above)
  • highlight empty file/stream with error and warning
  • negative text match overrides extension-only plain text match

Version 1.4.0 (31/10/2015)

  • new MIME matcher; requested by Dragan Espenschied
  • support warc continuations
  • add all.json and tiff.json sets
  • minor speed-up
  • report less redundant basis information
  • report error on empty file/stream

Full change history


Copyright 2016 Richard Lehane

Licensed under the Apache License, Version 2.0


Like siegfried and want to get involved in its development? That'd be wonderful! There are some notes on the wiki to get you started, and please get in touch.


Thanks TNA for PRONOM and DROID

Thanks Ross for and, both are very handy!

Thanks Misty for the brew and Ubuntu packaging