Epub parser

epub, epub3-parser, cray, epub-parser, epub3
npm install cray@2.1.6



Stream Based EPUB Parser.

What It Does

Cray makes epub files more programmer-friendly by converting their data structures to JSON and providing simplified structures for convenient access to that data.

Specifically, it:

  • takes a lot of essential data from any valid Epub file and makes it available in an "easy" JSON namespace
  • takes care of a lot of dirty work, such as determining the primary ID of a file, the location of the NAV file, the OPS root, etc.

Installing As Package

npm install cray


Single Epub File

Parsing a single file is as simple as this:

var cray = require('cray')('./test1.epub');

// Triggered when the epub contents have been parsed.
cray.queue[0].on('finish', function() {
  console.log(cray.queue[0]); // Entire EPUB properties accessible in this json
});

// Triggered when there is something wrong with the file stream
cray.queue[0].on('error', function(err) {
  console.error(err);
});
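
The two handlers above can be folded into a single Node-style callback. The following is a minimal sketch, not part of cray: `whenParsed` is a hypothetical helper, and it assumes only what the example above shows, namely that each queue entry exposes `.on()` and emits `finish` and `error`.

```javascript
// Hypothetical helper (not part of cray): calls `callback` exactly once,
// with (null, epub) on 'finish' or with (err) on 'error'.
function whenParsed(epub, callback) {
  var done = false;

  epub.on('finish', function() {
    if (done) return; // ignore anything after the first event
    done = true;
    callback(null, epub);
  });

  epub.on('error', function(err) {
    if (done) return;
    done = true;
    callback(err || new Error('Unknown epub error'));
  });
}

// Intended use (needs cray and a real file, so it is left commented out):
// var cray = require('cray')('./test1.epub');
// whenParsed(cray.queue[0], function(err, epub) {
//   if (err) return console.error(err);
//   console.log(epub.metadata);
// });
```

Both listeners stay attached after the first event, so a late `error` is swallowed by the helper rather than thrown as an unhandled event.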

Multiple Epub Files

// The code example above can be adapted to parse multiple files.
var cray = require('cray')(['./test1.epub', './test2.epub']);

// cray.queue is an array of epub objects, each emitting the events
// described in the previous example.
// cray.queue[0] --> 'test1.epub' epub object
// cray.queue[1] --> 'test2.epub' epub object

Data Structure


Metadata

Important - Accessible only after finish event


// Available in the array object.
// --> cray.queue[0].metadata

// Structure
// Sample -->
/* { language: 'en',
     title: 'Accessible EPUB 3',
     date: '2012-02-20',
     creator: 'Matt Garrish',
     contributor:
      [ "O'Reilly Production Services",
        'David Futato',
        'Robert Romano',
        'Brian Sawyer',
        'Dan Fauxsmith',
        'Karen Montgomery' ],
     publisher: "O'Reilly Media, Inc.",
     rights: "Copyright © 2012 O'Reilly Media, Inc" } */
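
As an illustration of consuming this structure, the sketch below builds a one-line description from a metadata object. `describeBook` is a hypothetical helper, not part of cray. It assumes the field names shown in the sample and that a field such as `creator` may hold either a single string or an array of strings.

```javascript
// Hypothetical helper: builds a one-line description from a metadata
// object shaped like the sample above. Missing fields are skipped.
function describeBook(metadata) {
  // Assumption: creator is a string or an array; normalise to an array.
  var creators = [].concat(metadata.creator || []);
  var year = metadata.date ? String(metadata.date).slice(0, 4) : '';
  var parts = [];

  if (creators.length) parts.push(creators.join(', '));
  if (metadata.title) parts.push(metadata.title);
  if (metadata.publisher) parts.push(metadata.publisher);
  if (year) parts.push(year);

  return parts.join(' | ');
}

var sampleMetadata = {
  language: 'en',
  title: 'Accessible EPUB 3',
  date: '2012-02-20',
  creator: 'Matt Garrish',
  publisher: "O'Reilly Media, Inc."
};

console.log(describeBook(sampleMetadata));
// Matt Garrish | Accessible EPUB 3 | O'Reilly Media, Inc. | 2012
```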


Nav

Important - Accessible only after finish event


// Available in the array object.
// --> cray.queue[0].nav

// Structure
// Sample -->
/* { id: 'htmltoc',
        properties: 'nav',
        'media-type': 'application/xhtml+xml',
        href: 'bk01-toc.xhtml' } */


Cover

Important - Accessible only after finish event


// Available in the array object.
// --> cray.queue[0].cover

// Structure
// Sample -->
/* { imagePath: 'covers/9781449328030_lrg.jpg' } */


Important - Accessible only after finish event


// Available in the array object.
// --> cray.queue[0].cover

// Structure {Array}
// Sample -->
/* [ { 'media-type': 'text/css',
          id: 'epub-css',
          href: 'css/epub.css' },
        { 'media-type': 'text/css',
          id: 'epub-tss-css',
          href: 'css/synth.css' } ] */


Spines

Important - Accessible only after finish event


// Available in the array object.
// --> cray.queue[0].spines

// Structure {Array}
// Sample -->
/* [ { 'media-type': 'text/css',
          id: 'epub-css',
          href: 'css/epub.css' },
        { 'media-type': 'text/css',
          id: 'epub-tss-css',
          href: 'css/synth.css' } ] */

OPF Root

Important - Accessible only after finish event

Contains the folder name where the opf file is located.


// Available in the array object.
// --> cray.queue[0].opfRoot
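
One use for the OPF root is turning an `href` from the structures above into a path inside the archive, since EPUB manifest hrefs are relative to the OPF file. The sketch below assumes `opfRoot` is a folder path relative to the archive root with no leading slash. `resolveInEpub` and the folder name `EPUB` are hypothetical, and percent-encoded hrefs are not decoded.

```javascript
var path = require('path');

// Hypothetical helper: joins the OPF folder with a manifest href to get the
// item's path inside the archive. Zip entries use forward slashes, so
// path.posix is used regardless of the host platform.
function resolveInEpub(opfRoot, href) {
  return path.posix.join(opfRoot || '', href);
}

console.log(resolveInEpub('EPUB', 'bk01-toc.xhtml'));  // EPUB/bk01-toc.xhtml
console.log(resolveInEpub('EPUB', '../images/a.png')); // images/a.png
console.log(resolveInEpub('', 'css/epub.css'));        // css/epub.css
```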


Events

  1. EPUB:opf:parsed (DEPRECATED): Triggered when cray has completely traversed the opf file. At this point the epub object contains all the populated data structures for the respective epub file.

  2. EPUB:error (DEPRECATED): Triggered when there is a Node fs error with respect to the epub file being parsed.

  3. EPUB:invalid (DEPRECATED): Triggered when the EPUB contains an invalid file structure.

  4. READY: Triggered when all epubs in the queue have been processed.

// Example
var cray = require('cray')(['./test1.epub', './test2.epub']);
cray.on('READY', function() {
  console.log("All processed", cray.queue /* This is an array */);
});
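
Once `READY` has fired, the queue can be reduced to a compact summary. This is a sketch under stated assumptions: `summarizeQueue` is a hypothetical helper, not part of cray, and it relies only on the `metadata`, `nav`, and `cover` properties documented under Data Structure. Any of them may be absent, for example on a file that failed to parse.

```javascript
// Hypothetical helper: maps a processed queue to plain summary objects,
// tolerating entries whose properties were never populated.
function summarizeQueue(queue) {
  return queue.map(function(epub) {
    var metadata = epub.metadata || {};
    return {
      title: metadata.title || '(untitled)',
      nav: epub.nav ? epub.nav.href : null,
      cover: epub.cover ? epub.cover.imagePath : null
    };
  });
}

// Intended use (needs cray and real files, so it is left commented out):
// cray.on('READY', function() {
//   console.log(summarizeQueue(cray.queue));
// });
```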

Command Line Tool

Cray can also be used as a command line utility by installing it globally, as described below.

npm install -g cray


cray -e test1.epub,containerFail.epub

│ Epub Name          │ Directory │ Status                         │
│ test1.epub         │ CWD       │ Valid                          │
│ containerFail.epub │ CWD       │ Container XML file is missing! │

Important - Using -v or --verbose checks the epub file with the IDPF epubcheck tool, which requires Java to be installed.


To list the options Cray supports, run the following command.

cray --help


Check out the examples folder for more info on usage.