Transform RSS or Atom XML to JSON


Keywords
rss, atom, xml, json, itunes, podcast, feed, transform, stream, through, streams2, atom-feed-parser, javascript, rss-feed-parser
License
MIT
Install
npm install pickup@6.0.1

Documentation

Build Status Coverage Status

pickup - transform feeds

The pickup Node package transforms XML feeds to JSON or objects. Exporting a Transform stream, it parses RSS 2.0, including iTunes namespace extensions, and Atom 1.0 formatted XML, producing newline separated JSON strings or objects.

Usage

Command-line

$ export URL=https://www.newyorker.com/feed/posts
$ curl -sS $URL | node ./example/stdin.js

The Standard Input example is more or less the CLI binary of pickup, ./bin/cli.js, which you can install globally if you want to.

npm i -g pickup

đź’ˇ You can use pickup, node ./example/stdin.js, or node ./bin/cli.js in these examples.

Piping data to pickup, you can now, for example, parse feeds from the network using curl on the command-line.

$ curl -sS $URL | pickup

For convenient JSON massaging, I use json, available on npm.

$ npm install -g json

Now you can extend your pipe for correct formatting…

$ curl -sS $URL | pickup | json -g

…and handy filtering.

$ curl -sS $URL | pickup | json -ga title

REPL

Being a library user of this module, my preferred interface for exploring, and testing while working on it, is its REPL.

$ ./repl.js
pickup> let x = get('https://www.newyorker.com/feed/posts')
pickup> read(x, Entry, 'title')
pickup> 'An Asylum Seeker’s Quest to Get Her Toddler Back'
'At N.H.L.’s All-Star Weekend, Female Players Prove Their Excellence, but NBC Can’t Keep Up'
'Cliff Sims Is Proud to Have Served Trump'
'The Slimmed-Down and Soothing New “Conan”'
'Donald Trump’s Chance to Bring Peace to Afghanistan and End America’s Longest War'
'The First Days of Disco'
'How Various Government Agencies Celebrated the End of the Shutdown'
'Why We Care (and Don’t Care) About the New Rules of Golf'
'Howard Schultz Against the Hecklers'
'Daily Cartoon: Tuesday, January 29th'
ok
pickup>

Look at ./repl.js to see what it can do. Not much, but enough for exploring exotic feeds.

Library

Transforming from stdin to stdout

Here’s the Standard Input example from before again.

const pickup = require('pickup')

process.stdin
  .pipe(pickup())
  .pipe(process.stdout)

You can run this example from the command-line:

$ curl -sS $URL | node example/stdin.js | json -g

Proxy server

const http = require('http')
const https = require('https')
const pickup = require('../')

http.createServer((req, res) => {
  https.get('https:/'.concat(req.url), (feed) => {
    feed.pipe(pickup()).pipe(res)
  })
}).listen(8080)

To try the proxy server:

$ node example/proxy.js &
$ curl -sS http://localhost:8080/www.newyorker.com/feed/posts | json -ga title

Types

str()

This can either be a String(), null, or undefined.

opts()

The options Object is passed to the Transform stream constructor.

pickup uses following additional options:

  • eventMode Boolean defaults to false, if true readable state buffers are not filled and no 'data', but 'feed' and 'entry' events are emitted.

  • charset str() | 'UTF-8' | 'ISO-8859-1' An optional string to specify the encoding of input data. In the common use case you received this string in the headers of your HTTP response before you began parsing. If you, not so commonly, cannot provide the encoding upfront, pickup tries to detect the encoding, and eventually defaults to 'UTF-8'. The charset option is corresponding to the optional charset MIME type parameter found in Content-Type HTTP headers. It's OK to pass any string, pickup will fall back on 'UTF-8' when confused.

url()

An undefined property, not populated by the parser, to identify feeds and entries in upstream systems, without prompting V8 to create new hidden classes.

feed()

  • author str()
  • copyright str()
  • id str()
  • image str()
  • language str()
  • link str()
  • originalURL url()
  • payment str()
  • subtitle str()
  • summary str()
  • title str()
  • ttl str()
  • updated str()
  • url url()

enclosure()

  • length str()
  • type str()
  • url str()

entry()

  • author str()
  • duration str()
  • enclosure enclosure() or undefined
  • id str()
  • image str()
  • link str()
  • originalURL url()
  • subtitle str()
  • summary str()
  • title str()
  • updated str()
  • url url()

Event:'feed'

feed()

Emitted when the meta information of the feed gets available.

Event:'entry'

entry()

Emitted for each entry.

Exports

pickup(opts())

pickup exports a function that returns a Transform stream which emits newline separated JSON strings, in objectMode the 'data' event contains entry() or feed() objects. As per XML's structure the last 'data' event usually contains the feed() object. In eventMode neither 'readable' nor 'data' events are emitted, instead 'feed' and 'entry' events are fired.

Installation

With npm, do:

$ npm install pickup

To use the CLI (as above):

$ npm install -g pickup

Contribute

Please create an issue if you encounter a feed that pickup fails to parse.

License

MIT license