
Bird Feeder publishes Thredds metadata catalogs to a Solr index service with a birdhouse schema.

Bird Feeder

Feed the Birds ...

Bird Feeder parses Thredds catalogs and local directories containing NetCDF files, and publishes their metadata, together with download URLs, to a Solr index service that uses a birdhouse schema.

Install from Anaconda

$ conda install -c birdhouse bird-feeder

Install from GitHub

$ git clone
$ cd bird-feeder
$ make install

Start Solr service on http://localhost:8983/solr/birdhouse:

$ make start
$ make status
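
Once the service is running, one way to confirm that the index answers queries is to ask Solr's standard select handler for a document count. The following is a minimal sketch, not part of birdfeeder: the helper names solr_select_url and count_documents are hypothetical, and it assumes the core is reachable at the URL given above and that the select handler returns JSON.

```python
import json
import urllib.parse
import urllib.request


def solr_select_url(base_url: str, query: str = "*:*", rows: int = 0) -> str:
    """Build a select URL for a Solr core (hypothetical helper)."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    return base_url.rstrip("/") + "/select?" + params


def count_documents(base_url: str = "http://localhost:8983/solr/birdhouse") -> int:
    """Return the number of indexed documents that match every query."""
    with urllib.request.urlopen(solr_select_url(base_url), timeout=10) as response:
        payload = json.load(response)
    # Solr reports the total number of matches in response.numFound.
    return payload["response"]["numFound"]
```

Calling count_documents() against a freshly started, empty index should return 0; after publishing, the count should grow.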

Using the command line


$ birdfeeder -h
usage: birdfeeder [<options>] <command> [<args>]

  Feeds Solr with Datasets (NetCDF Format) from Thredds Catalogs and File

  optional arguments:
    -h, --help            show this help message and exit
    -v                    enable verbose mode
    --service SERVICE     Solr URL. Default:
    --maxrecords MAXRECORDS
                          Maximum number of records to publish. Default: -1
    --batch-size BATCH_SIZE
                          Batch size of records to publish. Default: 50000

    List of available commands

                          Run "birdfeeder <command> -h" to get additional help.
      spider              Runs spider to crawl NetCDF files on a HTTP file
                          service and writes the path list to a CSV file.
      walker              Runs walker to crawl NetCDF files from filesystem and
                          writes the path list to a CSV file.
      clear               Clears the complete solr index. Use with caution!
      from-thredds        Publish datasets from Thredds Catalog to Solr.
      from-walker         Publish NetCDF files from directory to Solr.
      from-spider         Runs spider to crawl NetCDF files on a HTTP file
                          service and publishes them to Solr.
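
The --maxrecords and --batch-size options suggest that records are sent to Solr in chunks, up to an optional overall limit. The sketch below illustrates that idea only; it is not birdfeeder's actual implementation. The function name batches is hypothetical, and it assumes that a negative maxrecords (the default of -1) means "no limit".

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(records: Iterable[T], batch_size: int = 50000,
            maxrecords: int = -1) -> Iterator[List[T]]:
    """Yield lists of at most batch_size records, stopping after maxrecords."""
    iterator = iter(records)
    if maxrecords >= 0:
        # Assumption: a negative value disables the limit.
        iterator = islice(iterator, maxrecords)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Each yielded list would then be posted to the index in a single request, which keeps memory use bounded for large catalogs.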

Parse a Thredds catalog (recursively, down to depth level 2) and publish to Solr:

$ birdfeeder from-thredds --catalog-url --depth=2

Parse NetCDF files from a local directory and publish to Solr:

$ birdfeeder from-walker --start-dir /home/data/myarchive
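
To preview which files such a directory crawl might pick up, a walk over the directory tree can be sketched as follows. This is an illustration under stated assumptions, not the walker's real code: the function name find_netcdf_files is hypothetical, and it assumes NetCDF files are recognised by a ".nc" suffix.

```python
import os
from typing import Iterator


def find_netcdf_files(start_dir: str, suffix: str = ".nc") -> Iterator[str]:
    """Yield paths of files below start_dir whose names end with suffix."""
    for dirpath, dirnames, filenames in os.walk(start_dir):
        # Sort in place so that the traversal order is deterministic.
        dirnames.sort()
        for name in sorted(filenames):
            if name.endswith(suffix):
                yield os.path.join(dirpath, name)
```

For example, list(find_netcdf_files("/home/data/myarchive")) would return every matching path under the archive directory used above.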

Run the spider to collect NetCDF file URLs from an HTTP file service and write them to a CSV file:

$ birdfeeder spider --url --depth 2 -o out.csv
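
The resulting path list can be read back with Python's csv module for inspection. The layout of out.csv is not documented here, so this sketch assumes one path or URL in the first column of each row and no header line; the function name read_paths is hypothetical, and the code should be adjusted if the real file differs.

```python
import csv
from typing import IO, List


def read_paths(handle: IO[str]) -> List[str]:
    """Return the non-empty first-column values of a CSV path list."""
    # Assumption: each row holds one path or URL in its first column.
    return [row[0].strip() for row in csv.reader(handle)
            if row and row[0].strip()]
```

A typical call would be: with open("out.csv", newline="") as f: paths = read_paths(f).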