lobbyists

Parse Senate LD-1/LD-2 lobbying disclosure XML documents


Keywords
lobbyists, government, parser
License
GPL-3.0
Install
pip install lobbyists==0.10

Documentation

SUMMARY
-------

This package provides a reference parser and database importer for the
United States Senate LD-1/LD-2 lobbying disclosure database. The
Senate provides the database as a series of XML documents,
downloadable here:

http://www.senate.gov/legislative/Public_Disclosure/database_download.htm

The SQL database schema used by the importer is a direct translation
of the XML schema used in the Senate documents. This isn't a
particularly useful format for analyzing lobbying data, but it is
useful for analyzing the lobbying records themselves, which often
contain errors or anomalies. In any case, it shouldn't be too
difficult to adapt the importing code in this package to a more useful
schema.

A document describing how to interpret the LD-1/LD-2 database used to
be maintained at http://watchdog.jottit.com/lobbying_database.
Unfortunately, that domain is no longer functioning. A cache of that
document as of July 2, 2015 can be found here:

http://archive.is/Alo68

REQUIREMENTS
------------

This package requires Python 2.5.1 or later.


SCRIPTS
-------

The lobbyists-load script loads one or more XML documents into a
database.

The lobbyists-benchmark script loads one XML document into a database,
and reports the amount of time required to a) parse the document and
b) import the parsed records into the database. It's mainly
interesting for developers working on the lobbyists package itself.