A package for building a local database of SEC EDGAR filings.


License
MIT
Install
pip install edgerdb==1.1.2.2

Documentation

edgerdb

This package builds a local PostgreSQL database of SEC filings taken from ftp.sec.gov.

This package can be pip installed into the desired directory:

pip install edgerdb

To create the database and load the index files from ftp.sec.gov, do the following:

from edgerdb import EdgerDb
edger = EdgerDb()
edger.create_and_load()

This installs a database with three tables.

  • filings
  • loaded_master_files
  • last_updated

filings is the table that contains information on all the SEC filings.

loaded_master_files contains a list of all the files currently loaded into the filings table.

last_updated records the time at which the most recent file was loaded into the database.


To remove the database and its user, run:

edger.delete_everything()

Some helper functions are built in and can be used by importing helper_functions:


from edgerdb import helper_functions as hlp

The most commonly used functions are db(), old_db(), statement(), clear_sessions() and retrieve_document().

db() opens a connection object to the PostgreSQL database.

It is important to close the connection after every operation.


con = hlp.db()

con.close()
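A connection is easy to leave open when an operation raises an error, so one option is to wrap it in contextlib.closing from the standard library. The sketch below is only illustrative: FakeConnection is a hypothetical stand-in so the snippet runs without a database. With edgerdb installed, hlp.db() would take its place, assuming the object it returns exposes close() as shown above.

```python
from contextlib import closing


class FakeConnection:
    """Hypothetical stand-in for the object returned by hlp.db()."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


con = FakeConnection()  # with edgerdb: con = hlp.db()
try:
    with closing(con):
        # Work with the connection here; this error simulates a failure.
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

# close() was still called even though the block raised.
print(con.closed)
```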

statement() is used to run SQL queries on the database. It takes the SQL query as a string and a connection object, and accepts optional keyword arguments. close defaults to True, which automatically closes the connection after the query is run.

  statement(statement, connection, commit=False, close=True, output=True)

Ex:

top_five_paths = hlp.statement("select path from filings limit 5;", hlp.db(), close=True)
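To run several queries on one connection, close=False can be passed on each call and the connection closed once at the end. The following is a minimal sketch under that assumption. run_queries is a hypothetical helper, not part of edgerdb, and fake_statement and FakeConnection are stand-ins that mimic the signature above so the snippet runs without a database.

```python
def run_queries(statement_fn, connection, queries):
    """Run each query on the same connection, then close it once."""
    results = []
    try:
        for query in queries:
            # close=False keeps the connection open for the next query.
            results.append(statement_fn(query, connection, close=False))
    finally:
        connection.close()
    return results


class FakeConnection:
    """Hypothetical stand-in for the object returned by hlp.db()."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def fake_statement(statement, connection, commit=False, close=True, output=True):
    """Stand-in with the same signature as statement(); echoes the query."""
    if close:
        connection.close()
    return [statement]


demo_con = FakeConnection()
demo_results = run_queries(fake_statement, demo_con, ["select 1;", "select 2;"])
```

With edgerdb installed, the call would look like run_queries(hlp.statement, hlp.db(), [...]).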

retrieve_document() requires the path to a file, as stored in the filings table. It downloads a copy of that file from EDGAR and stores it in a "sec_filings" directory inside your project directory. The destination can be changed with the optional directory keyword argument.

Ex:

for path in top_five_paths:
    hlp.retrieve_document(path)
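The loop above can be extended to use the directory keyword argument. The sketch below is hedged: download_all and first_column are hypothetical helpers, and first_column only guards against the possibility that query results come back as row tuples rather than plain strings, which this document does not specify. The paths and the directory name are made up for the demo, and record stands in for hlp.retrieve_document so the snippet runs offline.

```python
def first_column(rows):
    """Return the first column of each row; plain strings pass through."""
    return [row if isinstance(row, str) else row[0] for row in rows]


def download_all(retrieve_fn, rows, directory=None):
    """Call retrieve_fn for each path, optionally with a target directory."""
    for path in first_column(rows):
        if directory is None:
            retrieve_fn(path)
        else:
            retrieve_fn(path, directory=directory)


# Stand-in for hlp.retrieve_document that records its arguments.
calls = []


def record(path, directory="sec_filings"):
    calls.append((path, directory))


# Hypothetical rows: one tuple-shaped, one plain string.
download_all(record, [("edgar/data/1/a.txt",), "edgar/data/2/b.txt"],
             directory="my_filings")
```

With edgerdb installed, the call would look like download_all(hlp.retrieve_document, top_five_paths, directory="my_filings").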

clear_sessions() clears running sessions on either the SEC database or the main PostgreSQL database. The function requires two arguments:

clear_sessions(dbname, connection)

dbname is the name of the database and connection is a connection object. To clear sessions on the edgar database, use db(); for the generic database, use old_db().

Ex:

hlp.clear_sessions('edgar', hlp.db())
hlp.clear_sessions('edgar', hlp.old_db())

dir() can be used to explore the other functions that come with helper_functions:

dir(hlp)