A package for building a local database of SEC EDGAR filings.

This package builds a local PostgreSQL database of SEC filings taken from ftp.sec.gov.

This package can be pip installed into the desired directory:

pip install edgerdb

To create the database and insert the index files from ftp.sec.gov, do the following:

from edgerdb import EdgerDb
edger = EdgerDb()

This installs a database with three tables.

  • filings
  • loaded_master_files
  • last_updated

filings is the table that contains information on all the SEC filings.

loaded_master_files contains a list of all the files currently loaded into the filings table.

last_updated holds the time at which the last file was loaded into the database.

To remove the database and user run:


Some functions are built in and can be used by importing helper_functions:

from edgerdb import helper_functions as hlp

The most commonly used functions are db(), old_db(), statement(), clear_sessions() and retrieve_document().

db() is used to open a connection object to the postgres database.

It is important to close the connection after every operation is performed.

con = hlp.db()
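
One way to make sure a connection is always closed is to wrap it in contextlib.closing. The sketch below is only an illustration: it uses an in-memory sqlite3 connection and a toy filings table as a stand-in so that it runs without a postgres server, and count_rows is a hypothetical helper, not part of edgerdb. With edgerdb you would pass hlp.db() instead, assuming the object it returns has a close() method as DB-API connections do.

```python
import sqlite3
from contextlib import closing


def count_rows(connection, table):
    """Return the number of rows in `table`, closing the connection afterwards."""
    # closing() calls connection.close() on exit, even if the query raises.
    with closing(connection) as con:
        cur = con.cursor()
        # The table name is interpolated directly, so pass trusted names only.
        cur.execute("SELECT COUNT(*) FROM " + table)
        return cur.fetchone()[0]


# Stand-in for hlp.db(): an in-memory SQLite database with a toy filings table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE filings (path TEXT)")
demo.execute("INSERT INTO filings VALUES ('edgar/data/1/example.txt')")
n_filings = count_rows(demo, "filings")
```

After count_rows returns, the connection is closed and any further query on it raises an error, which is the behaviour you want after each operation.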


statement() is used to run SQL queries on the database. It takes the SQL query as a string and a connection object, and accepts optional keyword arguments. close defaults to True, which automatically closes the connection after the query is run.

  statement(statement, connection, commit=False, close=True, output=True)


top_five_paths = hlp.statement("select path from filings limit 5;", hlp.db(), close=True)
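
The keyword arguments in the signature suggest roughly the following behaviour. This is a hypothetical re-implementation for illustration, not the package's code: run_statement is an invented name, and the meaning of commit and output is an assumption inferred from their names. It uses sqlite3 as a stand-in so it runs without a postgres server.

```python
import sqlite3


def run_statement(statement, connection, commit=False, close=True, output=True):
    """Hypothetical stand-in that mirrors the statement() signature."""
    cursor = connection.cursor()
    try:
        cursor.execute(statement)
        # Only fetch rows when the caller asked for output.
        rows = cursor.fetchall() if output else None
        if commit:
            connection.commit()
        return rows
    finally:
        # Assumed meaning of close=True: release the connection after the query.
        if close:
            connection.close()


con = sqlite3.connect(":memory:")
run_statement("CREATE TABLE filings (path TEXT)", con,
              commit=True, close=False, output=False)
run_statement("INSERT INTO filings VALUES ('a.txt'), ('b.txt')", con,
              commit=True, close=False, output=False)
rows = run_statement("SELECT path FROM filings ORDER BY path", con)
```

In this stand-in each row comes back as a tuple, so a path would be read as row[0]. Whether statement() returns rows in the same shape is not stated here, so check the result before relying on it.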

retrieve_document() requires the path to a file, taken from the filings table. It downloads a copy of that file from EDGAR and stores it in a "sec_filings" directory in the same directory as your project. The target directory can be changed with the optional directory keyword argument.
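
To illustrate the default location, the sketch below computes where a downloaded filing might end up. local_destination is a hypothetical helper, not part of edgerdb. The assumption that the file keeps its original name, and the example path itself, are illustrative only.

```python
import os


def local_destination(path, directory="sec_filings"):
    """Return the assumed local file name for a path from the filings table."""
    # Keep only the file name; the remote directory structure is dropped.
    return os.path.join(directory, os.path.basename(path))


# Illustrative path in the style of an EDGAR index entry.
example = local_destination("edgar/data/1000045/0001193125-16-543341.txt")
custom = local_destination("edgar/data/1000045/0001193125-16-543341.txt",
                           directory="downloads")
```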


for path in top_five_paths:

clear_sessions() can be used to clear running sessions on either the sec database or the main postgres database. The function requires two arguments.

clear_sessions(dbname, connection)

dbname is the name of the database and connection is a connection object. To clear sessions on the edgar database, use db(); for the generic database, use old_db().


hlp.clear_sessions('edgar', hlp.db())
hlp.clear_sessions('edgar', hlp.old_db())

dir() can be used to explore the other functions that come with helper_functions.
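
As a small sketch of that kind of exploration, the helper below filters the output of dir() down to public functions. public_functions is a hypothetical name, and the standard-library json module is used as a stand-in so the example runs without edgerdb installed; with the package you would pass hlp instead.

```python
import inspect
import json  # stand-in module; with edgerdb you would pass hlp instead


def public_functions(module):
    """Return the sorted names of public functions available on `module`."""
    return sorted(
        name
        for name in dir(module)
        if not name.startswith("_") and inspect.isfunction(getattr(module, name))
    )


names = public_functions(json)
```

Calling help() on any of the returned names then shows its signature and docstring, if one was written.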