censuspy

Lightweight wrapper to access gov data


Keywords
census, python, wrapper, census-api, census-bureau, census-data, python3
License
MIT
Install
pip install censuspy==1.1.0

Documentation

CensusPy 1.1.0

The goal of CensusPy is to expose the vast amount of data the government collects on US citizens to the broader programming community. Written as a wrapper around existing Census Bureau APIs, CensusPy 1.1.0 currently supports:

  • Business Dynamics Statistics (BDS)
  • Annual Survey of Entrepreneurs (ASE)
  • Decennial Census Surnames Files (DCSF)

The end goal, however, is to support all databases provided by the Census Bureau.

Installation

CensusPy is available on PyPI, so installation is as simple as:

pip install censuspy

CensusPy supports only Python 3.0 and later.

Business Dynamics Statistics (BDS)

Overview (BDS)

The Business Dynamics Statistics (BDS) includes measures of establishment openings and closings, firm startups, job creation and destruction by firm size, age, and industrial sector, and several other statistics on business dynamics. The BDS is made up of only one sub-dataset.

Quickstart (BDS)

Initialize the BDS object using your API key and the geographic level of the query:

from censuspy import bds
# replace the placeholder string with your own Census API key
state = bds.bds(api_key='YOUR_API_KEY_HERE', geo='state')

Pull total employment numbers for Massachusetts (FIPS code 25) in 2014:

ma_emp = state.get(metric='emp', code=25, time=2014)
print(ma_emp)
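
Because the BDS is a time series, a common next step is to pull the same metric for several years. The following is a minimal sketch and not part of CensusPy: `collect_by_year` is a hypothetical helper that accepts any callable with the `get(metric=..., code=..., time=...)` keyword signature shown above, and `fake_get` is a stand-in so the snippet runs without an API key or network access. It assumes that a year with no data comes back as `None`; check what the wrapper actually returns for your geography.

```python
def collect_by_year(get, metric, code, years):
    """Call get() once per year and return {year: value} for years with data."""
    results = {}
    for year in years:
        value = get(metric=metric, code=code, time=year)
        if value is None:  # assumption: a year with no data comes back as None
            continue
        results[year] = value
    return results


def fake_get(metric, code, time):
    """Stand-in for state.get; returns placeholder strings, not census data."""
    if time == 2013:
        return None  # simulate a year with no data
    return "{}-{}-{}".format(metric, code, time)


series = collect_by_year(fake_get, metric='emp', code=25, years=range(2012, 2015))
print(series)
```

With a real key, `state.get` from the quickstart would be passed in place of `fake_get`.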

Parameters (BDS)

  • metric (required)
  • code (conditionally required)
    • specify state or metro FIPS code
    • required only if the geographic level is not us
    • FIPS state codes
  • time (required)
    • specify time period
    • acceptable values range from 1976 to 2014
    • might not return results for every year if no data exist for the specified geography
  • sic1 (optional)
    • specify industry sector
    • default = 0 (all included)
    • options listed on BDS website
  • fage4 (optional)
    • specify firm age
    • default = 'm' (all included)
    • options listed on BDS website
  • fsize (optional)
    • specify firm size
    • default = 'm' (all included)
    • options listed on BDS website
  • ifsize (optional)
    • specify initial firm size
    • default = 'm' (all included)
    • options listed on BDS website
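
The rules above can be sketched as a small validation step. This is an illustration only, not CensusPy's own code: `build_bds_params` is a hypothetical helper, and it assumes the national geographic level is spelled `'us'`. The defaults mirror the values listed above.

```python
def build_bds_params(geo, metric, time, code=None,
                     sic1=0, fage4='m', fsize='m', ifsize='m'):
    """Validate BDS arguments against the documented rules and return a dict."""
    if not metric:
        raise ValueError("metric is required")
    if not 1976 <= int(time) <= 2014:
        raise ValueError("time must be between 1976 and 2014")
    # assumption: the national level is passed as the string 'us'
    if geo != 'us' and code is None:
        raise ValueError("code is required when geo is not 'us'")
    params = {
        'metric': metric,
        'time': int(time),
        'sic1': sic1,      # industry sector, 0 = all included
        'fage4': fage4,    # firm age, 'm' = all included
        'fsize': fsize,    # firm size, 'm' = all included
        'ifsize': ifsize,  # initial firm size, 'm' = all included
    }
    if code is not None:
        params['code'] = code
    return params


print(build_bds_params('state', 'emp', 2014, code=25))
```

Checking arguments before the request is sent gives a clear error message locally instead of an opaque response from the API.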

Other Documentation (BDS)

Annual Survey of Entrepreneurs (ASE)

Overview (ASE)

The Annual Survey of Entrepreneurs (ASE) supplements the 5-year Survey of Business Owners (SBO) program and provides more timely updates on the status, nature, and scope of women-, minority-, and veteran-owned businesses for 2014. The ASE has three sub-datasets:

  • Company Summary (CSA)
  • Characteristics of Businesses (CSCB)
  • Characteristics of Business Owners (CSCBO)

Quickstart (ASE)

Initialize the ASE object using your API key and the geographic level of the query, then specify the dataset that you want to access. In this example we will work with the Company Summary (CSA) dataset:

from censuspy import ase
# replace the placeholder string with your own Census API key
state = ase.csa(api_key='YOUR_API_KEY_HERE', geo='state')

Pull total employment numbers for Massachusetts (FIPS code 25) in 2014:

ma_emp = state.get(metric='emp', code=25)
print(ma_emp)

Overview (CSA)

Provides data for employer businesses by sector, gender, ethnicity, race, veteran status, years in business, receipts size of firm, and employment size of firm for the U.S., states, and the fifty most populous metropolitan statistical areas (MSAs).

Parameters (CSA)

Other Documentation (CSA)

Overview (CSCB)

Provides data for employer firms by sector, gender, ethnicity, race, veteran status, and years in business for the U.S., states, and fifty most populous MSAs, including detailed business characteristics.

Parameters (CSCB)

Other Documentation (CSCB)

Overview (CSCBO)

Provides data for owners of respondent employer firms by sector, gender, ethnicity, race, veteran status, and years in business for the U.S., states, and fifty most populous MSAs, including detailed owner characteristics.

Parameters (CSCBO)

Other Documentation (CSCBO)

Decennial Census Surnames Files (DCSF)

Overview (DCSF)

The Census Bureau's decennial census surnames files contain rank and frequency data on surnames reported 100 or more times in the decennial census, along with Hispanic origin and race category percentages. The latter are suppressed where necessary for confidentiality. The data focus on summarized aggregates of counts and characteristics associated with surnames and do not in any way identify specific individuals.

Quickstart (DCSF)

Initialize the DCSF object using your API key and the time parameter (2010 or 2000):

from censuspy import dcsf
# replace the placeholder string with your own Census API key
us2010 = dcsf.dcsf(api_key='YOUR_API_KEY_HERE', time=2010)

Pull the ranking and count of reported occurrences for "Smith" as a surname:

us2010_smith = us2010.get(metric='count', name="Smith")

# the wrapper will return a dictionary with three keys: metric, rank, and name
# metric will be whatever is passed in the metric parameter (count in this ex.)

print(us2010_smith['rank']) # will yield the rank of Smith
print(us2010_smith['metric']) # will yield the count
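
The returned dictionary can be turned into a readable summary with a few lines of code. The sketch below is not part of CensusPy: `describe_surname` is a hypothetical helper, and the sample dictionary holds placeholder values rather than census figures. Since unavailable lookups are documented as returning "N/A" without saying whether that applies to the whole result or to individual fields, the helper assumes either could happen.

```python
def describe_surname(result):
    """Turn a DCSF-style result into a one-line summary string."""
    # assumption: the whole result may be the string "N/A"
    if not isinstance(result, dict):
        return "no data ({})".format(result)
    # assumption: individual fields may also be "N/A"
    if "N/A" in (result.get('rank'), result.get('metric')):
        return "{}: no data".format(result.get('name'))
    return "{}: rank {}, value {}".format(
        result['name'], result['rank'], result['metric'])


# Placeholder values for illustration only; these are not census figures.
sample = {'metric': 1000, 'rank': 1, 'name': 'EXAMPLE'}
print(describe_surname(sample))
print(describe_surname("N/A"))
```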

Parameters (DCSF)

  • metric (required)
  • time (required)
    • specify time period
    • options include 2010 or 2000
  • name (conditionally required)
    • specify the surname you'd like to search for
    • will return "N/A" if surname is not available
  • rank (conditionally required)
    • specify a surname rank to search on
    • will return "N/A" if rank is not available
  • Either name or rank must be specified; otherwise the wrapper raises a ValueError for missing parameters
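
The name-or-rank rule can be illustrated with a short check. This is a sketch under stated assumptions and not the wrapper's actual implementation: `check_dcsf_args` is a hypothetical helper that only mirrors the rules listed above.

```python
VALID_TIMES = (2000, 2010)  # the two time periods listed above


def check_dcsf_args(metric, time, name=None, rank=None):
    """Validate DCSF arguments and return them as a dict of query parameters."""
    if not metric:
        raise ValueError("metric is required")
    if time not in VALID_TIMES:
        raise ValueError("time must be 2000 or 2010")
    if name is None and rank is None:
        raise ValueError("either name or rank must be specified")
    params = {'metric': metric, 'time': time}
    if name is not None:
        params['name'] = name
    if rank is not None:
        params['rank'] = rank
    return params


print(check_dcsf_args('count', 2010, name='Smith'))

try:
    check_dcsf_args('count', 2010)
except ValueError as err:
    print("rejected:", err)
```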

Other Documentation (DCSF)

Goals

Broadly speaking, my goal is to cover all the business-focused datasets before moving to the purely demographic data. The motivation is personal: I get direct value from developing this wrapper. That said, if there is significant interest in exposing a specific dataset, I'm more than happy to entertain that as well. Please feel free to send any requests to dnrkaseff360@gmail.com.

Roadmap:

  • Annual Survey of Entrepreneurs (March 2018) [DONE]
  • Decennial Census Surname Files (March 2018) [DONE]
  • County Business Patterns and Nonemployer Statistics (April 2018)
  • Economic Census (May 2018)
  • Economic Indicators (June 2018)

Changelog

  • 0.0.1: initial beta release
  • 0.0.2: hot fix to allow imports of specific database wrappers instead of having to import the entire package
  • 1.0.0: go live! added support for ASE and implemented minor code changes to make calls more efficient from a resource perspective
  • 1.1.0: added support for DCSF

License

MIT License

Copyright (c) 2018 DnrkasEFF

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.