pdfforms

Populate fillable pdf forms from csv data file


Keywords
pdf, data, forms
License
MIT
Install
pip install pdfforms==1.0.0

Documentation

pdfforms

pdfforms is a small utility for populating fillable pdf forms from a spreadsheet data source. It was created with the intent of filling US tax forms using tax data prepared with a spreadsheet, but should be equally applicable to other forms.

Features

  • Assigns numeric id for each field
  • Generates test pdf showing ids of text fields
  • Merges spreadsheet data into final filled pdf
  • Works with multiple spreadsheet formats
  • Can process multiple pdfs at a time
  • Can be used as a library or CLI
  • Optional rounding and number formatting

Requirements

pdfforms requires Python 3.5 or higher, pyexcel for data loading, and pdftk, which does all the real work.

Installation

To install: pip install pdfforms

Documentation

For complete documentation, see https://pdfforms.readthedocs.io/

Example

Let's say you have a spreadsheet with your tax calculations. You want to populate your tax forms with the data from the spreadsheet. pdfforms allows you to do so with the following steps:

  1. First pdfforms must inspect the forms to be filled. pdfforms will extract a list of fields in each of the specified documents. Each field is assigned a numeric id, and test documents are generated with filled forms, showing the id of each text field:

    $ pdfforms inspect f1040*.pdf
    f1040sse.pdf
    f1040sce.pdf
    f1040.pdf
    

    The filled test pdfs are stored by default in the test/ subdirectory.

  2. Browse the test pdf files and add the field numbers of the fields you need to fill to your spreadsheet. pdfforms only reads the first and third columns of the datafile. The first column should contain the name of the pdf file with the form to fill and the field numbers. The third column should contain the data to be written into the field. The rest of the sheet is ignored, so you can use it for notes, calculations, etc.

    pdfforms is case sensitive! The file name in the spreadsheet must match exactly the name of the pdf to be filled.

    Below is an example spreadsheet for a (fictional) 2016 tax return.

    f1040.pdf

    Form 1040

     

    2016

     

    3

    First Name and initial

    John Q

       

    4

    Last Name

    Public

       

    5

    SSN

    321546789

    321-54-6789

     

    6

    Spouse's Name

    Susie

       

    7

    Spouse's Last Name

    Public

       

    8

    SSN

    132458697

    132-45-8697

     

    9

    Address

    5776 Winding Ln

       

    11

     

    Springfield, MA

       

    18

    Filing status

    MJ

       

    24

    Exemption - self

    1

       

    25

    Exemption - spouse

    1

       

    27

    Dependent name

    Timothy Public

       

    28

    Dependent ssn

    531248680

    531-24-8680

     

    29

    Dependent relationship

    Son

       

    30

    Dependent under 17

    1

       

    31

    Dependent name

    Abigail Public

       

    32

    Dependent ssn

    428775031

    428-77-5031

     

    33

    Dependent relationship

    Daughter

       

    34

    Dependent under 17

    1

       

    45

    Line 6a

    2

       

    46

    Line 6c

    2

       

    49

    Line 6d

    4

       

    50

    Line 7

    60,000

    salaries

     

    52

    Line 8a

    124

    taxable interest

     

    64

    Line 12

    15,000

    business income

     

    92

    Line 22

    75,124

    total income

     

    102

    Line 27

    1,060

    half SE tax

     

    121

    Line 36

    1,060

       

    123

    Line 37

    74,064

    Adjusted Gross Income

     

    125

    Line 38

    74,064

       

    133

    Line 40

    12,600

    Standard Deduction

     

    135

    Line 41

    61,464

       

    137

    Line 42

    16,200

    Exemptions

    $ 4,050

    139

    Line 43

    45,264

    Taxable income

     

    145

    Line 44

    4,528

    Tax

     

    151

    Line 47

    4,528

       

    161

    Line 52

    2,000

    Child Tax Credit

     

    171

    Line 55

    2,000

    Total Credits

     

    173

    Line 56

    2,528

       

    175

    Line 57

    2,119

    Self-employment tax

     

    196

    Line 63

    4,647

    Total Tax

     

    198

    Line 64

    8,688

    Tax withheld

     

    225

    Line 74

    8,688

    Total Payments

     

    227

    Line 75

    4,041

    Amount you overpaid

     

    230

    Line 76a

    4,041

    Amount you want refunded

     

    232

    Line 76b

    123654789

    Routing Number

     

    234

    Line 76c

    Savings

    Account Type

     

    235

    Line 76d

    135724

    Account Number

     

    247

    Occupation

    Salesman

       

    248

    Daytime phone number

    413-555-1212

       

    249

    Spouse's Occupation

    Artist

       
             

    f1040sce.pdf

    Schedule C-EZ

         

    0

    Name

    Susie Public

       

    1

    SSN

    132-45-8697

       

    9

    Line F

    2

    No

     

    2

    Line A

    Artist

    Principle business or profession

     

    3

    Line B

    711510

    Business Code

     

    13

    Line 1

    22,000

    gross receipts

     

    15

    Line 2

    7,000

    total expenses

     

    17

    Line 3

    15,000

    net profit

     
             

    f1040sse.pdf

    Form SE - Section A Short Schedule SE

         

    0

    Name

    Susie Public

       

    1

    SSN

    132-45-8697

       

    6

    Line 2

    15,000

       

    8

    Line 3

    15,000

    92.35%

     

    10

    Line 4

    13,853

    15.30%

     

    12

    Line 5

    2,119

    50.00%

     

    14

    Line 6

    1,060

       

    The test pdfs do not show field numbers for checkboxes. Currently the only way to fill checkboxes is to examine the fields.json file and find the field number and allowed values of the checkbox.

  3. Once the file name and field numbers have been added to your spreadsheet, save the spreadsheet as a csv file and fill the forms:

    $ pdfforms fill mydata.csv
    f1040sse.pdf
    f1040sce.pdf
    f1040.pdf
    

    The final, populated pdf files are saved by default to the filled/ subdirectory.