rbcz

library for interacting with Czech Raiffeisen Bank's text bank statements


Keywords: banking, raiffeisen, czech, cz
License: Other
Install: pip install rbcz==0.6

Documentation

rbcz.py

rbcz is a Python library for parsing the plain-text bank statements that Raiffeisen Bank sends out by email. It exposes a simple API that either parses statements stored on your local filesystem or searches your mailbox over IMAP and retrieves them.

Install

Either install from PyPI using pip:

$ pip install rbcz

or clone this repo, and install using setup.py:

$ git clone https://github.com/smcl/rbcz.py
$ cd rbcz.py
$ python setup.py install

Methods

There are three simple functions - read_statement, read_statements and read_statements_from_imap. To parse a single statement we can use the read_statement function, which takes a single parameter - the path to the bank statement on the local filesystem - and returns a Statement object:

import rbcz

statement = rbcz.read_statement("/path/to/stmt_january_czk.txt")

If we have a number of statements locally we can use read_statements, which accepts a list of filenames to parse and returns a list of Statement objects:

import rbcz

statement_filenames = [
    "stmt_jan_czk.txt",
    "stmt_feb_czk.txt",
    "stmt_mar_czk.txt"
]

statements = rbcz.read_statements(statement_filenames)
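Rather than typing each filename by hand, the list can be built from a directory. The following is a minimal sketch, assuming the statements were saved with a .txt extension as in the examples above. find_statement_files and load_statements are hypothetical helpers and not part of rbcz.

```python
from pathlib import Path


def find_statement_files(directory):
    """Return the paths of all .txt files in directory, sorted by name.

    Hypothetical helper (not part of rbcz); it assumes the statements were
    saved with a .txt extension.
    """
    return sorted(str(p) for p in Path(directory).glob("*.txt") if p.is_file())


def load_statements(directory):
    """Parse every statement file found in directory."""
    import rbcz  # imported here so find_statement_files works without rbcz

    return rbcz.read_statements(find_statement_files(directory))
```

Filtering on the extension keeps stray files (for example images or notes stored alongside the statements) out of the list passed to read_statements.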

If we don't have all our statements stored locally we can use read_statements_from_imap to connect to an IMAP server, search it for emails from the "info@rb.cz" address, download and parse the attachments, and return a list of Statement objects:

import rbcz

statements = rbcz.read_statements_from_imap("imap.gmail.com", "my.email.address@gmail.com", "password123", "inbox")
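A password hard-coded in a script is easy to leak. The sketch below reads the connection details from environment variables instead. The variable names (RBCZ_IMAP_HOST and so on) and both helper functions are assumptions made for illustration. Only read_statements_from_imap and its four positional arguments, as shown above, come from rbcz, and the fourth argument is presumed to be the mailbox folder.

```python
import os


def imap_settings(env=os.environ):
    """Collect IMAP connection details from environment variables.

    The variable names are hypothetical; pick whatever suits your setup.
    """
    required = ("RBCZ_IMAP_USER", "RBCZ_IMAP_PASSWORD")
    missing = [name for name in required if name not in env]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return (
        env.get("RBCZ_IMAP_HOST", "imap.gmail.com"),
        env["RBCZ_IMAP_USER"],
        env["RBCZ_IMAP_PASSWORD"],
        env.get("RBCZ_IMAP_FOLDER", "inbox"),  # presumed to be the mailbox folder
    )


def fetch_statements(env=os.environ):
    """Fetch and parse statements using settings taken from the environment."""
    import rbcz  # imported lazily so imap_settings can be used on its own

    host, user, password, folder = imap_settings(env)
    return rbcz.read_statements_from_imap(host, user, password, folder)
```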

Types

There are two types - Statement and Movement.

Statement

A Statement represents a monthly statement:

  • account_name - (string) the name of the main account holder (your name!)
  • account_number - (string) your account number
  • iban - (string) the IBAN of your account
  • currency - (string) the currency the account holds
  • number - (int) the number of the statement (your first statement will be 1)
  • from_date - (datetime) the opening date of the statement
  • to_date - (datetime) the closing date of the statement
  • opening_balance - (Decimal) the balance at the opening date of the statement
  • income - (Decimal) the income you've received during the statement's reporting period
  • expenses - (Decimal) the expenses you've paid out during the statement's reporting period
  • closing_balance - (Decimal) the balance at the closing date of the statement
  • blocked - (Decimal) amount ringfenced for payments out
  • receivable - (Decimal) amount received but yet to clear/settle
  • available_balance - (Decimal) amount of money available to withdraw at the closing date of the statement
  • movements - (List of Movement) the individual cash movements (payments in or out) during the reporting period
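As a rough illustration of working with these fields, the sketch below formats a one-line summary of a statement. It only reads attributes listed above. summarise is a hypothetical helper, and the SimpleNamespace object stands in for a real Statement, with made-up values, so that the example runs without a statement file.

```python
from datetime import datetime
from decimal import Decimal
from types import SimpleNamespace


def summarise(stmt):
    """Format a one-line summary from the Statement attributes listed above."""
    change = stmt.closing_balance - stmt.opening_balance
    return "#{} {:%Y-%m-%d}..{:%Y-%m-%d} {} {} (change {:+})".format(
        stmt.number, stmt.from_date, stmt.to_date,
        stmt.closing_balance, stmt.currency, change)


# Stand-in for a parsed Statement; every value here is made up.
example = SimpleNamespace(
    number=3,
    from_date=datetime(2017, 3, 1),
    to_date=datetime(2017, 3, 31),
    currency="CZK",
    opening_balance=Decimal("1000.00"),
    closing_balance=Decimal("1250.50"),
)

print(summarise(example))
```

With a real statement, the object returned by read_statement would be passed to summarise in place of example.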

Movement

A Movement is an individual transaction - for example an ATM withdrawal or Debit Card payment. Each Statement will have a list of Movement called movements for all the transactions during the reporting period. Each Movement has the following:

  • number - (int) id of the movement in the current statement
  • amount - (Decimal) the amount of the transaction
  • date_deducted - (datetime) the date the transaction was originally submitted
  • date_completed - (datetime) the date and time at which the transaction was finalised
  • counterparty_account_number - (string) the account the payment was sent to or received from
  • counterparty_details - (string) information about the account the payment was sent to or received from, if available
  • narrative - (string) additional information about the transaction
  • transaction_type - (string) what type of transaction occurred
  • specific_symbol - (string) specific symbol for movement
  • variable_symbol - (string) variable symbol for movement
  • constant_symbol - (string) constant symbol for movement
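Because amount is a Decimal, movements can be totalled without floating-point rounding. The following sketch groups movements by transaction_type and sums their amounts. totals_by_type is a hypothetical helper, and the stand-in movements and their transaction_type values are made up for illustration; real statements may use different type names.

```python
from collections import defaultdict
from decimal import Decimal
from types import SimpleNamespace


def totals_by_type(movements):
    """Sum Movement.amount for each distinct Movement.transaction_type."""
    totals = defaultdict(Decimal)  # Decimal() is zero, so sums start at 0
    for m in movements:
        totals[m.transaction_type] += m.amount
    return dict(totals)


# Stand-ins for parsed Movement objects; the values are made up.
movements = [
    SimpleNamespace(transaction_type="Card payment", amount=Decimal("-500.00")),
    SimpleNamespace(transaction_type="Transfer", amount=Decimal("2000.00")),
    SimpleNamespace(transaction_type="Card payment", amount=Decimal("-120.50")),
]

for kind, total in sorted(totals_by_type(movements).items()):
    print(kind, total)
```

With real data, statement.movements would be passed in place of the stand-in list.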

Example

The following script attempts to parse all the statements in the ./rb directory, then takes the closing balance and the high/low-water marks of each period and plots them on a graph.

#!/usr/bin/env python

# system/lib imports
import os
import matplotlib.pyplot as plt
from matplotlib.dates import YearLocator, MonthLocator, DateFormatter

# rbcz library
import rbcz

# load the statements and sort them by opening date
statement_dir = "./rb"
statements = sorted(
    rbcz.read_statements(
        [os.path.join(statement_dir, f) for f in os.listdir(statement_dir)]),
    key=lambda stmt: stmt.from_date)

# determine the high/low-water marks of the running balance for one statement
def high_low_water(stmt):
    bal = stmt.opening_balance
    hwm = bal
    lwm = bal
    for m in stmt.movements:
        bal += m.amount
        if bal > hwm:
            hwm = bal
        if bal < lwm:
            lwm = bal
    return (lwm, hwm)

# extract high/low-water marks (converted to float for plotting)
water_marks = [high_low_water(s) for s in statements]
low_water_marks = [float(wm[0]) for wm in water_marks]
high_water_marks = [float(wm[1]) for wm in water_marks]

# extract closing balances and dates
closing_balances = [float(s.closing_balance) for s in statements]
dates = [s.from_date for s in statements]

# plot the data; set_prop_cycle replaces set_color_cycle, which newer
# matplotlib releases no longer provide
fig, ax = plt.subplots()
ax.set_prop_cycle(color=['green', 'black', 'red'])
ax.plot(dates, high_water_marks, "o-")
ax.plot(dates, closing_balances, "o-")
ax.plot(dates, low_water_marks, "o-")

# fix up the axes
ax.xaxis.set_major_locator(YearLocator())
ax.xaxis.set_minor_locator(MonthLocator())
ax.xaxis.set_major_formatter(DateFormatter('%Y-%m-%d'))

ax.fmt_xdata = DateFormatter('%Y-%m-%d')
fig.autofmt_xdate()

# add a legend (labels follow the order of the plot calls above)
ax.legend(['highest', 'closing', 'lowest'], loc='upper left')

plt.show()

Depending on the content of the bank statements this will generate a graph like the following:

rbcz.png
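To check the high/low-water logic without real statements, the function from the script can be run against a stand-in object. The values below are made up. The sketch assumes that movements are listed in chronological order and that amount is negative for payments out, which the script above also relies on.

```python
from decimal import Decimal
from types import SimpleNamespace


def high_low_water(stmt):
    """Return (lowest, highest) running balance over the statement period."""
    bal = stmt.opening_balance
    lwm = hwm = bal
    for m in stmt.movements:
        bal += m.amount
        hwm = max(hwm, bal)
        lwm = min(lwm, bal)
    return (lwm, hwm)


# Stand-in statement: the balance runs 1000 -> 700 -> 1500 -> 300.
fake = SimpleNamespace(
    opening_balance=Decimal("1000"),
    movements=[SimpleNamespace(amount=Decimal(a)) for a in ("-300", "800", "-1200")],
)

print(high_low_water(fake))  # (Decimal('300'), Decimal('1500'))
```

Note that the closing balance here (300) is also the low-water mark, so on the chart the "closing" and "lowest" lines would coincide for such a period.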

TODO

  • get coverage to 100%
  • decide whether an error while parsing an IMAP statement should be swallowed, printed or raised as an exception
  • check if it's possible to improve the parsing - there are a LOT of regexes that I throw around and it's not pretty...
  • check if anyone I know gets Czech statements and see if we can parse them too. Are there any other languages - German?
  • check if it works for non-Czech-Republic Raiffeisen