This little tool will fetch and display real-time departure times for VBB/BVG public transport lines for a single stop in Berlin and Brandenburg, Germany. Here, VBB is the "Verkehrsverbund Berlin-Brandenburg" and BVG is the "Berliner Verkehrsbetriebe".
This tool was partly developed as an instructive example (although a toy one) of using Pandas and producing some output to be fed into a web-based dashboard like those one can create with Dashing (to be done).
Here is an example output for "Möckernbrücke" (including the optional header,
enabled with --header) on the command-line:
$ vbbvg --stop Möckernbrücke --header
Now: 14:06:04
Stop-Name: U Möckernbrücke (Berlin)
Stop-ID: 9017104

  Wait    Departure  Line    Destination
------  -----------  ------  ----------------------------
 00:48        14:07  U1      U Uhlandstr. (Berlin)
 02:48        14:09  U7      S+U Rathaus Spandau (Berlin)
 02:48        14:09  U1      S+U Warschauer Str. (Berlin)
 04:48        14:11  U7      U Rudow (Berlin)
This shows the waiting and departure times (in MM:SS and HH:MM format,
respectively) for one stop, limited to the earliest departure for each unique
combination of the Line and Destination columns.
This is usually all one wants to know just "before leaving the office",
which is a typical use-case.
This tool filters these combinations, calculates the waiting times and inserts
them as the first column in the output.
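The filtering and waiting-time steps described above can be sketched with Pandas roughly as follows. This is only an illustration under stated assumptions, not the tool's actual code: the helper name add_wait_column, the sample rows and the fixed "now" timestamp are all made up for the example, and the rows are assumed to be sorted by departure time already.

```python
import pandas as pd

# Hypothetical raw departures as they might look after scraping,
# already sorted by departure time.
raw = pd.DataFrame(
    [
        ("14:07", "U1", "U Uhlandstr. (Berlin)"),
        ("14:09", "U7", "S+U Rathaus Spandau (Berlin)"),
        ("14:09", "U1", "S+U Warschauer Str. (Berlin)"),
        ("14:11", "U7", "U Rudow (Berlin)"),
        ("14:12", "U1", "U Uhlandstr. (Berlin)"),  # later duplicate, dropped below
    ],
    columns=["Departure", "Line", "Destination"],
)


def add_wait_column(df, now):
    """Keep the earliest row per (Line, Destination) and prepend a Wait column."""
    first = df.drop_duplicates(subset=["Line", "Destination"], keep="first").copy()
    # Departures are HH:MM strings, assumed to be on the same day as `now`
    # (no handling of departures after midnight in this sketch).
    stamps = first["Departure"].map(lambda hhmm: "%s %s" % (now.date(), hhmm))
    departures = pd.to_datetime(stamps, format="%Y-%m-%d %H:%M")
    seconds = (departures - now).dt.total_seconds().astype(int)
    # Format the remaining seconds as MM:SS.
    wait = seconds.map(lambda s: "%02d:%02d" % divmod(int(s), 60))
    first.insert(0, "Wait", wait)
    return first


now = pd.Timestamp("2016-01-01 14:06:12")  # fixed time to keep the example reproducible
result = add_wait_column(raw, now)
print(result.to_string(index=False))
```

Using drop_duplicates with keep="first" relies on the input order, which is why the sketch assumes sorted rows; unsorted input would need a sort_values("Departure") call first.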
There are quite a few other command-line options which you can find out more
about by typing
python vbbvg -h.
Installation and Test
You can clone the repository and install it via pip. After
installing, you will have access to vbbvg system-wide (or in your virtualenv),
either programmatically or via the CLI.
pip install -e .
There is a list of dependencies in the file requirements.txt
(for more about them please read the next section) which you can install
with the command
pip install -r requirements.txt.
To run the little "test suite", download and unpack this repository or
clone it, and run the command
python setup.py test in the unpacked archive.
Of course this needs the
pytest package to be available (not listed in
the requirements, but easy to install with
pip install pytest).
Dependencies
This tool was partly developed as an instructive example of using Pandas for
greatly simplifying the code needed for working with tables (DataFrames).
Parser packages such as html5lib are optional dependencies of Pandas,
but needed for
pandas.read_html(), which fails if they are not installed.
Output on the command-line is created with packages such as
tabulate, which saves a great amount of code one would otherwise have to write.
System Requirements (Linux)
As one user pointed out, on Linux you might have to install the following packages manually:
sudo apt-get install libxml2-dev libxslt1-dev
And if you run into a
"/usr/bin/ld: cannot find -lz" error, consider
installing this package before retrying the installation:
sudo apt-get install lib32z1-dev
Since VBB/BVG have no API for real-time data access, this data is fetched (scraped using Pandas, yes!) from a web application on http://mobil.bvg.de. You can use this page for testing manually (in English). There, as a real person, you can enter part of a stop name and get a list of matching stops to choose from before you get to see the one result table you are interested in.
To avoid multi-level scraping, speed things up, and add some more thrills,
a small part of an existing "Open Data" VBB database, published under the
CC-BY 3.0 license,
is used to look up the stop names and IDs of the VBB/BVG public transport
network (a simple CSV file).
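A lookup of stop names and IDs in such a CSV file could be sketched like this, using only the standard library. The column names stop_id and stop_name, the helper find_stops and the name given for stop 9003201 are assumptions made for the example; the real file's layout may differ. The two stop IDs are the ones that appear elsewhere in this README.

```python
import csv
import io

# Hypothetical excerpt of a stops table; the real column names may differ.
STOPS_CSV = """stop_id,stop_name
9017104,U Möckernbrücke (Berlin)
9003201,S+U Berlin Hauptbahnhof
"""


def find_stops(query, csv_text=STOPS_CSV):
    """Return (stop_id, stop_name) pairs matching the query.

    A row matches if the query equals its ID or is contained in its
    name, ignoring case.
    """
    needle = query.casefold()
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["stop_id"], row["stop_name"])
        for row in reader
        if needle == row["stop_id"] or needle in row["stop_name"].casefold()
    ]


print(find_stops("möckernbrücke"))
print(find_stops("9003201"))
```

Resolving names locally like this avoids the intermediate "pick a stop from the list" page, so only the final result table needs to be fetched.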
The resulting tables are output as "real" tables in various formats on the command-line, see usage examples below.
Usage
You can run a few sample command-line calls using the options --test and
--stop <NAME_ID> for a given stop name or ID, like this for the stop Möckernbrücke:
$ vbbvg --test --stop Möckernbrücke
The main function to use programmatically is
vbbvg.get_next_departures(), which returns a Pandas
DataFrame object that you can convert to almost
anything you like. See the following examples:
Get departures of S7 and S75 from Berlin main station:
In : import vbbvg
In : df = vbbvg.get_next_departures('9003201', filter_line='S7')
In : df.columns
Out: Index([u'Wait', u'Departure', u'Line', u'Destination'], dtype='object')
In : list(df.to_records())
Out:
[(1, '00:00', u'10:01', u'S75 (Gl. 16)', u'S Westkreuz (Berlin)'),
 (4, '01:10', u'10:03', u'S75 (Gl. 15)', u'S Wartenberg (Berlin)'),
 (14, '04:10', u'10:06', u'S7 (Gl. 16)', u'S Potsdam Hauptbahnhof'),
 (24, '07:10', u'10:09', u'S7 (Gl. 15)', u'S Ahrensfelde Bhf (Berlin)'),
 (62, '21:10', u'10:23', u'S75 (Gl. 15)', u'S Ostbahnhof (Berlin)')]
In : print(df.to_csv())
,Wait,Departure,Line,Destination
1,00:00,10:01,S75 (Gl. 16),S Westkreuz (Berlin)
4,01:10,10:03,S75 (Gl. 15),S Wartenberg (Berlin)
14,04:10,10:06,S7 (Gl. 16),S Potsdam Hauptbahnhof
24,07:10,10:09,S7 (Gl. 15),S Ahrensfelde Bhf (Berlin)
62,21:10,10:23,S75 (Gl. 15),S Ostbahnhof (Berlin)
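Once you have such a DataFrame, further processing is plain Pandas. The following sketch rebuilds the sample result shown above by hand, so it runs without network access and without vbbvg itself; it then narrows the rows to the S7 only and produces output without the leftmost index column. The variable names are made up for the example.

```python
import pandas as pd

# Rebuild the sample result from the session above by hand.
df = pd.DataFrame(
    {
        "Wait": ["00:00", "01:10", "04:10", "07:10", "21:10"],
        "Departure": ["10:01", "10:03", "10:06", "10:09", "10:23"],
        "Line": [
            "S75 (Gl. 16)",
            "S75 (Gl. 15)",
            "S7 (Gl. 16)",
            "S7 (Gl. 15)",
            "S75 (Gl. 15)",
        ],
        "Destination": [
            "S Westkreuz (Berlin)",
            "S Wartenberg (Berlin)",
            "S Potsdam Hauptbahnhof",
            "S Ahrensfelde Bhf (Berlin)",
            "S Ostbahnhof (Berlin)",
        ],
    },
    index=[1, 4, 14, 24, 62],
)

# Keep only the S7 rows: the pattern requires a space after "S7",
# which excludes the S75.
s7_only = df[df["Line"].str.match(r"S7 ")]

# A plain list of dicts, convenient for JSON-based dashboards.
records = s7_only.to_dict(orient="records")

# CSV output without the leftmost index column.
csv_text = df.to_csv(index=False)

print(records)
print(csv_text)
```

Passing index=False is one way to get rid of the index numbers in exported output until the tool drops them itself.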
When using this tool inside some kind of web-based dashboard like those
created by Dashing (which was the originally intended use-case),
one should use a stop's ID to be sure to address a unique stop on the
VBB/BVG public transport network. You can find out the IDs by running
test queries on the command-line; the output with --header includes the Stop-ID.
To Do
- mention http://fahrinfo.vbb.de/bin/stboard.exe/en? (provides some more filtering features)
- add more examples in the Usage section above
- make the code polyglot, running not only on Python 2.7 but also 3.4/3.5
- test option to filter specific line types like S-Bahn ('S.*') or single lines ('U7')
- use in some real dashboard like those of dashing.io (the original purpose!)
- mention that case is ignored in the whole tool for all stop names
- store the last displayed stop (in ~/.vbbvg or so) and reuse it when called without any args/options
- remove index numbers (leftmost column) from result tables when used programmatically
Due to time limitations any help is welcome with any of the items above.