InsolvencyAnnouncementsGer

InsolvencyAnnouncementsGer is a Python library for searching, viewing and scraping public announcements of German bankruptcy courts from https://www.insolvenzbekanntmachungen.de


Keywords
accounting, auditing, bankruptcy, german-bankruptcy-courts, insolvency, insolvenz, policy-making, regulatory, research, trr266
License
MIT
Install
pip install InsolvencyAnnouncementsGer==0.2.1

Documentation

Table of Content

InsolvencyAnnouncementsGer

InsolvencyAnnouncementsGer is a Python library for searching, viewing and scraping public announcements of German bankruptcy courts from https://www.insolvenzbekanntmachungen.de.

In order to comply with governmental directives as well as in order to progress with harmonising European insolvency data the online portal saw changes in 2021. As a result the released announcements of the insolvency courts on insolvency proceedings are not provided over a single source anymore. Publications on insolvency proceedings, which have been initiated in 2017 or earlier, are offered through "alt.insolvenzbekanntmachungen.de". And publications on insolvency proceedings, which have been initiated in 2018 or later, are now offered through "neu.insolvenzbekanntmachungen.de". As of now, the library continues to support "alt.insolvenzbekanntmachungen.de".

Please note: Downtime of the by the library accessed German justice portal may occur and also changes of the official register may affect the functionality of the library. The recent change of the online portal of German bankruptcy courts described above is outlined to be the first of multiple phases.

Background

The library was written by the author for research purposes of the Institute of Accounting and Auditing of the Humboldt University of Berlin and the TRR 266 Accounting for Transparency, a trans-regional Collaborative Research Center funded by the German Research Foundation (Deutsche Forschungsgemeinschaft – DFG) as part of its Open Science Data Center under the supervision of Prof. Dr. Joachim Gassen.

In this context the library's output aims to contribute to transparent research and, through the collection of field data, which is used to analyze the perception, processing and handling of accounting information, to evidence-based policy making. A use case example in education: The output of this library was used in a class project of the PhD Class VHB-ProDok "Quantitative Empirical Accounting Research and Open Science Methods" in order to derive quasi-experiments, to analyse the negative liquidity shock caused by the corona crisis and the effects of the decision of German regulators to temporarily disable the debtor's statutory obligation to file for insolvency as a measure to combat the COVID-19 headwinds.

Intended Audience: Science/Research

The library's target audience is primarily researchers. The library also intends to diminish the barriers non-German speaking researchers may face working with the German justice portal of interest.

Installation

The library is available through The Python Package Index (PyPI).

To install the library InsolvencyAnnouncementsGer through the package manager pip run:

pip install InsolvencyAnnouncementsGer

Alternatively, the library can be installed through the author's Github repository with:

pip install git+https://github.com/NDelventhal/InsolvencyAnnouncementsGer

Requirements

The following libraries are required:

  • pandas
  • requests
  • beautifulsoup4

The package manager pip can be used to install these:

pip install pandas requests beautifulsoup4

Usage

import InsolvencyAnnouncementsGer as ia
ia.regcourts_scr() 

Returns a Pandas DataFrame listing German registrations courts and corresponding register types.

ia.inscourts_scr()

Returns a Pandas DataFrame containing insolvency courts and German state abbreviations.

ia.insol_proc_scrprep()

Prepares arguments prior to the insolvency proceedings scraping. Requires user input to define search criteria and returns a Pandas DataFrame containing scraped content in case of findings. Refer to the docstring of insol_proc_scr() for more information on the output.

ia.insol_proc_scr(reg = ["HRA", "HRB"], state = "Berlin",date_from = "30.08.2020", date_to = "", name = "",
                  domicile = "", department_number = "", register_reference = "", seq_number  = "", year = "",
                  reg_court = "", reg_number = "", subject = "", search_type = "unlimited", ins_court = "",
                  scrape_html = True)

Returns scraped search results for all insolvency announcements of the register types 'HRA' and 'HRB' from the specified date ('date_from') in the form day-month-year (DD.MM.YYYY) in the state Berlin. The unlimited search (search_type = "unlimited") is limited to data released within the last two weeks. Returns search results according to entered arguments, search arguments may be defined with the help of ia.insol_proc_scrprep(). Returns Pandas DataFrame containing scraped content in case of findings. Data columns contain the scrape date, the selected registry type, the URL of the scraped announcement, the scraped hyperlink information for each observation and optional the scraped website content of the announcement.

ia.insol_proc_scrpar(df, url = "url", scraped_html= "", convert_html_to_text = True, register_type = False):

Parses the scraped insolvency proceedings announcements, the Pandas DataFrame output from insol_proc_scr() or insol_proc_scrprep(). Returns the Pandas DataFrame with appended columns listing for each announcement as variables the corresponding insolvency court, the insolvency court abbreviation, the court file number, the name or firm name of the debtor, the domicile of the debtor, the subject of the announcement, the registration court, the identified register type (optional), the register number, the German state abbreviation, the date, timestamp and the scraped_text (optional)

ia.update_url(url) 

Updates a single scraped url of an announcement, in case it turned invalid.

ia.insol_ann_state_summary(subject= "Protective measures", date_from = "24.10.2020",  date_to = "28.10.2020"):

Returns a summary overview of counts of the announcements associated with the specified subject (example: "Protective measures") by German state and register type as well as non-register linked annoucements of the specified date range.

For more details please refer to the functions' docstrings.

Support

More information on the insolvency announcement data is available under the followings links of the used data source:

Data protection and online privacy

The library scrapes data from the official register of the German justice portal. According to the German justice portal the following information on an access of the contents from insolvenzbekanntmachungen.de is stored for six weeks, before the data is made anonymous and is further solely used for statistical purposes:

  • the name of the file requested
  • the date and time of the request
  • the quantity of data transmitted
  • the error status
  • the IP address of the accessing computer

Please refer to https://alt.insolvenzbekanntmachungen.de/datenschutz.html for further information.

Roadmap

  • Development (Q3 2020)
  • Add library to The Python Package Index (PyPI) (Q4 2020)

Please also check the open issues for other proposed features.

Contributing

Contributions are welcome. Please do not hesitate to open an issue or pull request. Prior to pull requests containing major changes, please communicate these changes via a new issue. Please try to avoid duplicates and have a look at the open issues for a list of known issues and proposed changes prior to it.

Acknowledgments

I would like to thank Prof. Dr. Joachim Gassen for his supervision of this library during its development and for his contribution to the code.

See the list of contributors who participated in this project here.

License

This project is licensed under the MIT License.

Contacts

  • The author: Niall Delventhal - ni.delventhal@gmail.com

  • Institute of Accounting and Auditing, School of Business and Economics - Humboldt-Universität zu Berlin: wpruefung@wiwi.hu-berlin.de

  • Project Link Accounting for Transparency: Find the contact details of the TRR 266‘s participating institutions here