lawfactory-utils

Python utils for The Law Factory parsers


Keywords
scraping, politics, data, lafabriquedelaloi
License
MIT
Install
pip install lawfactory-utils==0.2.2

Documentation

lawfactory-utils

A few utilities for the the-law-factory-parser project, shared by senapy and anpy.

  • A simple caching library:
from lawfactory_utils.urls import enable_requests_cache, download
enable_requests_cache()

.....

resp = download(url)
print(resp.text)

Warning: To download from Légifrance, you must set a LEGIFRANCE_PROXY environment variable pointing to a running instance of legifrance-proxy.
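As a rough illustration of the warning above, a script can check the variable before attempting any Légifrance download. This is only a sketch: the helper name legifrance_proxy_configured is hypothetical and not part of lawfactory_utils, and it only tests that the variable is non-empty, not that the proxy is reachable.

```python
import os


def legifrance_proxy_configured(environ=os.environ):
    """Return True when LEGIFRANCE_PROXY is set to a non-empty value.

    Hypothetical helper for illustration; not part of lawfactory_utils.
    """
    return bool(environ.get("LEGIFRANCE_PROXY", "").strip())


if not legifrance_proxy_configured():
    print("LEGIFRANCE_PROXY is not set; Légifrance downloads need it.")
```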

Cached responses are stored in the directory where this library is installed. You can use lawfactory_where_is_my_cache to print the path.
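To give an idea of what an on-disk response cache does, here is a minimal sketch using only the standard library. It is not the implementation used by lawfactory_utils: the function cached_fetch, its fetch callback and the one-file-per-URL layout are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def cached_fetch(url, fetch, cache_dir):
    """Return the body for `url`, calling `fetch(url)` only on a cache miss.

    Hypothetical helper for illustration; not part of lawfactory_utils.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # One file per URL, named after a hash of the URL so that any URL
    # maps to a safe filename.
    path = cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()
    if path.exists():
        return path.read_text(encoding="utf-8")
    body = fetch(url)
    path.write_text(body, encoding="utf-8")
    return body
```

Calling cached_fetch twice with the same URL and directory invokes fetch only once; the second call reads the stored file instead.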

  • URL cleaning for Sénat, Assemblée nationale (AN), Légifrance and Conseil constitutionnel URLs
>>> from lawfactory_utils.urls import clean_url
>>> clean_url('https://www.legifrance.gouv.fr/eli/loi/2017/9/15/JUSC1715752L/jo/texte')
'https://www.legifrance.gouv.fr/jorf/id/JORFTEXT000035567936'
  • Parsing of National Assembly URLs
>>> from lawfactory_utils.urls import parse_national_assembly_url
>>> parse_national_assembly_url("http://www.assemblee-nationale.fr/dyn/15/dossiers/retablissement_confiance_action_publique")
(15, 'retablissement_confiance_action_publique')
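For readers curious how such a URL can be split into a legislature number and a dossier slug, here is a minimal standard-library sketch. It is not the library's implementation: the function name split_dossier_url is hypothetical, and it assumes only the /dyn/<legislature>/dossiers/<slug> path shape seen in the example above, returning None for anything else.

```python
import re
from urllib.parse import urlparse

# Assumed path shape: /dyn/<legislature>/dossiers/<slug>
_DOSSIER_PATH = re.compile(r"^/dyn/(\d+)/dossiers/([^/?#]+)")


def split_dossier_url(url):
    """Return (legislature, slug) for a matching URL, or None otherwise.

    Hypothetical helper for illustration; not part of lawfactory_utils.
    """
    match = _DOSSIER_PATH.match(urlparse(url).path)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(split_dossier_url(
    "http://www.assemblee-nationale.fr/dyn/15/dossiers/"
    "retablissement_confiance_action_publique"
))
# prints (15, 'retablissement_confiance_action_publique')
```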