scrape-glosbe-dict

Scrape Glosbe dictionaries for the words listed in a head word file


License: MIT

Install: pip install scrape-glosbe-dict==0.1.1

Documentation

Scrape a glosbe dict

Install it

pip install scrape-glosbe-dict

# pip install git+https://github.com/ffreemt/scrape-glosbe-dict
# poetry add git+https://github.com/ffreemt/scrape-glosbe-dict
# git clone https://github.com/ffreemt/scrape-glosbe-dict && cd scrape-glosbe-dict

Use it

scrape-glosbe-dict head-word-file  # default english-chinese

# or python -m scrape_glosbe_dict head-word-file

# scrape-glosbe-dict head-word-file -f de  # german-chinese

Head word file format: one word or phrase per line; empty lines are ignored.

Output is saved to a TSV file.
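
To make the input and output formats concrete, here is a minimal sketch using only the Python standard library. It is not the package's own code: the function names `read_head_words` and `to_tsv` are hypothetical, and the two-column layout (head word, definition) is an assumption, since the actual TSV columns are not described here.

```python
import csv
import io


def read_head_words(text: str) -> list[str]:
    """Return one head word or phrase per non-empty line, stripped."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def to_tsv(rows: list[tuple[str, str]]) -> str:
    """Serialize (head word, definition) pairs as tab-separated lines."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


# A head word file with blank and whitespace-only lines mixed in.
sample = "apple\n\ntake off\n   \nbook\n"
words = read_head_words(sample)

# Placeholder definitions stand in for what would be fetched from Glosbe.
tsv = to_tsv([(w, f"<definition of {w}>") for w in words])
print(words)
print(tsv, end="")
```

Phrases such as "take off" stay on a single line, so each line of the head word file maps to one row of output.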

Docs

python -m scrape_glosbe_dict --help
Usage: python -m scrape_glosbe_dict [OPTIONS] head-word-file

Arguments:
  head-word-file  Head word file, one word/phrase per line, each will be used
                  to fetch corresponding definitons from https://glosbe.com/.
                  [required]

Options:
  -f, --from-lang TEXT  Source language, check https://glosbe.com/ for valid
                        value, e.g. https://glosbe.com/en/zh implies
                        from_lang='en'.  [default: en]
  -t, --to-lang TEXT    Target language, check https://glosbe.com/ for valid
                        value, e.g. https://glosbe.com/en/zh implies
                        to_lang='zh'.  [default: zh]
  -v, --verbose         Show output in the process.
  -V, --version         Show version info and exit.
  --help                Show this message and exit.
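
As a rough illustration of how the two language options relate to Glosbe URLs, the sketch below builds a lookup URL from a head word. Only the https://glosbe.com/en/zh prefix comes from the help text above; the per-word path layout and the helper name `glosbe_url` are assumptions, not something taken from the package source.

```python
from urllib.parse import quote


def glosbe_url(word: str, from_lang: str = "en", to_lang: str = "zh") -> str:
    """Build a lookup URL; the /<from>/<to>/<word> layout is an assumption."""
    # Percent-encode the head word so phrases with spaces remain valid URLs.
    return f"https://glosbe.com/{from_lang}/{to_lang}/{quote(word, safe='')}"


print(glosbe_url("take off"))              # defaults mirror -f en -t zh
print(glosbe_url("Haus", from_lang="de"))  # mirrors -f de
```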

Miscellany

  • Fetching from Glosbe is wrapped in a built-in retry mechanism (via the PyPI package tenacity). Refer to the source file for details.
  • Results are cached locally (via the PyPI package joblib), so you can interrupt a run at any time and continue later.
  • Scraping is often frowned upon and can sometimes get your IP banned from the website. Use this package at your own discretion.
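
The retry-plus-cache idea from the first two points can be sketched as follows. This is a standard-library stand-in, not the package's implementation, which relies on tenacity and joblib. All names here (`with_retry`, `cached_lookup`, `flaky_fetch`) are hypothetical, and the fetcher is simulated so that no network access is needed.

```python
import json
import tempfile
import time
from pathlib import Path


def with_retry(func, attempts=3, delay=0.0):
    """Call func() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)


def cached_lookup(word, fetch, cache_dir, attempts=3):
    """Return the cached result for `word`, fetching (with retries) on a miss."""
    # Hex-encode the word so any phrase maps to a safe file name.
    path = Path(cache_dir) / f"{word.encode('utf-8').hex()}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    result = with_retry(lambda: fetch(word), attempts=attempts)
    path.write_text(json.dumps(result), encoding="utf-8")
    return result


calls = {"n": 0}


def flaky_fetch(word):
    """Hypothetical fetcher: fails on its first call, then succeeds."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise ConnectionError("simulated network failure")
    return {"word": word, "definition": f"<definition of {word}>"}


cache_dir = tempfile.mkdtemp()
first = cached_lookup("apple", flaky_fetch, cache_dir)   # one failure, then success
second = cached_lookup("apple", flaky_fetch, cache_dir)  # served from the disk cache
print(first == second, calls["n"])
```

Because each result is written to disk as soon as it is fetched, an interrupted run only needs to fetch the words that are not yet cached when it is restarted.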