tf-word2vec

Word2Vec implementation with TensorFlow Estimators and Datasets


Keywords
word2vec, word, embeddings, tensorflow, estimators, datasets, tensorflow-datasets, tensorflow-estimators, tensorflow2
License
MIT
Install
pip install tf-word2vec==1.0.7

Documentation

Word2Vec


This is a re-implementation of Word2Vec relying on TensorFlow Estimators and Datasets.

Works with Python >= 3.6 and TensorFlow v2.0.
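Since the training loop is expressed through the tf.estimator and tf.data APIs, the general wiring looks roughly like the sketch below. This is not this package's actual code: the vocabulary size, the random (target, context) pairs and the NCE loss setup are placeholder assumptions, kept only to show how a skip-gram Estimator is fed by a Dataset.

import tensorflow as tf

VOCAB_SIZE = 50000  # assumed vocabulary size
EMBED_DIM = 300     # embedding dimensionality (cf. --size)
NEG_SAMPLES = 5     # negative samples per positive pair (cf. --neg)

def input_fn():
    # Real code would stream (target, context) pairs from the corpus;
    # random integer ids stand in for them here.
    targets = tf.random.uniform([10000], 0, VOCAB_SIZE, dtype=tf.int64)
    contexts = tf.random.uniform([10000, 1], 0, VOCAB_SIZE, dtype=tf.int64)
    dataset = tf.data.Dataset.from_tensor_slices((targets, contexts))
    return dataset.shuffle(10000).batch(128).repeat()

def model_fn(features, labels, mode):
    # Input embeddings: the vectors usually kept after training.
    embeddings = tf.compat.v1.get_variable(
        'embeddings', [VOCAB_SIZE, EMBED_DIM],
        initializer=tf.compat.v1.random_uniform_initializer(-1.0, 1.0))
    embedded = tf.nn.embedding_lookup(embeddings, features)
    # Output weights and biases for the NCE (negative sampling) loss.
    nce_weights = tf.compat.v1.get_variable(
        'nce_weights', [VOCAB_SIZE, EMBED_DIM],
        initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1))
    nce_biases = tf.compat.v1.get_variable(
        'nce_biases', [VOCAB_SIZE],
        initializer=tf.compat.v1.zeros_initializer())
    loss = tf.reduce_mean(tf.nn.nce_loss(
        weights=nce_weights, biases=nce_biases, labels=labels,
        inputs=embedded, num_sampled=NEG_SAMPLES, num_classes=VOCAB_SIZE))
    optimizer = tf.compat.v1.train.GradientDescentOptimizer(0.025)
    train_op = optimizer.minimize(
        loss, global_step=tf.compat.v1.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

estimator = tf.estimator.Estimator(model_fn, model_dir='/tmp/w2v-sketch')
estimator.train(input_fn, steps=100)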

Install

To install from source, after a git clone:

python3 setup.py install
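If the install succeeded, the w2v command-line entry point used in the examples below should be available on your PATH.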

Get data

You can download a sample of the English Wikipedia here:

wget http://129.194.21.122/~kabbach/enwiki.20190120.sample10.0.balanced.txt.7z
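Note that the sample is 7z-compressed; decompress it first (for example with p7zip: 7z x enwiki.20190120.sample10.0.balanced.txt.7z) to obtain the plain .txt file that the training command below expects.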

Train Word2Vec

w2v train \
  --data /absolute/path/to/enwiki.20190120.sample10.0.balanced.txt \
  --outputdir /absolute/path/to/word2vec/models \
  --alpha 0.025 \
  --neg 5 \
  --window 2 \
  --epochs 5 \
  --size 300 \
  --min-count 50 \
  --sample 1e-5 \
  --train-mode skipgram \
  --t-num-threads 20 \
  --p-num-threads 25 \
  --keep-checkpoint-max 3 \
  --batch 1 \
  --shuffling-buffer-size 10000 \
  --save-summary-steps 10000 \
  --save-checkpoints-steps 100000 \
  --log-step-count-steps 10000
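
Most hyperparameters mirror the original word2vec ones: --alpha is the initial learning rate, --neg the number of negative samples, --window the context window size, --size the embedding dimensionality, --min-count the frequency threshold below which words are discarded, --sample the subsampling threshold for frequent words, and --train-mode the training architecture (skipgram here). The checkpointing and logging flags (--keep-checkpoint-max, --save-summary-steps, --save-checkpoints-steps, --log-step-count-steps) presumably map onto the identically named tf.estimator.RunConfig options.

Training writes standard Estimator checkpoints to --outputdir, so the learned embedding matrix can be read back from the latest checkpoint. A minimal sketch, assuming the embedding variable is named 'embeddings' (an assumption; use the list_variables loop to find the real name and shape in your run):

import tensorflow as tf

model_dir = '/absolute/path/to/word2vec/models'
ckpt = tf.train.latest_checkpoint(model_dir)

# Print (name, shape) for every variable stored in the checkpoint.
for name, shape in tf.train.list_variables(ckpt):
    print(name, shape)

# Load a tensor by name; 'embeddings' is an assumed name, so pick the
# [vocab_size, size]-shaped variable from the listing above instead.
reader = tf.train.load_checkpoint(ckpt)
embeddings = reader.get_tensor('embeddings')
print(embeddings.shape)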