Learn node embeddings with Theano (on the GPU if you have a good one), using minibatch training and AdaGrad.
Install by running
python setup.py install
or
pip install graph2vec
Data should be in space-delimited files describing edges, either as
from_node to_node distance or as from_node to_node without the path distance. For example, in a text file like
0 1 1 3 14 21 21 1
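To make the two accepted line shapes concrete, here is a minimal sketch of reading such a file by hand. It is not graph2vec's own parser: the function name `parse_edge_lines`, the integer node ids, and the default distance of 1.0 for lines without a distance are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[int, int, float]


def parse_edge_lines(lines: Iterable[str], default_distance: float = 1.0) -> List[Edge]:
    """Parse 'from_node to_node [distance]' lines into (from, to, distance) tuples.

    Hypothetical helper: integer node ids and the default distance are
    assumptions, not documented graph2vec behaviour.
    """
    edges = []
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) == 2:
            # Edge given without a path distance.
            source, target = fields
            distance = default_distance
        elif len(fields) == 3:
            # Edge given with an explicit path distance.
            source, target = fields[:2]
            distance = float(fields[2])
        else:
            raise ValueError(
                f"line {line_number}: expected 2 or 3 fields, got {len(fields)}"
            )
        edges.append((int(source), int(target), distance))
    return edges


# Made-up sample lines mixing both shapes.
sample = ["0 1 1", "1 2", "", "2 3 2.5"]
print(parse_edge_lines(sample))
```

Passing an open file object instead of a list works the same way, since the function only iterates over lines.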
Use the helper trainer module to build/load the graph.
from graph2vec.trainer import Graph2Vec
graph2vec = Graph2Vec(vector_dimensions=128)
graph2vec.parse_graph('edge.data', extend_paths=2)
The extend_paths argument dictates the length of paths included in the cost function. Note that only the shortest path between nodes is included. Fit the vectors with
Don't go too crazy with the batch size: you'll get a speedup, but convergence will be erratic. The vectors, both as origin and as destination nodes, are stored in the model object.
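Returning to the extend_paths argument: one way to picture "only the shortest path between nodes, up to a given length" is a breadth-first search with a hop cutoff. The sketch below is illustrative only and is not graph2vec's implementation. It assumes directed edges, counts path length in hops rather than summed distances, and uses the hypothetical names `shortest_hops` and `max_hops`.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, Tuple


def shortest_hops(
    edges: Iterable[Tuple[int, int]], max_hops: int
) -> Dict[Tuple[int, int], int]:
    """Map (origin, destination) to the shortest directed hop count, up to max_hops.

    Hypothetical helper: treating path length as a hop count is an assumption.
    """
    neighbours = defaultdict(list)
    for source, target in edges:
        neighbours[source].append(target)

    result = {}
    for origin in list(neighbours):
        hops = {origin: 0}  # nodes reached so far and their distance from origin
        queue = deque([origin])
        while queue:
            node = queue.popleft()
            if hops[node] == max_hops:
                continue  # do not expand past the cutoff
            for nxt in neighbours.get(node, ()):
                if nxt not in hops:
                    # BFS reaches each node first via a shortest path.
                    hops[nxt] = hops[node] + 1
                    result[(origin, nxt)] = hops[nxt]
                    queue.append(nxt)
    return result


# Made-up toy graph: 0 -> 1 -> 2 -> 3, plus a shortcut 0 -> 2.
toy_edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
print(shortest_hops(toy_edges, max_hops=2))
```

With a cutoff of 1 only the direct edges remain. With a cutoff of 2, pairs such as (0, 3) appear, and (0, 2) keeps its one-hop distance because the longer route through node 1 is not a shortest path.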