# DeepNeighbor

Embedding-based Retrieval for ANN Search and Recommendations!

View Demo · Report Bug · Request Feature
## Install

```shell
pip install deepneighbor
```
## How To Use

```python
from deepneighbor import Embed

model = Embed(data_path, model='gat')
model.train()  # see optional parameters below
model.search(seed='Louis', k=10)  # ANN search
embeddings = model.get_embeddings()  # dictionary. key: node; value: n-dim node embedding
```
## Input Format

The input to `Embed()` should be the path to a `.csv` or `.txt` file (e.g. `'data/data.csv'`) with two columns, in order: `user` and `item`. For each user, the items should be ordered by time.
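For illustration, a toy input file in this format can be written with the standard library (the users and items below are made up; only the two-column layout matters):

```python
import csv

# Each row is one (user, item) interaction; for each user the items
# appear in chronological order, as recommended above.
rows = [
    ("Louis", "item_a"),
    ("Louis", "item_b"),
    ("Marie", "item_b"),
    ("Marie", "item_c"),
]

with open("data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["user", "item"])  # header: user, item
    writer.writerows(rows)
```

The resulting `data.csv` path can then be passed to `Embed()` as `data_path`.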
## Models & Parameters in `Embed()`

- Word2Vec: `w2v`
- Graph attention network: `gat`
- Factorization Machines: `fm`
- Deep Semantic Similarity Model
- Siamese Network with triplet loss
- DeepWalk
- Graph convolutional network
- Neural Graph Collaborative Filtering: `ngcf`
- Matrix factorization: `mf`
## Model Parameters

### word2vec

```python
model = Embed(data, model='w2v')
model.train(window_size=5,
            workers=1,
            iter=1,
            dimensions=128)
```

- `window_size`: Skip-gram window size.
- `workers`: Number of worker threads used to train the model (faster training on multicore machines).
- `iter`: Number of iterations (epochs) over the corpus.
- `dimensions`: Dimensionality of the node embeddings.
### graph attention network

```python
model = Embed(data, model='gat')
model.train(window_size=5,
            learning_rate=0.01,
            epochs=10,
            dimensions=128,
            num_of_walks=80,
            beta=0.5,
            gamma=0.5)
```

- `window_size`: Skip-gram window size.
- `learning_rate`: Learning rate for optimizing the graph attention network.
- `epochs`: Number of gradient descent iterations.
- `dimensions`: Dimensionality of the embedding for each node (user/item).
- `num_of_walks`: Number of random walks.
- `beta` and `gamma`: Regularization parameters.
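To give intuition for `num_of_walks`: walk-based models sample random walks over the user-item graph and feed them to a skip-gram model as sentences. A minimal sketch of walk sampling (the graph, helper function, and `walk_length` below are illustrative, not DeepNeighbor internals):

```python
import random

def generate_walks(adjacency, num_of_walks, walk_length, seed=0):
    """Sample num_of_walks fixed-length random walks from every node."""
    rng = random.Random(seed)  # seeded for reproducibility
    walks = []
    for _ in range(num_of_walks):
        for start in adjacency:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

# Toy bipartite user-item graph, with edges stored in both directions.
graph = {
    "Louis": ["item_a", "item_b"],
    "Marie": ["item_b"],
    "item_a": ["Louis"],
    "item_b": ["Louis", "Marie"],
}

walks = generate_walks(graph, num_of_walks=2, walk_length=4)
```

Raising `num_of_walks` yields more training sequences per node, at the cost of longer training time.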
## How To Search

```python
model.search(seed, k)
```

- `seed`: The query node whose neighbors are retrieved.
- `k`: Number of nearest neighbors to return.
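Conceptually, the search ranks nodes by similarity between their embeddings. A minimal sketch using cosine similarity over an embeddings dictionary shaped like the one `get_embeddings()` returns (the vectors and `knn_search` helper are made up for illustration; DeepNeighbor's actual distance metric and index may differ):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def knn_search(embeddings, seed, k):
    """Return the k nodes whose embeddings are most similar to the seed's."""
    query = embeddings[seed]
    scored = [(cosine(query, vec), node)
              for node, vec in embeddings.items() if node != seed]
    scored.sort(reverse=True)  # highest similarity first
    return [node for _, node in scored[:k]]

embeddings = {
    "Louis": [1.0, 0.0],
    "item_a": [0.9, 0.1],
    "item_b": [0.0, 1.0],
}

print(knn_search(embeddings, "Louis", k=2))  # → ['item_a', 'item_b']
```

An exact scan like this is O(n) per query; ANN libraries trade a little accuracy for much faster lookups on large graphs.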
## Examples

Open Colab to run the example with Facebook data.
## License

This project is under the MIT License; please see here for details.