BayesSets

Implementation of the Bayesian Sets algorithm and variants


Keywords
bayes, set, expansion, machine, learning, information, retrieval
License
MIT
Install
pip install BayesSets==0.2.1

Documentation

Bayesian Sets

Bayesian Sets is a set expansion algorithm for generic datasets: given a few seed items, it ranks every other item by how likely it is to belong to the same set. The original paper [1] proposes a general model for the exponential family of distributions and describes in greater detail a model based on independent Bernoulli distributions.
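To make the Bernoulli model more concrete, the scoring rule from [1] can be sketched in plain NumPy. This is an illustrative sketch, not the code used by this package: the function name bernoulli_log_scores, the kappa parameter, and the clipping of feature means are assumptions made for the example. It assumes a dense binary matrix with one row per instance and a Beta prior per feature scaled from the empirical feature means.

```python
import numpy as np


def bernoulli_log_scores(X, seed_idx, kappa=2.0):
    """Log Bayesian Sets score of every row of binary matrix X (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    n_seed = len(seed_idx)

    # Beta prior per feature, proportional to the empirical feature mean.
    # Clipping guards against all-zero or all-one columns (assumption of this sketch).
    mean = np.clip(X.mean(axis=0), 1e-6, 1.0 - 1e-6)
    alpha = kappa * mean
    beta = kappa * (1.0 - mean)

    # Posterior hyperparameters after observing the seed rows.
    seed_sum = X[seed_idx].sum(axis=0)
    alpha_post = alpha + seed_sum
    beta_post = beta + n_seed - seed_sum

    # log score(x) = c + x . q, i.e. log p(x | seeds) - log p(x).
    c = np.sum(
        np.log(alpha + beta) - np.log(alpha + beta + n_seed)
        + np.log(beta_post) - np.log(beta)
    )
    q = np.log(alpha_post) - np.log(alpha) - np.log(beta_post) + np.log(beta)
    return c + X @ q


if __name__ == "__main__":
    # Toy data: rows 0-2 share features 0 and 1, rows 3-5 mostly do not.
    toy = np.array([
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 1, 1, 0],
    ])
    log_scores = bernoulli_log_scores(toy, [0, 1])
    print(np.argsort(log_scores)[::-1])  # instance indices, best match first
```

Because the log score is linear in x, scoring every instance reduces to a single matrix-vector product, which is what makes the Bernoulli variant practical for large sparse datasets.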

The package is available on PyPI and can be installed with pip install bayessets. The complete documentation is available on ReadTheDocs.

Basic usage

After you create your model, call its query method with the indices of your seed set to get a score for every instance. To get the indices of the most likely expansions, sort the scores with numpy.argsort and reverse the result, as shown below.

import bayessets
import numpy as np

model = bayessets.BernoulliBayesianSet(mybinarymatrix)
myquery = [23, 50, 78]  # 0-based indices of the seed instances
scores = model.query(myquery)  # one score per instance
ranking = np.argsort(scores)[::-1]  # instance indices, highest score first
top20 = ranking[:20]

References

[1] Ghahramani, Zoubin, and Katherine A. Heller. "Bayesian Sets." Advances in Neural Information Processing Systems (2006): 435-442.