pydecode 0.1.14 on PyPI

PyDecode is a dynamic programming toolkit developed for research in natural language processing. It aims to be simple enough for fast prototyping, yet efficient enough for research use.

Features

Simple specifications. Dynamic programming algorithms are specified in code that reads like pseudo-code.

# Viterbi algorithm.
...
c.init(items[0, :])
for i in range(1, n):
    for t in range(len(tags)):
        c.set(items[i, t],
              items[i-1, :],
              labels=labels[i, t, :])
graph = c.finish()
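
For readers who want to see the recurrence that this specification encodes, here is a minimal library-independent sketch in plain Python. It does not use PyDecode. The function `viterbi` and the `emission` and `transition` score tables are illustrative assumptions, with scores treated as additive (for example, log-probabilities).

```python
def viterbi(emission, transition):
    """Return (best_score, best_tags) for a first-order tag lattice.

    emission[i][t]   : score of tag t at position i
    transition[s][t] : score of moving from tag s to tag t
    """
    n, num_tags = len(emission), len(emission[0])
    # chart[i][t] = best score of any tag sequence ending in tag t at position i
    chart = [list(emission[0])]
    back = []
    for i in range(1, n):
        row, ptr = [], []
        for t in range(num_tags):
            # Choose the best previous tag s for the current tag t.
            best_s = max(range(num_tags),
                         key=lambda s: chart[i - 1][s] + transition[s][t])
            row.append(chart[i - 1][best_s] + transition[best_s][t]
                       + emission[i][t])
            ptr.append(best_s)
        chart.append(row)
        back.append(ptr)
    # Follow back-pointers from the best final tag.
    last = max(range(num_tags), key=lambda t: chart[-1][t])
    tags = [last]
    for ptr in reversed(back):
        tags.append(ptr[tags[-1]])
    tags.reverse()
    return chart[-1][last], tags


emission = [[1.0, 0.0], [0.0, 2.0], [1.0, 0.0]]
transition = [[0.5, 0.0], [0.0, 0.5]]
score, tags = viterbi(emission, transition)
print(score, tags)
```

Each cell at position i depends on all cells at position i-1, which is the same dependency the `c.set(items[i, t], items[i-1, :], ...)` line expresses.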

Efficient implementation. The core is written in C++, with a Python interface built on NumPy.

import numpy
import pydecode

# Compute the best path under random label weights.
label_weights = numpy.random.random(graph.label_size)
weights = pydecode.transform_label_array(graph, label_weights)
path = pydecode.best_path(graph, weights)
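
The snippet above scores labels, maps those scores onto the graph, and searches for the best path. The following plain-Python sketch illustrates that idea without PyDecode, under two assumptions that are not taken from the library: each edge carries exactly one integer label, and the edge list is topologically sorted. The names `EDGES`, `edge_weights_from_labels`, and `best_path` are hypothetical.

```python
# Each edge is (tail_node, head_node, label). Edges are listed so that
# every edge into a node appears before any edge leaving it.
EDGES = [(0, 1, 0), (0, 2, 1), (1, 3, 2), (2, 3, 0)]


def edge_weights_from_labels(edges, label_weights):
    """Give each edge the weight of its label (assumed one label per edge)."""
    return [label_weights[label] for _, _, label in edges]


def best_path(edges, weights, source, sink):
    """Return (score, edge indices) of the highest-scoring source-to-sink path."""
    best = {source: 0.0}
    back = {}
    for idx, (tail, head, _) in enumerate(edges):
        if tail not in best:
            continue  # tail is unreachable from the source
        cand = best[tail] + weights[idx]
        if head not in best or cand > best[head]:
            best[head] = cand
            back[head] = idx
    # Walk the back-pointers from the sink to the source.
    path, node = [], sink
    while node != source:
        idx = back[node]
        path.append(idx)
        node = edges[idx][0]
    path.reverse()
    return best[sink], path


label_weights = [0.5, 2.0, 1.0]
weights = edge_weights_from_labels(EDGES, label_weights)
score, path = best_path(EDGES, weights, source=0, sink=3)
print(score, path)
```

Sharing one weight per label lets several edges reuse the same parameter, which is why the label array can be much smaller than the edge array.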

High-level algorithms. Includes widely used algorithms such as inside scores, (max-)marginals, and pruning.

# Inside probabilities.
inside = pydecode.inside(graph, weights, kind=pydecode.LogProb)

# (Max)-marginals.
marginals = pydecode.marginals(graph, weights)

# Pruning. `threshold` is a user-chosen cutoff on the marginals.
mask = marginals > threshold
pruned_graph = pydecode.filter(graph, mask)
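
To make the three operations above concrete, here is a small plain-Python sketch on a four-node graph. It does not call PyDecode and makes no claim about how the library implements these functions. The helper names (`log_add`, `inside`, `outside`, `edge_marginals`) and the `plus` argument are assumptions for illustration: passing `max` yields max-marginals, and passing `log_add` yields log-space sums.

```python
import math

# Edges are (tail, head), topologically sorted; node 0 is the source, 3 the sink.
EDGES = [(0, 1), (0, 2), (1, 3), (2, 3)]
NEG_INF = float("-inf")


def log_add(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def inside(edges, weights, num_nodes, source, plus):
    """chart[v] combines, with `plus`, the scores of all source-to-v paths."""
    chart = [NEG_INF] * num_nodes
    chart[source] = 0.0
    for (tail, head), w in zip(edges, weights):
        chart[head] = plus(chart[head], chart[tail] + w)
    return chart


def outside(edges, weights, num_nodes, sink, plus):
    """chart[v] combines, with `plus`, the scores of all v-to-sink paths."""
    chart = [NEG_INF] * num_nodes
    chart[sink] = 0.0
    for (tail, head), w in reversed(list(zip(edges, weights))):
        chart[tail] = plus(chart[tail], chart[head] + w)
    return chart


def edge_marginals(edges, weights, num_nodes, source, sink, plus):
    """Combined score of the complete paths that pass through each edge."""
    ins = inside(edges, weights, num_nodes, source, plus)
    out = outside(edges, weights, num_nodes, sink, plus)
    return [ins[tail] + w + out[head] for (tail, head), w in zip(edges, weights)]


weights = [0.5, 2.0, 1.0, 0.5]
log_partition = inside(EDGES, weights, 4, 0, log_add)[3]
max_marginals = edge_marginals(EDGES, weights, 4, 0, 3, max)

# Pruning: keep only edges whose best complete path scores above the threshold.
threshold = 2.0
mask = [m > threshold for m in max_marginals]
pruned_edges = [e for e, keep in zip(EDGES, mask) if keep]
print(log_partition, max_marginals, pruned_edges)
```

Swapping the `plus` operator while keeping the same traversal is the usual semiring view of dynamic programming, which is one plausible reading of the `kind=` argument in the snippet above.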

Integration with machine learning toolkits. Train structured models.

# Train a discriminative tagger.
perceptron_tagger = StructuredPerceptron(tagger)
perceptron_tagger.fit(X, Y)
Y_test = perceptron_tagger.predict(X_test)
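
As a rough picture of what training a tagger this way involves, here is a self-contained structured perceptron in plain Python. It is a sketch under stated assumptions and not the implementation behind `StructuredPerceptron`: the feature templates, the `"<s>"` start symbol, and the functions `features`, `decode`, and `train` are all hypothetical, and the toy data is made up for illustration.

```python
from collections import defaultdict


def features(words, tags):
    """Count (word, tag) and (previous tag, tag) features of a tagged sentence."""
    feats = defaultdict(float)
    prev = "<s>"
    for word, tag in zip(words, tags):
        feats["emit", word, tag] += 1.0
        feats["trans", prev, tag] += 1.0
        prev = tag
    return feats


def decode(words, tagset, w):
    """Viterbi search for the highest-scoring tag sequence under weights w."""
    # chart maps each tag to (best score, best tag sequence ending in that tag).
    chart = {"<s>": (0.0, [])}
    for word in words:
        new_chart = {}
        for tag in tagset:
            new_chart[tag] = max(
                (score + w["emit", word, tag] + w["trans", prev, tag],
                 seq + [tag])
                for prev, (score, seq) in chart.items())
        chart = new_chart
    return max(chart.values())[1]


def train(X, Y, tagset, epochs=5):
    """Perceptron updates: reward gold features, penalize predicted ones."""
    w = defaultdict(float)
    for _ in range(epochs):
        for words, gold in zip(X, Y):
            guess = decode(words, tagset, w)
            if guess != gold:
                for f, v in features(words, gold).items():
                    w[f] += v
                for f, v in features(words, guess).items():
                    w[f] -= v
    return w


# Toy data, invented for this example.
X = [["the", "dog", "barks"], ["the", "cat", "sleeps"]]
Y = [["D", "N", "V"], ["D", "N", "V"]]
tagset = ["D", "N", "V"]
w = train(X, Y, tagset)
prediction = decode(["the", "dog", "sleeps"], tagset, w)
print(prediction)
```

The only model-specific part is `decode`. Any decoder that returns the best structure under the current weights could be substituted, which is the role the dynamic programming toolkit plays in the snippet above.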