Deep Learning Toy

A lightweight deep learning library implemented in Python, designed for studying how contemporary deep learning libraries work internally.

Architecture

The framework is built on several core ideas: the computational graph, forward propagation, the loss (cost) function, gradient descent, and backward propagation. A computational graph represents an ordered set of primitive algebraic operations. Forward propagation feeds an input into a computational graph and produces the output. A loss function measures how well a model estimates a class or a value from the input; usually it produces a scalar value. Gradient descent is a calculus-based method for minimizing a loss function: to decrease a function, move its variables a small step in the direction opposite to the gradient. Backward propagation takes the graph in the state left by forward propagation and calculates gradients starting from the output and moving towards the input; this direction, from the head of the computational graph towards the tail, follows from the chain rule of calculus.
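The interplay of these ideas can be sketched in plain Python without the library. The model below (a single weight `w`, squared-error loss) and the names `forward`, `backward`, and `gradient_descent` are illustrative assumptions, not part of pydeeptoy:

```python
# Hypothetical one-parameter model: prediction = w * x, loss = (prediction - y) ** 2.
def forward(w, x, y):
    prediction = w * x              # forward propagation through the tiny "graph"
    loss = (prediction - y) ** 2    # scalar loss
    return prediction, loss


def backward(x, y, prediction):
    # Chain rule, walking from the output (loss) back towards the input (w):
    # dloss/dprediction = 2 * (prediction - y) and dprediction/dw = x.
    dloss_dprediction = 2.0 * (prediction - y)
    dprediction_dw = x
    return dloss_dprediction * dprediction_dw


def gradient_descent(w, x, y, learning_rate=0.1, steps=50):
    for _ in range(steps):
        prediction, _loss = forward(w, x, y)
        grad_w = backward(x, y, prediction)
        w -= learning_rate * grad_w   # step against the gradient
    return w


w_fitted = gradient_descent(w=0.0, x=2.0, y=6.0)
print(w_fitted)  # approaches 3.0, because 3.0 * 2.0 == 6.0
```

Each iteration runs a forward pass, derives the gradient by the chain rule, and moves `w` against that gradient, which is the loop a framework automates for arbitrary graphs.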

Computational Graph

The ComputationalGraph class provides methods representing primitive algebraic operations. Each method takes inputs and produces an output. Inputs and outputs are represented by the Connection class, and operations by the Node class. There are two types of connections: constants and variables. Constants do not change during model optimization, while variables can be updated by the optimization process. Here is an example of a primitive computational graph that adds two numbers:

from pydeeptoy.computational_graph import *

cg = ComputationalGraph()
sum_result = cg.sum(cg.constant(1), cg.constant(2))

The code above builds the computational graph but doesn't execute it. To execute the graph, use the SimulationContext class. The simulation context contains the logic for forward and backward propagation. In addition, it stores the results produced by every operation, including the gradients obtained during the backward phase. The following code executes the computational graph described above:

from pydeeptoy.computational_graph import *
from pydeeptoy.simulation import *

cg = ComputationalGraph()
sum_result = cg.sum(cg.constant(1), cg.constant(2))

ctx = SimulationContext()
ctx.forward(cg)

print("1+2={}".format(ctx[sum_result].value))
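To illustrate why building a graph and executing it can be separate steps, here is a minimal standalone sketch. `ToyGraph` and `ToyContext` are hypothetical names invented for this example; they mimic the shape of the calls above but are not pydeeptoy's implementation:

```python
class ToyGraph:
    """Records operations; computes nothing itself."""

    def __init__(self):
        self.nodes = []       # (op_name, input_ids, output_id), in creation order
        self.constants = {}   # connection id -> constant value
        self._next_id = 0

    def _new_connection(self):
        self._next_id += 1
        return self._next_id

    def constant(self, value):
        conn = self._new_connection()
        self.constants[conn] = value
        return conn

    def sum(self, a, b):
        out = self._new_connection()
        self.nodes.append(("sum", (a, b), out))
        return out


class ToyContext:
    """Executes a ToyGraph and stores every intermediate value."""

    def __init__(self):
        self.values = {}

    def forward(self, graph):
        self.values.update(graph.constants)
        # Nodes were appended in creation order, so inputs are always ready.
        for op_name, inputs, out in graph.nodes:
            if op_name == "sum":
                self.values[out] = self.values[inputs[0]] + self.values[inputs[1]]

    def __getitem__(self, conn):
        return self.values[conn]


graph = ToyGraph()
total = graph.sum(graph.constant(1), graph.constant(2))
ctx = ToyContext()
ctx.forward(graph)
print(ctx[total])  # 3
```

Keeping the values in a separate context object means the same graph description can be executed repeatedly with different inputs, and the stored intermediate results are available for a later backward pass.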

Atomic Operations

A computational graph is composed of operations. An operation is the smallest building block of a computational graph. In the framework an operation is represented by the abstract Node class. All operations take input in the form of a numpy array or a scalar value and produce either a scalar value or a numpy array. In other words, a computational graph passes a tensor through itself, which is why one of the most popular deep learning frameworks is called TensorFlow. The following operations are implemented in the computational_graph module:

Operation	Description
sum	Computes the sum of two tensors.
multiply	Computes the product of two tensors.
matrix_multiply	Computes the product of two matrices (i.e. 2-dimensional tensors).
div	Divides one tensor by another.
exp	Calculates the exponential of all elements in the input tensor.
log	Natural logarithm, element-wise.
reduce_sum	Computes the sum of elements across dimensions of a tensor.
max	Element-wise maximum of tensor elements.
broadcast
transpose	Permute the dimensions of a tensor.
reshape	Gives a new shape to an array without changing its data.
conv2d	Computes a 2-D convolution given 4-D input and filter tensors.
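Since operations consume and produce numpy arrays, the table can be read alongside plain NumPy calls. The mapping below is an assumption based on the descriptions above; the source does not state how each operation is implemented, and broadcast and conv2d are left out:

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[0.5, 1.0], [2.0, 4.0]])

elementwise_sum = np.add(a, b)           # analogue of sum
elementwise_product = np.multiply(a, b)  # analogue of multiply
matrix_product = np.matmul(a, b)         # analogue of matrix_multiply
quotient = np.divide(a, b)               # analogue of div
exponential = np.exp(a)                  # analogue of exp
natural_log = np.log(a)                  # analogue of log
total = np.sum(a)                        # analogue of reduce_sum over all dimensions
column_sums = np.sum(a, axis=0)          # analogue of reduce_sum along one dimension
elementwise_max = np.maximum(a, b)       # analogue of max
transposed = np.transpose(a)             # analogue of transpose
reshaped = np.reshape(a, (4,))           # analogue of reshape

print(matrix_product)
print(column_sums)
```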

Activation Functions

Activation functions are used for thresholding a single neuron's output. First, a neuron calculates the weighted sum of its inputs. Second, that weighted sum is fed into the activation function. Finally, the activation function produces the final neuron output. Usually, an activation function's output is normalized to lie between 0 and 1, or between -1 and 1. The list of implemented activation functions:

Relu
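For reference, the rectified linear unit computes max(x, 0) element-wise, so unlike sigmoid-style activations its output is not bounded above. The NumPy sketch below shows the forward computation and the gradient used in the backward phase; the function names are illustrative and not taken from pydeeptoy:

```python
import numpy as np


def relu(x):
    # Forward pass: negative inputs are clamped to zero.
    return np.maximum(x, 0.0)


def relu_gradient(x, upstream_gradient):
    # Backward pass: the gradient flows through only where the input was positive.
    return upstream_gradient * (x > 0)


x = np.array([-2.0, 0.0, 3.0])
print(relu(x))                            # [0. 0. 3.]
print(relu_gradient(x, np.ones_like(x)))  # [0. 0. 1.]
```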

Loss Functions

Loss functions are used as a measure of model performance. Usually, a loss is just a scalar value telling how well a model estimates the output from the input. Needless to say, a universal loss function that fits all model flavours doesn't exist. The following loss functions are implemented in the losses module:
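As a generic illustration of a scalar loss (a NumPy sketch, not code from the losses module), mean squared error reduces a vector of prediction errors to one number and has a simple gradient with respect to the predictions:

```python
import numpy as np


def mean_squared_error(predictions, targets):
    # Average of squared differences: one scalar for the whole batch.
    return np.mean((predictions - targets) ** 2)


def mean_squared_error_gradient(predictions, targets):
    # Derivative of the mean of squares with respect to each prediction.
    return 2.0 * (predictions - targets) / predictions.size


predictions = np.array([1.0, 2.0, 4.0])
targets = np.array([1.0, 3.0, 2.0])
loss = mean_squared_error(predictions, targets)
gradient = mean_squared_error_gradient(predictions, targets)
print(loss)
print(gradient)
```

The gradient is the starting point of the backward phase: it is what gets propagated from the head of the graph towards its inputs.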

Usage Examples

The primitive building blocks provided by the framework can be used to build robust estimators. The benefit of using the framework is that you do not have to implement forward/backward propagation from scratch for every kind of estimator.

	Iris	MNIST	CIFAR-10
Support Vector Machine (SVM)	Example
Multilayer Perceptron	Example	Example

License

MIT license

pydeeptoy
Release 1.0.0.7
