Introduction to the Chainer Deep Learning Framework

Chainer is a deep learning framework that bridges the gap between deep learning algorithms and their practical applications. It is powerful, flexible, and intuitive, and is regarded as a particularly flexible framework for deep learning.
Website: http://chainer.org/

A look at what the documentation covers:
Introduction to Chainer
Existing frameworks and why Chainer was developed

A simple example of forward and backward computation
Gradient computation
Composing chains
Parameter optimization
Serialization of links and optimizers

After reading this section, you will be able to compute gradients of some arithmetic expressions and write a multi-layer perceptron.

Core Concept
Chainer is a flexible framework for neural networks that makes writing complex architectures simple and intuitive.
Most existing deep learning frameworks are based on the “Define-and-Run” scheme: a network is first defined and fixed, and the user then periodically feeds it mini-batches for training.
Since the network is statically defined before any forward/backward computation runs, all of the logic must be embedded into the network architecture as data. Consequently, defining a network architecture in such systems (e.g. Caffe) follows a declarative approach.
Note that one can still produce such a static network definition using imperative languages (e.g. torch.nn, Theano-based frameworks, and TensorFlow).

In contrast, Chainer adopts a “Define-by-Run” scheme: the network is defined on the fly through the forward computation. More precisely, Chainer stores the history of computation instead of programming logic. This strategy lets us make full use of the power of Python's programming logic; for example, Chainer does not need any magic to introduce conditionals and loops into network definitions. The Define-by-Run scheme is the core concept of Chainer. We will show in this tutorial how to define networks dynamically.
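
For instance, a forward pass may use ordinary Python control flow, and the computational graph is simply recorded as that code runs. The following is a minimal sketch of this idea (the function forward, its arguments, and the particular loop and branch are illustrative assumptions, not part of Chainer's API):

import numpy as np
import chainer.functions as F
from chainer import Variable

def forward(x, n_steps):
    # The graph is built while this Python code executes, so the number of
    # applied functions and the branch taken can differ from call to call.
    h = x
    for _ in range(n_steps):
        if float(h.data.sum()) > 0:
            h = F.relu(h)
        else:
            h = F.tanh(h)
    return h

x = Variable(np.array([[1.0, -2.0]], dtype=np.float32))
y = forward(x, n_steps=3)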

This strategy also makes it easy to write multi-GPU parallelization, since logic comes closer to network manipulation. We will review such amenities in later sections of this tutorial.

Forward/Backward Computation

As described above, Chainer uses the “Define-by-Run” scheme, so the forward computation itself defines the network. In order to start a forward computation, we have to set the input array to a Variable object. Here we start with a simple ndarray with only one element:

>>> import numpy as np
>>> from chainer import Variable
>>> x_data = np.array([5], dtype=np.float32)
>>> x = Variable(x_data)

A Variable object has basic arithmetic operators. In order to compute y = x² − 2x + 1, just write:

>>> y = x**2 - 2 * x + 1

The resulting y is also a Variable object, whose value can be extracted by accessing the data attribute:

>>> y.data
array([ 16.], dtype=float32)

What y holds is not only the result value. It also holds the history of computation (or computational graph), which enables us to compute its differentiation. This is done by calling its backward() method:

>>> y.backward()

This runs error backpropagation (a.k.a. backprop or reverse-mode automatic differentiation). The gradient is then computed and stored in the grad attribute of the input variable x; since dy/dx = 2x − 2, we expect the value 8 at x = 5:

>>> x.grad
array([ 8.], dtype=float32)

Also we can compute gradients of intermediate variables. Note that Chainer, by default, releases the gradient arrays of intermediate variables for memory efficiency. In order to preserve gradient information, pass the retain_grad argument to the backward method:

>>> z = 2*x
>>> y = x**2 - z + 1
>>> y.backward(retain_grad=True)
>>> z.grad
array([-1.], dtype=float32)
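
For comparison (an illustrative check of the default behaviour just described), calling backward() without retain_grad releases the gradient of the intermediate variable, so its grad attribute is None:

>>> z = 2*x
>>> y = x**2 - z + 1
>>> y.backward()
>>> z.grad is None
True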

All these computations are easily generalized to multi-element array input. Note that if we want to start backward computation from a variable holding a multi-element array, we must set the initial error manually. This is simply done by setting the grad attribute of the output variable:

>>> x = Variable(np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32))
>>> y = x**2 - 2*x + 1
>>> y.grad = np.ones((2, 3), dtype=np.float32)
>>> y.backward()
>>> x.grad
array([[  0.,   2.,   4.],
       [  6.,   8.,  10.]], dtype=float32)
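
As a quick sanity check (an illustrative addition), the analytic gradient here is dy/dx = 2x − 2 element-wise, and the computed gradient matches it:

>>> (x.grad == 2 * x.data - 2).all()
True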

Note

Many functions taking Variable object(s) are defined in the functions module. You can combine them to realize complicated functions with automatic backward computation.
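
For example (a minimal sketch; F is assumed here to be an alias for chainer.functions, and the particular functions chosen are only illustrative), these functions compose with the arithmetic operators above and are differentiated automatically:

>>> import chainer.functions as F
>>> x = Variable(np.array([0.5], dtype=np.float32))
>>> y = F.exp(F.sin(x))
>>> y.backward()
>>> x.grad  # equals cos(x) * exp(sin(x)) evaluated at x = 0.5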
