Neural Networks with TensorFlow 1.x, Part 3 - 怠惰人間の情報系ブログ

This post is a supplement to the previous one. It covers mini-batch processing, a technique the previous post did not discuss.

Mini-batch processing

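One problem with Neural Networks is that training can settle at a certain accuracy (a little short of the accuracy you are aiming for). Mini-batch processing was devised to address this problem. Instead of learning a single input/output pair per training step, it learns several input/output pairs within one step, which helps keep training from stalling partway. I won't go into the details, but the stalling arises because Neural Network training relies on gradient descent. Mini-batch processing also has a drawback: it can take longer for the results to become good. Now let's look at what has to change in the formulas to learn many input/output pairs at once.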

Until now, the Neural Network formula looked like this:

$$
y=x\cdot\omega+b
$$

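To learn many input/output pairs at once, all you have to do is pass matrices as $x$ and $y$. Writing the formula out with matrices and vectors should make this easier to picture.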

$$
y
=
( x_1,x_2 )
\cdot
\left(
\begin{array}{c}
\omega_1 \\
\omega_2 \\
\end{array}
\right)
+
b
$$

becomes

$$
\left(
\begin{array}{c}
y_1 \\
\vdots \\
y_n
\end{array}
\right)
=
\left(
\begin{array}{cc}
x_{1,1} & x_{2,1} \\
\vdots & \vdots \\
x_{1,n} & x_{2,n}
\end{array}
\right)
\cdot
\left(
\begin{array}{c}
\omega_1 \\
\omega_2 \\
\end{array}
\right)
+
\left(
\begin{array}{c}
b \\
\vdots \\
b
\end{array}
\right)
$$

Very simple, isn't it?
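The matrix form above can be checked by hand with a small sketch. This is plain Python rather than TensorFlow, and the function name `predict_batch` and the numeric values are illustrative assumptions, not part of the original program.

```python
def predict_batch(xs, weights, bias):
    """Apply y = x . w + b to every row of xs (one row per sample)."""
    return [sum(x_j * w_j for x_j, w_j in zip(x, weights)) + bias for x in xs]

# Three samples with two features each; weights and bias are illustrative values.
xs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weights = [5.0, 2.0]
bias = 10.0

ys = predict_batch(xs, weights, bias)
print(ys)  # [19.0, 33.0, 47.0]
```

The same weights and the same bias are applied to every row. Only the number of rows in the input and output changes with the number of samples.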

Next, let's change the squared error from the single-sample formula to one for multiple samples. This only requires taking a sum, so it is very easy.

$$
l=(y-t)^2
$$

becomes

$$
l=\frac{ 1 }{ 2 }\sum_{ i = 1 }^{ n } (y_i-t_i)^2
$$

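The factor of 0.5 is a convention: it cancels the 2 that appears when the square is differentiated, so it does not change which weights minimize the loss. Dropping it doubles the loss and its gradient, which at the same learning rate makes the values more likely to blow up and overflow. The actual predictions and teacher data come out as vectors, and to sum the elements of such a vector you use the following TensorFlow function:

Sum of vector elements: tensorflow.reduce_sum(vector)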
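As a rough check of the summed squared error, here is a plain Python sketch of the same quantity that `0.5*tf.reduce_sum(tf.square(y-t))` computes in the program. The function name `squared_error_batch` and the sample values are assumptions made for illustration.

```python
def squared_error_batch(ys, ts):
    """Return 0.5 * sum_i (y_i - t_i)^2 for predictions ys and targets ts."""
    return 0.5 * sum((y - t) ** 2 for y, t in zip(ys, ts))

# Illustrative predictions and teacher values for three samples.
ys = [19.0, 33.0, 47.0]
ts = [20.0, 31.0, 47.0]

loss = squared_error_batch(ys, ts)
print(loss)  # 2.5
```

Each sample contributes its own squared difference (1, 4 and 0 here), and the batch loss is half of their sum.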

Now let's implement a program that learns three samples at a time and check the result.

# Imports
import tensorflow as tf
import random

# Start of the model definition

# Weights
#  tf.zeros(shape, dtype): creates a zero matrix of the given shape
weight=tf.Variable(tf.zeros([1,2],tf.float32))

# Bias
b=tf.Variable(tf.zeros([1,1],tf.float32))

# Input: one column per sample (2 features x batch size)
x=tf.placeholder(tf.float32,(2,None))

# Teacher data: one column per sample
t=tf.placeholder(tf.float32,(1,None))

# Model: y = weight . x + b (b is broadcast across the batch)
y=tf.add(tf.matmul(weight,x),b)

# Loss function (squared error)
#  tf.square(n): squares each element of n
#  tf.reduce_sum(n): sums the elements of n
loss=0.5*tf.reduce_sum(tf.square(y-t))

# Training method (SGD)
train=tf.train.GradientDescentOptimizer(0.01).minimize(loss)

# End of the model definition

# Start the session
with tf.Session() as s:
    s.run(tf.global_variables_initializer())    # Initialize the variables (important!)

    for i in range(1000): # Train for 1000 steps

        # Generate the training data: 3 samples per step
        x_1=[]
        x_2=[]
        for k in range(3):
            x_1+=[random.uniform(0,10)]
            x_2+=[random.uniform(0,10)]

        train_x=[x_1,x_2]
        train_y=[[]]
        for k in range(3):
            noise=random.uniform(-10,10)
            train_y[0]+=[5.0*x_1[k]+2.0*x_2[k]+10.0+noise]

        # Print the weights and bias every 100 steps to watch them change
        if i%100==0:
            print("Step:",i)
            print("\tweight=",s.run(weight))
            print("\tbias=",s.run(b))

        # Run one training step
        s.run(train,feed_dict={x:train_x,t:train_y})

    # Print the final result
    print("Step:",i+1)
    print("\tweight=",s.run(weight))
    print("\tbias=",s.run(b))

<Execution result>
 Step: 0
         weight= [[0. 0.]]
         bias= [[0.]]
 Step: 100
         weight= [[4.372695 2.154573]]
         bias= [[3.9028687]]
 Step: 200
         weight= [[5.680234  2.6559684]]
         bias= [[6.2194257]]
 Step: 300
         weight= [[5.169346  2.2691214]]
         bias= [[6.910868]]
 Step: 400
         weight= [[5.9430413 2.2022545]]
         bias= [[6.5457535]]
 Step: 500
         weight= [[3.9281194 3.6536412]]
         bias= [[9.528699]]
 Step: 600
         weight= [[5.1522303 3.0777059]]
         bias= [[9.2687235]]
 Step: 700
         weight= [[6.752254  3.3597639]]
         bias= [[11.156225]]
 Step: 800
         weight= [[5.7685375 3.1831393]]
         bias= [[10.863632]]
 Step: 900
         weight= [[4.744218  0.9056735]]
         bias= [[10.15814]]
 Step: 1000
         weight= [[4.489449  0.2827294]]
         bias= [[10.346549]]

How did it go? You can see that the result is not much different from when we trained on one sample at a time.

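Also, if you train with four or more samples per step in the program above, lower the learning rate to avoid overflow: the loss is a sum over the batch, so the gradient grows with the batch size. When you lower the learning rate, you also need to increase the number of training steps.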
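To see why a larger batch calls for a smaller learning rate, the following sketch compares the gradient of the summed loss for 3 and 6 samples. It is plain Python with a single input feature for brevity, and the function name `grad_w` and the data are illustrative assumptions.

```python
def grad_w(xs, ts, w, b):
    """Gradient of 0.5 * sum_i (w*x_i + b - t_i)^2 with respect to w."""
    return sum((w * x + b - t) * x for x, t in zip(xs, ts))

# Illustrative data: one feature, targets t = 2 * x.
xs = [1.0, 2.0, 3.0]
ts = [2.0, 4.0, 6.0]

g3 = grad_w(xs, ts, 0.0, 0.0)          # batch of 3 samples
g6 = grad_w(xs * 2, ts * 2, 0.0, 0.0)  # the same samples repeated: batch of 6
print(g3, g6)  # -28.0 -56.0

# Dividing by the batch size gives the same per-sample gradient in both cases.
print(g3 / 3, g6 / 6)
```

Doubling the batch doubles the summed gradient, so a step at the same learning rate is twice as large. Averaging the loss over the batch instead of summing it is a common alternative to lowering the learning rate, though the program in this post uses the sum.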

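Another important point: when you run mini-batch training on existing data, you draw n samples at random from the training data and perform one training step. After that step, you draw samples at random again and train on them. The number of samples drawn each time is called the batch size.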
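The random draw described above could look like the following sketch. The helper `sample_minibatch` and the 10-sample dataset are hypothetical. The targets reuse the formula from the program above, without noise.

```python
import random

def sample_minibatch(inputs, targets, batch_size, rng=random):
    """Pick batch_size distinct samples at random, keeping inputs and targets paired."""
    indices = rng.sample(range(len(inputs)), batch_size)
    return [inputs[i] for i in indices], [targets[i] for i in indices]

# Hypothetical dataset: 10 samples with two features each.
inputs = [[float(i), float(2 * i)] for i in range(10)]
targets = [5.0 * x1 + 2.0 * x2 + 10.0 for x1, x2 in inputs]

for step in range(3):
    # A fresh random batch of 3 samples is drawn at every training step.
    batch_x, batch_t = sample_minibatch(inputs, targets, batch_size=3)
    print(step, batch_x, batch_t)
```

Here each batch holds one row per sample. The program above expects the input with shape (2, n), so a batch drawn this way would have to be transposed before being fed to it.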

Epoch

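Finally, here is a term specific to mini-batch processing. When using mini-batches, a quantity called the Epoch count is introduced in addition to the usual number of training steps (also called iterations or steps). The Epoch count is used when mini-batch training is run on data prepared in advance. It is defined as the number of samples used for training so far divided by the total number of samples, that is, how many times training has gone through the whole dataset. As a formula:

$$
\displaystyle \frac{ \text{number of samples used for training so far} }{ \text{total number of samples} }
$$

Some software specifies the amount of training in Epochs. Note that Epoch counts are normally whole numbers, not fractions.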
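Reading the Epoch count as the number of samples used so far divided by the dataset size, the arithmetic can be sketched as follows. The function name `epochs_completed` and the dataset size of 600 are assumptions chosen for illustration.

```python
def epochs_completed(steps, batch_size, dataset_size):
    """Number of samples used so far divided by the dataset size."""
    return steps * batch_size / dataset_size

# 1000 steps with a batch size of 3 on a hypothetical 600-sample dataset.
print(epochs_completed(1000, 3, 600))  # 5.0

# Steps needed to go through the dataset once at this batch size.
print(600 // 3)  # 200
```

With these numbers, 200 steps make up one Epoch, so 1000 steps correspond to 5 Epochs.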

That's all for today!

Summary

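Mini-batch processing: a method that uses several input/output pairs in one training step
How to implement mini-batch processing: just turn the inputs and outputs into matrices
Benefit of mini-batch processing: it helps keep training from stalling at a fixed result
Sum of vector elements: tensorflow.reduce_sum(vector)
When running mini-batch training on data prepared in advance, n samples are drawn at random at every training step
When running mini-batch training on data prepared in advance, the number n of samples drawn at random is called the batch size
When running mini-batch training on data prepared in advance, a quantity called the Epoch count is defined: the number of samples used for training so far divided by the total number of samples
Epoch counts are normally whole numbers, not fractions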
