Handling the batch size in a custom layer in TensorFlow 2.x

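(Added 2021/08/23): I found that the programs originally posted on this page did not obtain the batch size correctly, so I have substantially revised the content.

Hello, 怠惰人間 here.

The other day I was writing a custom layer in TensorFlow and had trouble handling the batch size, so here is a summary of how to deal with it.

At the end of this page there is a link to a Google Colab notebook that collects the programs shown here, so please tinker with it and adapt it to your own programs.

Here is what this post covers.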

1. About custom layers
2. How the batch size is treated in a custom layer
3. Using it in a loop
4. tf.scan()
5. Google Colab

1. About custom layers

On this page, a custom layer means a class that inherits from

tensorflow.keras.layers.Layer

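Inheriting from this class lets you create a layer that can be used with the Functional and Sequential APIs of tf.keras.

The official TensorFlow tutorial uses such a custom layer to build the ResBlock of ResNet and then makes ResNet available through Sequential, which shows how convenient custom layers are.

Put simply, a custom layer can be divided into three parts:

__init__(): configuration such as options
build(): declaration of trainable variables and the like
call(): where the actual processing is written

This post only uses call(), so you can forget about the other two.

Before the actual computation, call() is first run once using only tensor shapes, to check that no operation has impossible shapes (e.g. the product of a (2,3) matrix and a (10,23) matrix).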

For that reason, almost everything on this page is about shapes. In TensorFlow 2.x a shape is handled by the tf.TensorShape class, but it is basically fine to think of it as a list.
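As a rough illustration of both points, the sketch below first treats a tf.TensorShape like a list and then traces a function from shapes alone. It uses tf.function and tf.TensorSpec rather than a Keras layer, on the assumption that the shape-only pass of call() behaves the same way; the name bad_product is made up for this example.

```python
import tensorflow as tf

# A TensorShape can be indexed and measured much like a list.
shape = tf.TensorShape((100, 100, 3))
dims = shape.as_list()
print(dims, shape[0], len(shape))

@tf.function
def bad_product(a, b):
    # (2,3) x (10,23) is not a valid matrix product.
    return tf.matmul(a, b)

# Tracing with TensorSpec objects supplies shapes and dtypes only, no data.
try:
    bad_product.get_concrete_function(
        tf.TensorSpec([2, 3], tf.float32),
        tf.TensorSpec([10, 23], tf.float32))
    shape_error = None
except (ValueError, tf.errors.InvalidArgumentError) as err:
    shape_error = err

print("rejected while tracing:", shape_error is not None)
```

The mismatch is reported while the function is being traced, before any real tensor is fed in.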

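2. How the batch size is treated in a custom layer

In a custom layer, the batch size obtained in the usual way is treated as None.

In other words, when images of shape tf.TensorShape((100,100,3)) are processed in batches, the variable holding the images is treated as tf.TensorShape((None,100,100,3)).

This can be confirmed with the following program.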

import tensorflow as tf

class My_Layer(tf.keras.layers.Layer):
  def call(self,input):
    # Raise on purpose so that the shape seen inside call() is displayed
    raise ValueError(input.get_shape())
    #-> ValueError: (None, 1)

m=tf.keras.Sequential(My_Layer(input_shape=(1,)))

Because it is treated as None, it cannot be passed to many functions.

For example, trying to create a layer that returns batch size + 1 like this fails.

class My_Layer(tf.keras.layers.Layer):
  def call(self,input):
    batch_size=input.get_shape()[0]
    tf.print("In My_Layer.BatchSize:",batch_size)
    return tf.add(batch_size,1)

m=tf.keras.Sequential(My_Layer(input_shape=(1,)))

print("Output:",m.predict([0,0,0],batch_size=3))

Running this program gives

ValueError: Tried to convert 'x' to a tensor and failed. 
Error: None values not supported.

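In short, it is complaining that the left-hand argument (x) is None, so it cannot compute!

In this way, the value cannot be used directly with many functions.

However, the tf.shape function makes it usable.

Rewriting the addition program above with tf.shape gives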

class My_Layer(tf.keras.layers.Layer):
  def call(self,input):
    batch_size=tf.shape(input)[0]
    tf.print("In My_Layer.BatchSize:",batch_size)
    return tf.add(batch_size,1)

m=tf.keras.Sequential(My_Layer(input_shape=(1,)))

print("Output:",m.predict([0,0,0],batch_size=3))

and the output is

In My_Layer.BatchSize: 3
Output: 4

which shows that it works.
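The difference between the two ways of reading the batch size can also be seen outside Keras. The following is a minimal sketch, assuming a tf.function with an input_signature whose first dimension is None stands in for the layer; batch_size_plus_one is a hypothetical name.

```python
import tensorflow as tf

@tf.function(input_signature=[tf.TensorSpec(shape=[None, 1], dtype=tf.float32)])
def batch_size_plus_one(x):
    # Static shape: a Python value fixed at trace time, None for the batch axis.
    static_batch = x.shape[0]
    assert static_batch is None
    # Dynamic shape: a scalar int32 tensor that is evaluated at run time.
    dynamic_batch = tf.shape(x)[0]
    return dynamic_batch + 1

result = batch_size_plus_one(tf.zeros([3, 1]))
print(int(result))  # 4
```

The same traced function can then be called with other batch sizes, because nothing in it depends on the static value.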

3. Using it in a loop

There are three ways to involve the batch size in a loop.

They are

for
while
tf.while_loop

Each is explained below.

for

To use a for statement, you need the tf.range() function.

This function is essentially the TensorFlow version of Python's built-in range().

tf.range() can be used almost as if it were range(), but note that when the shape of a variable defined outside the loop changes inside the loop, you must call tf.autograph.experimental.set_loop_options().

An example program looks like this.

class My_Layer(tf.keras.layers.Layer):
  def call(self,input):
    batch_size=tf.shape(input)[0]

    result1=tf.constant([0])
    result2=tf.constant([0])

    # when the shape does not change
    for i in tf.range(batch_size):
      result1+=1

    # when the shape changes
    for i in tf.range(1,batch_size):
      tf.autograph.experimental.set_loop_options(
          shape_invariants=[(result2,tf.TensorShape([None]))]
      )
      result2=tf.concat([result2,tf.expand_dims(i,axis=0)],axis=0)

    return tf.concat([result1,result2],axis=0)


m=tf.keras.Sequential(My_Layer(input_shape=(1,)))


data=[0,1,2,3]
m.predict(data,batch_size=4)

The output is as follows.

array([4, 0, 1, 2, 3], dtype=int32)

Incidentally, if you remove tf.autograph.experimental.set_loop_options and run it, you get

 ValueError: 'result2' has shape (1,) before the loop, but shape (2,) after one iteration. Use tf.autograph.experimental.set_loop_options to set shape invariants.

The error message itself tells you to use tf.autograph.experimental.set_loop_options.
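As an aside that is not part of the original programs: tf.TensorArray is another way to collect one value per iteration, and it avoids shape_invariants because the array object, not a growing tensor, is the loop variable. A minimal sketch, assuming a tf.function stands in for the layer; batch_indices is a hypothetical name.

```python
import tensorflow as tf

@tf.function(input_signature=[tf.TensorSpec(shape=[None, 1], dtype=tf.float32)])
def batch_indices(x):
    batch_size = tf.shape(x)[0]
    # One slot per sample; the size may be a tensor known only at run time.
    ta = tf.TensorArray(tf.int32, size=batch_size)
    for i in tf.range(batch_size):
        # write() returns the updated array, which must be reassigned.
        ta = ta.write(i, i)
    # stack() turns the collected scalars into a 1-D tensor.
    return ta.stack()

print(batch_indices(tf.zeros([4, 1])).numpy().tolist())  # [0, 1, 2, 3]
```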

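while

With a while statement there are only two things to watch out for. First, call tf.autograph.experimental.set_loop_options() when the shape of a variable differs before and after the while. Second, write an infinite loop not as

while True:

but as

while tf.constant(True):

Apart from that there are no problems.

The test program is as follows.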

class My_Layer(tf.keras.layers.Layer):
  def call(self,input):
    batch_size=tf.shape(input)[0]

    result1=tf.constant([0])
    result2=tf.constant([0])

    # when the shape does not change
    i=0
    while tf.constant(True): # "while True:" raises an error
      result1+=1

      i+=1
      if batch_size<=i:
        break

    # when the shape changes
    i=tf.constant(1)
    while batch_size>i:
      tf.autograph.experimental.set_loop_options(
          shape_invariants=[(result2,tf.TensorShape([None]))]
      )
      result2=tf.concat([result2,tf.expand_dims(i,axis=0)],axis=0)

      i+=1

    return tf.concat([result1,result2],axis=0)

m=tf.keras.Sequential(My_Layer(input_shape=(1,)))

data=[0,1,2,3,4]
m.predict(data,batch_size=5)

The result is

array([5, 0, 1, 2, 3, 4], dtype=int32)

Note that using True for the infinite loop produces

 OperatorNotAllowedInGraphError: using a `tf.Tensor` as a Python `bool` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.

The error message says that a tf.Tensor cannot be used as a Python bool here.
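For reference, here is a small sketch of a while loop whose condition is a tensor from the start, so no Python bool is involved. It assumes a tf.function stands in for the layer; sum_below_batch_size is a hypothetical name.

```python
import tensorflow as tf

@tf.function(input_signature=[tf.TensorSpec(shape=[None, 1], dtype=tf.float32)])
def sum_below_batch_size(x):
    batch_size = tf.shape(x)[0]
    i = tf.constant(0)
    total = tf.constant(0)
    # The condition compares two tensors, so the loop is built as a graph loop.
    while i < batch_size:
        total += i
        i += 1
    return total

print(int(sum_below_batch_size(tf.zeros([5, 1]))))  # 0+1+2+3+4 = 10
```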

tf.while_loop

Last is tf.while_loop.

This function has existed since TensorFlow 1.x, and there may not be many occasions to use it in 2.x.

It is a slightly unusual function: as arguments it takes

cond
body
loop_vars

Written as Python code, these three arguments work as follows.

def while_loop(cond, body, loop_vars):
  # cond(*vars) -> bool: whether the loop should continue
  # body(*vars) -> new vars: must return the same number of values,
  #                each with the same shape as the matching input
  temp=loop_vars

  while cond(*temp):
    temp=body(*temp)

  return temp

tf.while_loop behaves like this function. (cond, body and loop_vars are the arguments passed to tf.while_loop.)

If the shape of a variable listed in loop_vars changes during this loop, use the shape_invariants argument instead of tf.autograph.experimental.set_loop_options().

The test code is as follows.

class My_Layer(tf.keras.layers.Layer):
  def call(self,input):
    batch_size=tf.shape(input)[0]

    result1=tf.constant([0])
    result2=tf.constant([0])

    # when the shape does not change
    i=tf.constant(0)
    i,result1=tf.while_loop(
        lambda i,a: tf.less(i,batch_size),  # while i<batch_size
        lambda i,a: (tf.add(i,1),a+1),      # add 1 to i and 1 to a (result1)
        [i,result1]                         # the loop variables are i and result1
    )

    i=tf.constant(1)
    i,result2=tf.while_loop(
        lambda i,a: tf.less(i,batch_size),                                # while i<batch_size
        lambda i,a: (i+1,tf.concat([a,tf.expand_dims(i,axis=0)],axis=0)), # add 1 to i and append i to a (result2)
        [i,result2],                                                      # the loop variables are i and result2
        shape_invariants=[i.get_shape(),tf.TensorShape([None])]
    )
    return tf.concat([result1,result2],axis=0)


m=tf.keras.Sequential(My_Layer(input_shape=(1,)))


data=[0,1,2,3,4,5,6]
m.predict(data,batch_size=7)

Running this code gives

array([7, 0, 1, 2, 3, 4, 5, 6], dtype=int32)

If the shape changes in the loop but shape_invariants is not specified, you get

ValueError: Input tensor 'my__layer_65/Const_1:0' enters the loop with shape (1,), but has shape (2,) after one iteration. To allow the shape to vary across iterations, use the `shape_invariants` argument of tf.while_loop to specify a less-specific shape.

Here too, the error tells you to specify shape_invariants.
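The shape_invariants argument can also be tried in isolation. The following is a minimal sketch, assuming a plain tf.function instead of a layer; collect_indices is a hypothetical name, and the accumulator starts empty rather than as [0].

```python
import tensorflow as tf

@tf.function
def collect_indices(n):
    i = tf.constant(0)
    acc = tf.zeros([0], dtype=tf.int32)  # starts with shape (0,)
    i, acc = tf.while_loop(
        cond=lambda i, acc: i < n,
        body=lambda i, acc: (i + 1, tf.concat([acc, tf.expand_dims(i, axis=0)], axis=0)),
        loop_vars=[i, acc],
        # i keeps its scalar shape; acc may have any length along axis 0.
        shape_invariants=[i.get_shape(), tf.TensorShape([None])])
    return acc

print(collect_indices(tf.constant(4)).numpy().tolist())  # [0, 1, 2, 3]
```

One entry of shape_invariants is given per loop variable, in the same order as loop_vars.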

4. tf.scan()

Whereas tf.range() generates the list of numbers up to the batch size, tf.scan() directly accesses the elements of the variable that has the batch dimension. As a mental picture:

tf.range(0,tensor): for i in range(0,len(tensor))
tf.scan(): for element in tensor

That is the general idea.

The function works roughly like this.

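def scan(fn,elems):
  """
  fn: function taking (accumulated value, element)
  elems: tensor (of any rank)
  """
  acc=elems[0]  # without an initializer, the first element is the starting value
  results=[acc]
  for elem in elems[1:]:
    acc=fn(acc,elem)
    results.append(acc)

  return results  # tf.scan returns every intermediate value, stacked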

Does that give you a rough picture?

The test code is as follows.

class My_Layer(tf.keras.layers.Layer):
  def call(self,input):
    elems=input
    return tf.scan(lambda sum,elem: sum+elem,elems)

m=tf.keras.Sequential(My_Layer(input_shape=(1,)))
print(m.predict([1,2,3,4],batch_size=2),end="\n\n\n")

m=tf.keras.Sequential(My_Layer(input_shape=(2,)))
print(m.predict([[1,1],[3,3]],batch_size=2),end="\n\n\n")

m=tf.keras.Sequential(My_Layer(input_shape=(2,2)))
print(m.predict([[[0,0],[1,1]],[[2,2],[3,3]]],batch_size=2))

The results are as follows.

[[1.]
 [3.]
 [3.]
 [7.]]

[[1. 1.]
 [4. 4.]]

[[[0. 0.]
  [1. 1.]]

 [[2. 2.]
  [4. 4.]]]
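The first result may look surprising: with batch_size=2 the four inputs are split into two batches, and each batch is scanned on its own. The sketch below reproduces that by hand with eager tensors; it assumes no Keras model at all and simply calls tf.scan on slices.

```python
import tensorflow as tf

elems = tf.constant([1.0, 2.0, 3.0, 4.0])

# Scanning the whole tensor at once gives running sums.
whole = tf.scan(lambda acc, x: acc + x, elems)

# Two "batches" of two elements, each scanned independently.
first = tf.scan(lambda acc, x: acc + x, elems[:2])
second = tf.scan(lambda acc, x: acc + x, elems[2:])
batched = tf.concat([first, second], axis=0)

print(whole.numpy().tolist())    # [1.0, 3.0, 6.0, 10.0]
print(batched.numpy().tolist())  # [1.0, 3.0, 3.0, 7.0]
```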

Note that variables other than the elem argument can also be used.

class My_Layer(tf.keras.layers.Layer):
  def call(self,input):
    z=tf.constant(10,tf.float32)
    return tf.scan(lambda a,x: z*(a+x),input)

m=tf.keras.Sequential(My_Layer(input_shape=(1,)))
m.predict([1,2,3,4],batch_size=4)

The result is as follows.

array([[1.00e+00],
       [3.00e+01],
       [3.30e+02],
       [3.34e+03]], dtype=float32)

One drawback of tf.scan, however, is that the shape cannot be changed inside fn.

class My_Layer(tf.keras.layers.Layer):
  def call(self,input):
    return tf.scan(lambda a,x: tf.concat([a,x],axis=0),input)

m=tf.keras.Sequential(My_Layer(input_shape=(1,1)))

Running this code produces the following error.

ValueError: Inconsistent shapes: saw (2, 1) but expected (1, 1)

So if you want to change the shape, I think the safe approach is to apply something like tf.stack to the output.

class My_Layer(tf.keras.layers.Layer):
  def call(self,input):
    z=tf.scan(lambda a,x: x,input)
    return tf.stack(z,axis=0)

m=tf.keras.Sequential(My_Layer(input_shape=(1,1)))
m.predict([1,2,3,4],batch_size=4)
#-> array([[[1.]],
#
#          [[2.]],
#
#          [[3.]],
#
#          [[4.]]], dtype=float32)

5. Google Colab

I have made the code on this page runnable on Google Colab.

The URL is here -> https://colab.research.google.com/drive/1RB9J-cZm2HHPs87uD7p8_KIig5Y6dOtw?usp=sharing

If you want to try running these programs, open the URL above, save a copy to your own Google Drive, and play around with it.

That's all for today!
