truth - みる会図書館

    lr_images = []
    ground_truths = []
    for i in range(batch_size):
        cropped_image = tf.random_crop(
            image, (input_size, input_size, channels))
        ground_truth = tf.image.resize_images(
            cropped_image, (output_size, output_size))
        resized_image = tf.image.resize_images(
            cropped_image,
            (input_size // scale, input_size // scale),
            method=tf.image.ResizeMethod.BICUBIC)
        lr_image = tf.image.resize_images(
            resized_image, (input_size, input_size),
            method=tf.image.ResizeMethod.BICUBIC)
        ground_truths.append(ground_truth)
        lr_images.append(lr_image)
    lr_images = tf.stack(lr_images, axis=0)
    ground_truths = tf.stack(ground_truths, axis=0)
    return lr_images / 255, ground_truths / 255

The 33px image cropped at random (cropped_image) is downscaled to become the high-resolution image "ground_truth". In addition, cropped_image is downscaled and then upscaled again to create the low-resolution image "lr_image".

p. 48, Chapter 3: A Super-Resolution Struggle Diary
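The random cropping step described above can be illustrated without TensorFlow. The following is a minimal sketch in plain Python, assuming a grayscale image stored as a list of rows; the helper name `random_crop` and the 100x120 test image are hypothetical and only mimic what `tf.random_crop` does for a 2-D case.

```python
import random

def random_crop(image, size, rng=random):
    """Cut a size x size patch from a 2-D image given as a list of rows."""
    height, width = len(image), len(image[0])
    if size > height or size > width:
        raise ValueError("crop size exceeds image size")
    top = rng.randint(0, height - size)   # randint bounds are inclusive
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]

# Hypothetical 100x120 image whose pixel value encodes its position.
image = [[y * 120 + x for x in range(120)] for y in range(100)]

# Several 33px patches taken from the same decoded image.
patches = [random_crop(image, 33, random.Random(seed)) for seed in range(4)]
```

Every patch is a contiguous 33x33 window of the same source image, which is why one decode can serve several training samples.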

2. TensorFlowはじめました 2

                cropped_image, (output_size, output_size))
            resized_image = tf.image.resize_images(
                cropped_image,
                (input_size // scale, input_size // scale),
                method=tf.image.ResizeMethod.BICUBIC)
            lr_image = tf.image.resize_images(
                resized_image, (input_size, input_size),
                method=tf.image.ResizeMethod.BICUBIC)
            ground_truths.append(ground_truth)
            lr_images.append(lr_image)
        lr_images = tf.stack(lr_images, axis=0)
        ground_truths = tf.stack(ground_truths, axis=0)
        return lr_images / 255, ground_truths / 255

First, the loaded file is decoded with tf.image.decode_jpeg. Next, a region of the specified size is cropped at random (tf.random_crop), and the image cropped_image is downscaled to the specified size (output_size). This is ground_truth, the high-resolution image expected as the model's output. A low-resolution image (lr_image) is also made from cropped_image: it is downscaled once (resize) to the size given by scale and then upscaled again, which artificially lowers the resolution.

[Figure: Behavior of the function load_image. Labels: cropped_image 33x33, 11x11, lr_image 33x33, CNN, ground_truth 21x21]

p. 27, Chapter 2: Super-Resolution with a CNN
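The "shrink, then enlarge again" trick for producing a low-resolution input can be sketched in plain Python. This is only an illustration under stated assumptions: block averaging and nearest-neighbour lookup stand in for the bicubic resampling used in the listing, and the names `downscale_box`, `upscale_nearest` and `degrade` are hypothetical.

```python
def downscale_box(image, scale):
    """Shrink a square image by averaging each scale x scale block."""
    size = len(image) // scale
    return [[sum(image[y * scale + dy][x * scale + dx]
                 for dy in range(scale) for dx in range(scale)) / scale ** 2
             for x in range(size)]
            for y in range(size)]

def upscale_nearest(image, out_size):
    """Enlarge a square image to out_size x out_size by nearest neighbour."""
    in_size = len(image)
    return [[image[y * in_size // out_size][x * in_size // out_size]
             for x in range(out_size)]
            for y in range(out_size)]

def degrade(image, scale):
    """Downscale by `scale`, then upscale back to the original size."""
    return upscale_nearest(downscale_box(image, scale), len(image))

# A 33x33 checkerboard: the finest detail an image can contain.
cropped = [[float((x + y) % 2 * 255) for x in range(33)] for y in range(33)]
lr = degrade(cropped, 3)  # 33 -> 11 -> 33
```

The result has the same 33x33 shape as the input, but the checkerboard pattern has been averaged away, which is the loss of detail the model is asked to restore.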

3. TensorFlowはじめました 2

        filename_queue = tf.train.string_input_producer(file_list, shuffle=True)
        reader = tf.WholeFileReader()
        _, value = reader.read(filename_queue)
        image = tf.image.decode_jpeg(value, channels=channels)

        lr_images = []
        ground_truths = []
        for i in range(batch_size):
            cropped_image = tf.random_crop(
                image, (input_size, input_size, channels))
            lr_image = tf.image.resize_images(
                cropped_image,
                (input_size // scale, input_size // scale),
                method=tf.image.ResizeMethod.BICUBIC)
            lr_image = tf.image.resize_images(
                lr_image, (input_size, input_size),
                method=tf.image.ResizeMethod.BICUBIC)
            offset_left = (input_size - output_size) // 2
            offset_top = (input_size - output_size) // 2
            ground_truth = tf.image.crop_to_bounding_box(
                cropped_image, offset_top, offset_left,
                output_size, output_size)
            lr_images.append(lr_image)
            ground_truths.append(ground_truth)
        lr_images = tf.stack(lr_images, axis=0)
        ground_truths = tf.stack(ground_truths, axis=0)
        return lr_images / 255, ground_truths / 255

In addition, in eval.py, the slide width (stride) used when splitting the image into blocks is changed from MODEL.INPUT_SIZE to MODEL.OUTPUT_SIZE (List 3.6).

p. 50, Chapter 3: A Super-Resolution Struggle Diary
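The arithmetic behind the centered ground truth and the new stride can be checked with a few lines of plain Python. This is a sketch, not the book's eval.py (which is not shown in the excerpt): the helper names and the 96-pixel image width are hypothetical, while 33 and 21 are the input and output sizes used in the figures.

```python
def center_offset(input_size, output_size):
    """Margin between the input window and the centered output region."""
    return (input_size - output_size) // 2

def block_origins(length, input_size, stride):
    """Start positions of input windows along one axis of the image."""
    return list(range(0, length - input_size + 1, stride))

INPUT_SIZE, OUTPUT_SIZE = 33, 21
offset = center_offset(INPUT_SIZE, OUTPUT_SIZE)

# Sliding by OUTPUT_SIZE: the output regions tile the axis without gaps.
origins = block_origins(96, INPUT_SIZE, OUTPUT_SIZE)
covered = [(o + offset, o + offset + OUTPUT_SIZE) for o in origins]

# Sliding by INPUT_SIZE instead would leave uncovered strips between outputs.
gappy = [(o + offset, o + offset + OUTPUT_SIZE)
         for o in block_origins(96, INPUT_SIZE, INPUT_SIZE)]
```

Because each block now yields only the central 21 pixels, the windows have to advance by 21 pixels, not 33, for adjacent outputs to meet edge to edge.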

4. TensorFlowはじめました 2

    def __train(file_list, patches_count, train_dir):
        checkpoint_path = os.path.join(train_dir, 'model.ckpt')

        step_of_epoch = math.ceil(patches_count / FLAGS.batch_size)

        images, ground_truths = load_image(
            file_list,
            MODEL.INPUT_SIZE, MODEL.OUTPUT_SIZE,
            channels=MODEL.CHANNELS,
            scale=FLAGS.scale,
            batch_size=FLAGS.batch_size // 4)

        capacity = FLAGS.min_after_dequeue + 4 * FLAGS.batch_size
        lr_image_batch, ground_truth_batch = tf.train.shuffle_batch(
            [images, ground_truths],
            batch_size=FLAGS.batch_size,
            capacity=capacity,
            enqueue_many=True,
            min_after_dequeue=FLAGS.min_after_dequeue,
            num_threads=FLAGS.num_threads)

        sr_images = MODEL.inference(lr_image_batch)
        loss = __loss(sr_images, ground_truth_batch)

        global_step = tf.Variable(0, trainable=False)
        opt = __init_optimizer(FLAGS.learning_rate)
        train_op = opt.minimize(loss, global_step=global_step)

        config = tf.ConfigProto(
            allow_soft_placement=True,
            log_device_placement=FLAGS.log_device_placement)

        saver = tf.train.Saver(tf.global_variables())
        coord = tf.train.Coordinator()
        with tf.Session(config=config) as sess:
            checkpoint = tf.train.get_checkpoint_state(train_dir)
            if not (checkpoint and checkpoint.model_checkpoint_path):
                sess.run(tf.global_variables_initializer())
            else:

p. 30, Chapter 2: Super-Resolution with a CNN
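The bookkeeping numbers in this listing are easy to verify by hand. The sketch below reproduces them in plain Python using the flag defaults that appear in the excerpts (batch size 128, min_after_dequeue 30000); the patch count of 100000 is a made-up value for illustration, and the helper names are hypothetical.

```python
import math

def steps_per_epoch(patches_count, batch_size):
    """Number of mini-batches needed to see every patch once."""
    return math.ceil(patches_count / batch_size)

def queue_capacity(min_after_dequeue, batch_size):
    """Queue size: the shuffle reserve plus room for four mini-batches."""
    return min_after_dequeue + 4 * batch_size

batch_size = 128
min_after_dequeue = 30000
patches_count = 100000  # hypothetical number of training patches

steps = steps_per_epoch(patches_count, batch_size)
capacity = queue_capacity(min_after_dequeue, batch_size)

# load_image is called with batch_size // 4, so each decoded file
# contributes this many crops; on average a mini-batch then needs
# only a handful of decodes instead of one per sample.
crops_per_image = batch_size // 4
decodes_per_batch = batch_size // crops_per_image
```

With these values, roughly four files are decoded per mini-batch of 128, compared with 128 decodes if every sample came from its own file.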

5. TensorFlowはじめました 2

1. Loading the training data
2. Defining the loss function
3. Setting the optimization algorithm
4. Running the training

Loading the training data

The function load_image in List 2.3 handles image loading. Its arguments specify the list of image files to read (jpeg_file_list; named file_list in the listing), the image size fed to the model (input_size), and the image size the model outputs (output_size). The return value is a pair: the low-resolution images used for training (lr_images) and the high-resolution images (ground_truths), each containing the number of images specified by batch_size.

List 2.3: srcnn/image_loader_svs.py

    # coding: UTF-8
    from __future__ import absolute_import
    from __future__ import division
    from __future__ import print_function

    import tensorflow as tf


    def load_image(file_list, input_size, output_size,
                   channels=1, scale=2, batch_size=1):
        with tf.name_scope('image_loader_svs'):
            filename_queue = tf.train.string_input_producer(
                file_list, shuffle=True)
            reader = tf.WholeFileReader()
            _, value = reader.read(filename_queue)
            image = tf.image.decode_jpeg(value, channels=channels)

            lr_images = []
            ground_truths = []
            for i in range(batch_size):
                cropped_image = tf.random_crop(
                    image, (input_size, input_size, channels))
                ground_truth = tf.image.resize_images(

p. 26, Chapter 2: Super-Resolution with a CNN

6. TensorFlowはじめました 2

... consider a different optimization algorithm.

List 2.5: srcnn/train.py

    def __init_optimizer(learning_rate):
        opt = tf.train.AdamOptimizer(learning_rate)
        return opt

Running the training

The function __train trains the model using the programs written so far (List 2.6). load_image reads the training images and yields images and ground_truths. Next, the training data, grouped into mini-batches by tf.train.shuffle_batch, is passed to MODEL.inference. The result (sr_images) is the image after super-resolution processing. The error between sr_images and ground_truth_batch is computed (__loss), and opt.minimize updates the parameters so that the error becomes smaller.

List 2.6: srcnn/train.py

    MODEL = model915  # set the model to use

    FLAGS = tf.app.flags.FLAGS
    tf.app.flags.DEFINE_string('image_dir', None,
                               "Directory of training images")
    tf.app.flags.DEFINE_integer('batch_size', 128,
                                "Mini-batch size")
    tf.app.flags.DEFINE_float('learning_rate', 0.001,
                              "Learning rate")
    tf.app.flags.DEFINE_string('train_dir', './train_sr',
                               "Directory in which to save training results")
    tf.app.flags.DEFINE_integer('scale', ...
    tf.app.flags.DEFINE_integer('max_step', -1,
                                "Maximum number of training steps")
    tf.app.flags.DEFINE_integer('num_threads', 16,
                                "Number of processing threads")
    tf.app.flags.DEFINE_integer('min_after_dequeue', 30000,
                                "Number of samples at which dequeueing starts")
    tf.app.flags.DEFINE_boolean('log_device_placement', False,
                                "Whether to show the device each op runs on")

p. 29, Chapter 2: Super-Resolution with a CNN
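To give a feel for what "updating the parameters so that the error becomes smaller" means with Adam, here is a minimal sketch of the textbook Adam update rule in plain Python, applied to a single scalar parameter. It is an illustration only, not TensorFlow's internal implementation; the function name `adam_step` and the toy objective f(x) = (x - 3)^2 are hypothetical, and the default hyperparameters mirror the documented defaults of tf.train.AdamOptimizer.

```python
import math

def adam_step(param, grad, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3).
x, m, v = 0.0, 0.0, 0.0
first_x = None
for t in range(1, 5001):
    grad = 2 * (x - 3)
    x, m, v = adam_step(x, grad, m, v, t, lr=0.01)
    if t == 1:
        first_x = x  # the first step moves by about lr, whatever the gradient size
```

The first update moves the parameter by roughly the learning rate regardless of how large the gradient is, which is one reason Adam is a convenient first choice.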

7. TensorFlowはじめました 2

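[Figure 3.9: Behavior of image_loader_svs.py. Labels: cropped_image 33x33, 11x11, lr_image 33x33, CNN, ground_truth 21x21]

As we have seen so far, super-resolution requires taking in features from the surrounding area. However, even with padding set on the convolution layers, noise appears at the block boundaries. So we change the loader to read the padding region as well during training. Rather than having conv2d add zero padding, we read the image larger by the number of pixels that will be lost (Figure 3.10).

[Figure 3.10: Behavior after the change. Labels: cropped_image 33x33, lr_image 33x33, CNN, ground_truth 21x21]

List 3.5 is the program after the change.

List 3.5: srcnn/image_loader_vvv.py

    def load_image(file_list, input_size, output_size,
                   channels=1, scale=2, batch_size=1):
        with tf.name_scope('image_loader_vvv'):

p. 49, Chapter 3: A Super-Resolution Struggle Diary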
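The "pixels that will be lost" can be counted directly: a convolution without padding shrinks each side by kernel size minus one. The sketch below assumes the kernel sizes 9, 1 and 5 implied by the model name model915 (and 9, 5, 5 for model955); the helper name `valid_output_size` is hypothetical.

```python
def valid_output_size(input_size, kernel_sizes):
    """Output width after stacking unpadded, stride-1 convolutions."""
    size = input_size
    for kernel in kernel_sizes:
        size -= kernel - 1  # each layer trims (kernel - 1) pixels in total
    return size

output_915 = valid_output_size(33, [9, 1, 5])
output_955 = valid_output_size(33, [9, 5, 5])
margin_915 = (33 - output_915) // 2  # pixels lost on each side
```

For the 9-1-5 case this reproduces the 33x33 input and 21x21 output shown in the figures, with a 6-pixel margin on every side.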

8. TensorFlowはじめました 2

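Each image is a Tensor of shape [height, width, channel]. By concatenating the list that holds them with stack, we obtain a Tensor of shape [batch_size, height, width, channel]. Also, some machine-learning training algorithms do not work effectively when the range of the data values is too wide. For that reason, each value is divided by 255 at the end, normalizing the pixel range (0 to 255) to 0.0 to 1.0.

Number of samples cropped from one image

The reason a for loop is used to crop multiple images from a single image is to reduce the CPU load. At first, the author did not use for and cropped one training image from each sample image. However, when the program was run in the test environment, the CPU and I/O became heavily loaded and the GPU's performance could not be exploited. For example, when training with mini-batches of 128, if only one training image is cropped from each large sample image, a single batch incurs 128 file reads and decodes. By decoding one image and then taking a batch of samples from it, as in List 2.3, the CPU and I/O load dropped and the GPU could perform to its potential.

Defining the loss function

The loss function computes how far the model's output for a given input is from the expected output. The function __loss computes the error using MSE (Mean Squared Error), which is common when handling images (List 2.4).

List 2.4: srcnn/train.py

    def __loss(sr_images, ground_truth):
        return tf.reduce_mean(
            tf.squared_difference(sr_images, ground_truth))

Setting the optimization algorithm

The optimization algorithm updates the parameters, based on the error computed by the loss function, so that the error becomes smaller. The function __init_optimizer sets AdamOptimizer as the optimization algorithm (List 2.5).

Adam (Optimizer) is an optimization algorithm published in 2014. When training, the author tries Adam first and, only if the required accuracy or performance cannot be reached, considers a different optimization algorithm.

p. 28, Chapter 2: Super-Resolution with a CNN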
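The normalization and the MSE loss described on this page are simple enough to compute by hand. The following plain-Python sketch works on tiny 2x2 images stored as lists of rows; the pixel values and the helper names `normalize` and `mse` are made up for illustration and stand in for the tensor operations in the listings.

```python
def normalize(image):
    """Map pixel values from the 0-255 range to 0.0-1.0."""
    return [[pixel / 255 for pixel in row] for row in image]

def mse(a, b):
    """Mean squared error between two images of the same shape."""
    diffs = [(x - y) ** 2
             for row_a, row_b in zip(a, b)
             for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

# Hypothetical 2x2 model output and its expected output.
sr = normalize([[0, 255], [128, 64]])
gt = normalize([[0, 255], [255, 64]])

error = mse(sr, gt)  # only one of the four pixels differs
```

Because the values are normalized first, the error stays in a small, scale-independent range: here a single pixel that is off by 127 levels contributes (127/255)^2 / 4 to the mean.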

9. TensorFlowはじめました 2

        extra_ops = []
        sr_images = MODEL.inference(
            lr_image_batch, is_train=True, extra_ops=extra_ops)
        loss = __loss(sr_images, ground_truth_batch)

        global_step = tf.Variable(0, trainable=False)
        opt = __init_optimizer(FLAGS.learning_rate)
        train_op = opt.minimize(loss, global_step=global_step)
        train_op = tf.group(train_op, *extra_ops)

Figure 4.13 shows how the error changes when training under the conditions in Table 4.5.

Table 4.5: Training conditions

    Condition                          Value
    Image scale (scale)                2
    Learning rate (learning_rate)      (not legible)
    Mini-batch size (batch_size)       (not legible)
    Number of steps (max_step)         (not legible)
    Number of channels (CHANNELS)      3 (RGB color)

[Figure 4.13: Change in the error (smoothing=0.9)]

Compared with the model without BatchNormalization, the variation in the error appears to be somewhat smaller.

Figure 4.14 shows the result of enlarging the evaluation image 2x and then applying super-resolution with the trained model.

p. 73, Chapter 4: Various Models

10. TensorFlowはじめました 2

    if __name__ == '__main__':
        tf.app.run()

Table 4.2: Comparison of image quality metrics (? marks a digit that is not legible)

    Model               SSIM     PSNR
    BICUBIC             0.85     21.??
    model915-sigmoid    0.?18    23.20
    model955-sigmoid    0.?29    23.18

Table 4.2 shows the results of comparing Model 9-1-5 and Model 9-5-5 against the high-resolution images. PSNR is slightly lower. SSIM, on the other hand, is higher, but both differences seem to be within the margin of error.

Color image support

We change the model so that it can process 3-channel (RGB) color images. List 4.3 builds a model that supports 3 channels (RGB) by changing the constant CHANNELS. The function inference is the same as in List 4.1.

List 4.3: srcnn/model/model955_sigmoid_color.py

    NAME = 'model955_sigmoid_color'

    INPUT_SIZE = 33
    OUTPUT_SIZE = INPUT_SIZE ...
    CHANNELS = 3

    # The function inference is the same as in model955_sigmoid.py

Recall that when loading images, the number of channels was passed in the argument channels (List 4.4). When the model's constant is changed, images are automatically loaded as 3-channel images.

List 4.4: srcnn/train.py

    def __train(file_list, patches_count, train_dir):
        checkpoint_path = os.path.join(train_dir, 'model.ckpt')

        step_of_epoch = math.ceil(patches_count / FLAGS.batch_size)

        images, ground_truths = load_image(
            file_list,

p. 65, Chapter 4: Various Models
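For readers unfamiliar with the PSNR column in Table 4.2, the metric follows directly from the mean squared error: PSNR = 10 * log10(MAX^2 / MSE), in decibels. The sketch below is a plain-Python illustration on made-up 2x2 images with pixel values normalized to 0.0-1.0; the function name `psnr` is hypothetical and this is not the book's evaluation code, which is not shown in the excerpt.

```python
import math

def psnr(reference, test, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    diffs = [(x - y) ** 2
             for row_r, row_t in zip(reference, test)
             for x, y in zip(row_r, row_t)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)

# Hypothetical images: every pixel of `test` is off by 0.1.
reference = [[0.5, 0.5], [0.5, 0.5]]
test = [[0.6, 0.6], [0.6, 0.6]]

value = psnr(reference, test)  # MSE is 0.01, so about 20 dB
```

A higher PSNR means a smaller pixel-wise error, so a difference of a few hundredths of a decibel, as between the two models in the table, corresponds to a very small change in MSE.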