Upsampling for 2D convolution with TensorFlow

A convolutional auto-encoder is usually composed of two symmetric parts: the encoder and the decoder. With TensorFlow, the encoder is easy to build using modules such as tf.contrib.layers or tf.nn, which provide convolution, downsampling (pooling), and dense operations.

For the decoder, however, these modules do not provide a dedicated upsampling method, i.e., the reverse of the downsampling operations (avg_pool2d, max_pool2d). A likely reason is that max pooling is used more often than average pooling, and recovering an image from a max-pooled matrix is difficult because the locations of the maxima are lost.
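
To make this loss of location concrete, here is a minimal sketch in plain Python (no TensorFlow; max_pool_2x2 is a hypothetical helper introduced only for illustration). Two inputs whose maxima sit in opposite corners pool to the same value, so the pooled result alone cannot tell a decoder where the maximum was.

```python
def max_pool_2x2(img):
    """2x2 max pooling with stride 2 on a list-of-lists image.

    Assumes the image has even height and width.
    """
    return [
        [max(img[r][c], img[r][c + 1], img[r + 1][c], img[r + 1][c + 1])
         for c in range(0, len(img[0]), 2)]
        for r in range(0, len(img), 2)
    ]

# The maximum is in the top-left corner here ...
top_left = [[9, 0],
            [0, 0]]
# ... and in the bottom-right corner here.
bottom_right = [[0, 0],
                [0, 9]]

pooled_a = max_pool_2x2(top_left)
pooled_b = max_pool_2x2(bottom_right)
print(pooled_a, pooled_b)  # [[9]] [[9]]
```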

For average-pooled feature maps, there is a simple way to implement upsampling using only the basic functions of TF itself, without a high-level API like Keras.

Suppose the input is a 4-D tensor of shape [1, 4, 4, 1] and the sampling rate is [1, 2, 2, 1]; the upsampled result is then also a 4-D tensor, of shape [1, 8, 8, 1]. The following lines implement this operation.

import tensorflow as tf  # TensorFlow 1.x (graph mode with tf.Session)

# Input feature map in NHWC layout: [batch, height, width, channels].
x = tf.ones([1, 4, 4, 1])
# For conv2d_transpose the filter shape is [rows, cols, depth_output, depth_in].
k = tf.ones([2, 2, 1, 1])
output_shape = [1, 8, 8, 1]
y = tf.nn.conv2d_transpose(
    value=x,
    filter=k,
    output_shape=output_shape,
    strides=[1, 2, 2, 1],
    padding='SAME'
)
with tf.Session() as sess:
    print(sess.run(y))

Then y is the upsampled tensor.
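
For intuition about what such a call computes, the following framework-free sketch mimics it for the single-channel case where the kernel size equals the stride, so that output blocks never overlap. The name transposed_conv_2d is hypothetical and this is not TensorFlow's implementation: each input value is multiplied by the kernel and written into its own block of the output.

```python
def transposed_conv_2d(x, k, stride):
    """Scatter each x[i][j] * k into the output block starting at (i*stride, j*stride).

    Assumes a single channel and a square kernel whose size equals the stride,
    so the output is exactly `stride` times larger than the input.
    """
    h, w = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    assert kh == stride and kw == stride, "sketch only covers kernel size == stride"
    out = [[0] * (w * stride) for _ in range(h * stride)]
    for i in range(h):
        for j in range(w):
            for a in range(kh):
                for b in range(kw):
                    out[i * stride + a][j * stride + b] += x[i][j] * k[a][b]
    return out

small = [[1, 2],
         [3, 4]]
ones_kernel = [[1, 1],
               [1, 1]]
upsampled = transposed_conv_2d(small, ones_kernel, stride=2)
for row in upsampled:
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]
```

With an all-ones kernel every value is copied into a 2x2 block, so average-pooling the result with the same 2x2 window returns the original input.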

You can also upsample with the resize_images function of the tf.image module:

y = tf.image.resize_images(
    images=x,
    size=[8, 8],  # [new_height, new_width] only; batch and channels are unchanged
    method=tf.image.ResizeMethod.NEAREST_NEIGHBOR
)
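
As a rough picture of nearest-neighbor resizing for an integer scale factor, the sketch below uses plain Python lists (nearest_resize is a hypothetical helper, not the TensorFlow function). It maps each output pixel back to the input pixel at the floor of its scaled coordinate, which for 2x upsampling replicates every value into a 2x2 block.

```python
def nearest_resize(img, out_h, out_w):
    """Nearest-neighbor resize of a list-of-lists image.

    Output pixel (r, c) copies input pixel (r * in_h // out_h, c * in_w // out_w).
    """
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

source = [[1, 2],
          [3, 4]]
resized = nearest_resize(source, 4, 4)
for row in resized:
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]
```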

Enjoy yourself.

References

[1] Transposed convolution arithmetic

Jason Ma