The HED network described in the paper is a general-purpose edge detection network. Following the paper, the image produced at every scale participates in the cost computation. The corresponding code:

```python
input_queue_for_train = tf.train.string_input_producer([FLAGS.csv_path])
image_tensor, annotation_tensor = input_image_pipeline(dataset_root_dir_string, input_queue_for_train, FLAGS.batch_size)
dsn_fuse, dsn1, dsn2, dsn3, dsn4, dsn5 = hed_net(image_tensor, FLAGS.batch_size)

cost = class_balanced_sigmoid_cross_entropy(dsn_fuse, annotation_tensor) + \
       class_balanced_sigmoid_cross_entropy(dsn1, annotation_tensor) + \
       class_balanced_sigmoid_cross_entropy(dsn2, annotation_tensor) + \
       class_balanced_sigmoid_cross_entropy(dsn3, annotation_tensor) + \
       class_balanced_sigmoid_cross_entropy(dsn4, annotation_tensor) + \
       class_balanced_sigmoid_cross_entropy(dsn5, annotation_tensor)
```

A network trained this way detects edge lines that are somewhat thick. To obtain thinner edge lines, an optimized scheme was found after repeated experiments:

```python
input_queue_for_train = tf.train.string_input_producer([FLAGS.csv_path])
image_tensor, annotation_tensor = input_image_pipeline(dataset_root_dir_string, input_queue_for_train, FLAGS.batch_size)
dsn_fuse, _, _, _, _, _ = hed_net(image_tensor, FLAGS.batch_size)

cost = class_balanced_sigmoid_cross_entropy(dsn_fuse, annotation_tensor)
```

That is, the images produced at the individual scales no longer participate in the cost computation; only the final fused image is used. The figure below compares the results of the two cost formulations; the right side shows the optimized version.
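For context, the sketch below shows how a cost like the fused-only one above might be minimized in a standard TF 1.x queue-based training loop. It is only an illustration: the optimizer choice, learning rate, and FLAGS.max_steps are assumptions, not values from the original project.

```python
import tensorflow as tf

# Hypothetical training loop; optimizer, learning rate and step count are assumed.
optimizer = tf.train.AdamOptimizer(learning_rate=1e-4)
train_step = optimizer.minimize(cost)  # 'cost' is the fused-only loss defined above

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Start the threads that feed tf.train.string_input_producer.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    for step in range(FLAGS.max_steps):  # FLAGS.max_steps is an assumed flag
        _, loss_value = sess.run([train_step, cost])
    coord.request_stop()
    coord.join(threads)
```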
One more point: following the HED paper, the cost must not be the commonly used mean-squared-error style cost; a cost-sensitive loss function should be used instead. The code:

```python
def class_balanced_sigmoid_cross_entropy(logits, label, name='class_balanced_cross_entropy'):
    """
    The class-balanced cross entropy loss, as in
    `Holistically-Nested Edge Detection <https://arxiv.org/abs/1504.06375>`_.
    This is more numerically stable than class_balanced_cross_entropy.

    :param logits: the logits.
    :param label: the ground truth in {0, 1}, of the same shape as logits.
    :returns: a scalar. class-balanced cross entropy loss
    """
    y = tf.cast(label, tf.float32)

    count_neg = tf.reduce_sum(1. - y)  # the number of 0 in y
    count_pos = tf.reduce_sum(y)       # the number of 1 in y (less than count_neg)
    beta = count_neg / (count_neg + count_pos)

    pos_weight = beta / (1 - beta)
    cost = tf.nn.weighted_cross_entropy_with_logits(logits, y, pos_weight)
    cost = tf.reduce_mean(cost * (1 - beta), name=name)

    return cost
```

Bilinear initialization of the transposed convolution layers

I had been stuck on this issue for quite a while when experimenting with the FCN network. FCN requires that the kernels of the transposed convolution (deconvolution) layers be initialized as a bilinear upsampling kernel rather than with the usual random normal initialization, and that a very small learning rate be used; only then does the model converge more easily. The HED paper does not explicitly require initializing the transposed convolution layers this way, but during training I found that the model converges more easily with this kind of initialization. The code (a hedged usage sketch for these weights follows at the end of this section):

```python
import numpy as np


def get_kernel_size(factor):
    """
    Find the kernel size given the desired factor of upsampling.
    """
    return 2 * factor - factor % 2


def upsample_filt(size):
    """
    Make a 2D bilinear kernel suitable for upsampling of the given (h, w) size.
    """
    factor = (size + 1) // 2
    if size % 2 == 1:
        center = factor - 1
    else:
        center = factor - 0.5
    og = np.ogrid[:size, :size]
    return (1 - abs(og[0] - center) / factor) * \
           (1 - abs(og[1] - center) / factor)


def bilinear_upsample_weights(factor, number_of_classes):
    """
    Create weights matrix for transposed convolution with bilinear filter initialization.
    """
    filter_size = get_kernel_size(factor)

    weights = np.zeros((filter_size,
                        filter_size,
                        number_of_classes,
                        number_of_classes), dtype=np.float32)

    upsample_kernel = upsample_filt(filter_size)

    for i in range(number_of_classes):
        weights[:, :, i, i] = upsample_kernel

    return weights
```

Cold start of the training process

Unlike VGG, the HED network does not enter a converged state easily, nor does it easily reach the desired ideal state, for two main reasons:
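Referring back to the bilinear-initialization section above, the sketch below shows how such weights would typically seed a tf.nn.conv2d_transpose kernel. It is illustrative only: the variable names, tensor shapes, and the upsampling factor are assumptions, not taken from the original HED code.

```python
import tensorflow as tf

# Illustrative values; the real network's shapes and upsampling factor may differ.
upsample_factor = 2
number_of_classes = 1
batch_size, in_h, in_w = 4, 32, 32

# Stand-in for a feature map produced by an earlier layer.
feature_map = tf.placeholder(tf.float32, [batch_size, in_h, in_w, number_of_classes])

# Seed the transposed-convolution kernel with the bilinear weights defined above.
initial_kernel = bilinear_upsample_weights(upsample_factor, number_of_classes)
deconv_filter = tf.Variable(initial_kernel, name='deconv_bilinear_filter')

upsampled = tf.nn.conv2d_transpose(
    feature_map, deconv_filter,
    output_shape=[batch_size, in_h * upsample_factor, in_w * upsample_factor, number_of_classes],
    strides=[1, upsample_factor, upsample_factor, 1],
    padding='SAME')
```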