```python
with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    plt.ion()
    plt.figure()
    plt.show()
    loss_list = []

    for epoch_idx in range(num_epochs):
        x, y = generateData()
        _current_state = np.zeros((batch_size, state_size))

        print("New data, epoch", epoch_idx)

        for batch_idx in range(num_batches):
            start_idx = batch_idx * truncated_backprop_length
            end_idx = start_idx + truncated_backprop_length

            batchX = x[:, start_idx:end_idx]
            batchY = y[:, start_idx:end_idx]

            _total_loss, _train_step, _current_state, _predictions_series = sess.run(
                [total_loss, train_step, current_state, predictions_series],
                feed_dict={
                    batchX_placeholder: batchX,
                    batchY_placeholder: batchY,
                    init_state: _current_state
                })

            loss_list.append(_total_loss)

            if batch_idx % 100 == 0:
                print("Step", batch_idx, "Loss", _total_loss)
                plot(loss_list, _predictions_series, batchX, batchY)

plt.ioff()
plt.show()
```

Notice in the batch loop above (the lines that compute start_idx and end_idx and slice out batchX and batchY) that the window moves forward by truncated_backprop_length steps on each iteration, although other stride values are possible. The drawback of this scheme is that truncated_backprop_length must be significantly larger than the time dependency (three steps in this article) in order to encapsulate the relevant training data; otherwise a lot of useful information may be lost, as illustrated in Figure 6 further below.
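Before moving on to Figure 6, the sketch below makes the windowing concrete. It is a minimal, self-contained NumPy illustration of the indexing only, reusing the constant names from the full program later in this article; it is not part of the article's code.

```python
import numpy as np

truncated_backprop_length = 15
echo_step = 3
batch_size = 5
total_series_length = 50000
num_batches = total_series_length // batch_size // truncated_backprop_length

# One long binary series, reshaped so each row is an independent sub-series
x = np.random.choice(2, total_series_length).reshape((batch_size, -1))

for batch_idx in range(num_batches):
    start_idx = batch_idx * truncated_backprop_length  # stride == window width
    end_idx = start_idx + truncated_backprop_length
    batchX = x[:, start_idx:end_idx]                   # shape (batch_size, truncated_backprop_length)
    # The first echo_step columns of every window (except the very first)
    # echo inputs that live in the *previous* window; only the hidden state
    # carried over between sess.run calls preserves that information.
```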
Figure 6: Schematic of the data. The time series is drawn as a row of squares; the raised black square marks an echo output, produced three activation steps after its input echo (black square). The sliding batch window also strides three steps on each run, and in this example it has no preceding batch data with which to encapsulate the dependency, so that dependency cannot be trained.

Note that this article uses only a simple example to explain how an RNN works, one that can easily be implemented in a few lines of code. The network will learn the echo behavior exactly, so no test data is needed.

During training, the program updates its charts in real time, as shown in Figure 7. The blue bars are the input signal used for training, the red bars are the training target (the output echo), and the green bars are the echo predicted by the RNN. The separate bar plots show the predicted echo for the different series in the current batch.

Our algorithm finishes the training task quite quickly. The chart in the upper-left corner plots the loss, but why are there spikes in the curve? The answer is just below.
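Before getting to that answer, the "no test data is needed" claim above can be checked directly from how the data is constructed: the label series is just the input series delayed by echo_step, so a perfect mapping exists and there is nothing to hold out. A minimal sketch of the relationship that generateData (in the full listing below) sets up:

```python
import numpy as np

echo_step = 3
x = np.random.choice(2, 20, p=[0.5, 0.5])
y = np.roll(x, echo_step)   # shift the series echo_step steps to the right
y[0:echo_step] = 0          # the first echo_step labels have nothing to echo

# Every label from index echo_step onward equals the input echo_step steps earlier.
assert np.array_equal(y[echo_step:], x[:-echo_step])
```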
Figure 7: The panels show the loss, the training input and output data (blue and red), and the predicted echo (green).

The spikes appear because new data is generated at the start of each epoch. Because the matrix is reshaped, the first element of each row ends up adjacent to the last element of the previous row. But the first few elements of every row (except the first) carry dependencies that are not contained in the state, so the network always predicts poorly on the very first batches.

The whole program

This is the complete program implementing the RNN; you can simply copy, paste, and run it. Note that the listing targets the pre-1.0 TensorFlow API, hence calls such as tf.unpack, tf.concat(1, ...), and tf.initialize_all_variables, which were later renamed.

```python
from __future__ import print_function, division
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

num_epochs = 100
total_series_length = 50000
truncated_backprop_length = 15
state_size = 4
num_classes = 2
echo_step = 3
batch_size = 5
num_batches = total_series_length//batch_size//truncated_backprop_length

def generateData():
    x = np.array(np.random.choice(2, total_series_length, p=[0.5, 0.5]))
    y = np.roll(x, echo_step)
    y[0:echo_step] = 0

    x = x.reshape((batch_size, -1))  # The first index changing slowest, subseries as rows
    y = y.reshape((batch_size, -1))

    return (x, y)

batchX_placeholder = tf.placeholder(tf.float32, [batch_size, truncated_backprop_length])
batchY_placeholder = tf.placeholder(tf.int32, [batch_size, truncated_backprop_length])

init_state = tf.placeholder(tf.float32, [batch_size, state_size])

W = tf.Variable(np.random.rand(state_size+1, state_size), dtype=tf.float32)
b = tf.Variable(np.zeros((1, state_size)), dtype=tf.float32)

W2 = tf.Variable(np.random.rand(state_size, num_classes), dtype=tf.float32)
b2 = tf.Variable(np.zeros((1, num_classes)), dtype=tf.float32)

# Unpack columns
inputs_series = tf.unpack(batchX_placeholder, axis=1)
labels_series = tf.unpack(batchY_placeholder, axis=1)

# Forward pass
current_state = init_state
states_series = []
for current_input in inputs_series:
    current_input = tf.reshape(current_input, [batch_size, 1])
    input_and_state_concatenated = tf.concat(1, [current_input, current_state])  # Increasing number of columns

    next_state = tf.tanh(tf.matmul(input_and_state_concatenated, W) + b)  # Broadcasted addition
    states_series.append(next_state)
    current_state = next_state

logits_series = [tf.matmul(state, W2) + b2 for state in states_series]  # Broadcasted addition
predictions_series = [tf.nn.softmax(logits) for logits in logits_series]

losses = [tf.nn.sparse_softmax_cross_entropy_with_logits(logits, labels)
          for logits, labels in zip(logits_series, labels_series)]
total_loss = tf.reduce_mean(losses)

train_step = tf.train.AdagradOptimizer(0.3).minimize(total_loss)

def plot(loss_list, predictions_series, batchX, batchY):
    plt.subplot(2, 3, 1)
    plt.cla()
    plt.plot(loss_list)

    for batch_series_idx in range(5):
        one_hot_output_series = np.array(predictions_series)[:, batch_series_idx, :]
        single_output_series = np.array([(1 if out[0] < 0.5 else 0) for out in one_hot_output_series])

        plt.subplot(2, 3, batch_series_idx + 2)
        plt.cla()
        plt.axis([0, truncated_backprop_length, 0, 2])
        left_offset = range(truncated_backprop_length)
        plt.bar(left_offset, batchX[batch_series_idx, :], width=1, color="blue")
        plt.bar(left_offset, batchY[batch_series_idx, :] * 0.5, width=1, color="red")
        plt.bar(left_offset, single_output_series * 0.3, width=1, color="green")

    plt.draw()
    plt.pause(0.0001)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    plt.ion()
    plt.figure()
    plt.show()
    loss_list = []

    for epoch_idx in range(num_epochs):
        x, y = generateData()
        _current_state = np.zeros((batch_size, state_size))

        print("New data, epoch", epoch_idx)

        for batch_idx in range(num_batches):
            start_idx = batch_idx * truncated_backprop_length
            end_idx = start_idx + truncated_backprop_length

            batchX = x[:, start_idx:end_idx]
            batchY = y[:, start_idx:end_idx]

            _total_loss, _train_step, _current_state, _predictions_series = sess.run(
                [total_loss, train_step, current_state, predictions_series],
                feed_dict={
                    batchX_placeholder: batchX,
                    batchY_placeholder: batchY,
                    init_state: _current_state
                })

            loss_list.append(_total_loss)

            if batch_idx % 100 == 0:
                print("Step", batch_idx, "Loss", _total_loss)
                plot(loss_list, _predictions_series, batchX, batchY)

plt.ioff()
plt.show()
```
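As a closing aid, here is a NumPy-only re-implementation of the unrolled forward pass for a single batch. It is a sketch for understanding what the TensorFlow graph above computes, not the author's code; the softmax helper, the random seed, and the randomly initialized (untrained) weights are assumptions for illustration, so the resulting predictions are meaningless until trained.

```python
import numpy as np

batch_size, truncated_backprop_length = 5, 15
state_size, num_classes = 4, 2

rng = np.random.default_rng(0)
batchX = rng.integers(0, 2, size=(batch_size, truncated_backprop_length)).astype(np.float32)

# Untrained weights, shaped like the tf.Variable definitions above
W = rng.random((state_size + 1, state_size)).astype(np.float32)
b = np.zeros((1, state_size), dtype=np.float32)
W2 = rng.random((state_size, num_classes)).astype(np.float32)
b2 = np.zeros((1, num_classes), dtype=np.float32)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

state = np.zeros((batch_size, state_size), dtype=np.float32)  # init_state
predictions = []
for t in range(truncated_backprop_length):
    current_input = batchX[:, t:t + 1]                         # shape (batch_size, 1)
    concatenated = np.concatenate([current_input, state], axis=1)
    state = np.tanh(concatenated @ W + b)                      # the recurrence
    predictions.append(softmax(state @ W2 + b2))

print(np.array(predictions).shape)  # (truncated_backprop_length, batch_size, num_classes)
```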