Abstract: Most current speech recognition systems use hidden Markov models (HMMs) to deal with the temporal variability of speech, and Gaussian mixture models (GMMs) to determine how well each state of each HMM fits a frame, or a short window of frames, of coefficients that represents the acoustic input. An alternative way to evaluate the fit is to use a feed-forward neural network that takes several frames of coefficients as input and produces posterior probabilities over HMM states as output. Deep neural networks with many hidden layers, trained using new methods, have been shown to outperform GMMs on a variety of speech recognition benchmarks, sometimes by a large margin. This paper provides an overview of this progress, describing the recent successes of four research groups in acoustic modeling for speech recognition. (A minimal code sketch of this DNN-HMM hybrid appears at the end of this post.)

Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition (2012), G. Dahl et al.
Acoustic modeling using deep belief networks (2012), A. Mohamed et al.

Reinforcement Learning

End-to-end training of deep visuomotor policies (2016), S. Levine et al.
Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection (2016), S. Levine et al.
Asynchronous methods for deep reinforcement learning (2016), V. Mnih et al.
Deep Reinforcement Learning with Double Q-Learning (2016), H. van Hasselt et al.
Mastering the game of Go with deep neural networks and tree search (2016), D. Silver et al.

Abstract: The game of Go has long been viewed as the most challenging of classic games for artificial intelligence, owing to its enormous search space and the difficulty of evaluating board positions and moves. This paper introduces a new approach that uses "value networks" to evaluate board positions and "policy networks" to select moves. These deep neural networks are trained by supervised learning from human expert games, and then by reinforcement learning from games of self-play. Even without any lookahead search, the resulting networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. The paper also introduces a new search algorithm that combines Monte Carlo simulation with the value and policy networks. Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs and defeated the human European champion 5:0. This is the first time a computer program has defeated a professional human player in a full-sized game of Go, a feat previously thought to be at least a decade away. (A minimal sketch of the search's leaf evaluation appears at the end of this post.)

Continuous control with deep reinforcement learning (2015), T. Lillicrap et al.
Human-level control through deep reinforcement learning (2015), V. Mnih et al.
Deep learning for detecting robotic grasps (2015), I. Lenz et al.
Playing Atari with deep reinforcement learning (2013), V. Mnih et al.

Understanding / Generalization / Transfer: more papers from 2016

Layer Normalization (2016), J. Ba et al.
Learning to learn by gradient descent by gradient descent (2016), M. Andrychowicz et al.
Domain-adversarial training of neural networks (2016), Y. Ganin et al.
WaveNet: A Generative Model for Raw Audio (2016), A. van den Oord et al.
Colorful image colorization (2016), R. Zhang et al.
Generative visual manipulation on the natural image manifold (2016), J. Zhu et al.
Texture networks: Feed-forward synthesis of textures and stylized images (2016), D. Ulyanov et al.
SSD: Single shot multibox detector (2016), W. Liu et al.
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size (2016), F. Iandola et al.
EIE: Efficient inference engine on compressed deep neural network (2016), S. Han et al.
Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or -1 (2016), M. Courbariaux et al.
Dynamic memory networks for visual and textual question answering (2016), C. Xiong et al.
Stacked attention networks for image question answering (2016), Z. Yang et al.
Hybrid computing using a neural network with dynamic external memory (2016), A. Graves et al.
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation (2016), Y. Wu et al.

Beyond the top 100

New papers: published within the last six months

Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models, S. Ioffe
Wasserstein GAN, M. Arjovsky et al.
Understanding deep learning requires rethinking generalization, C. Zhang et al.

Old papers: published before 2012

An analysis of single-layer networks in unsupervised feature learning (2011), A. Coates et al.
Deep sparse rectifier neural networks (2011), X. Glorot et al.
Natural language processing (almost) from scratch (2011), R. Collobert et al.
Recurrent neural network based language model (2010), T. Mikolov et al.
Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion (2010), P. Vincent et al.
Learning mid-level features for recognition (2010), Y. Boureau
A practical guide to training restricted Boltzmann machines (2010), G. Hinton
Understanding the difficulty of training deep feedforward neural networks (2010), X. Glorot and Y. Bengio
Why does unsupervised pre-training help deep learning (2010), D. Erhan et al.
Learning deep architectures for AI (2009), Y. Bengio
Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations (2009), H. Lee et al.
Greedy layer-wise training of deep networks (2007), Y. Bengio et al.
Reducing the dimensionality of data with neural networks (2006), G. Hinton and R. Salakhutdinov
A fast learning algorithm for deep belief nets (2006), G. Hinton et al.
Gradient-based learning applied to document recognition (1998), Y. LeCun et al.
Long short-term memory (1997), S. Hochreiter and J. Schmidhuber

HW / SW / Datasets: technical reports

OpenAI Gym (2016), G. Brockman et al.
TensorFlow: Large-scale machine learning on heterogeneous distributed systems (2016), M. Abadi et al.
Theano: A Python framework for fast computation of mathematical expressions (2016), R. Al-Rfou et al.
MatConvNet: Convolutional neural networks for MATLAB (2015), A. Vedaldi and K. Lenc
ImageNet large scale visual recognition challenge (2015), O. Russakovsky et al.
Caffe: Convolutional architecture for fast feature embedding (2014), Y. Jia et al.

Books / Surveys / Reviews

Deep learning (Book, 2016), Goodfellow et al.
LSTM: A search space odyssey (2016), K. Greff et al.
Deep learning (2015), Y. LeCun, Y. Bengio and G. Hinton
Deep learning in neural networks: An overview (2015), J. Schmidhuber
Representation learning: A review and new perspectives (2013), Y. Bengio et al.

Appendix: other notable papers not included above

Dermatologist-level classification of skin cancer with deep neural networks (2017), A. Esteva et al.
Weakly supervised object localization with multi-fold multiple instance learning (2017), R. Gokberk et al.
Brain tumor segmentation with deep neural networks (2017), M. Havaei et al.
Professor Forcing: A New Algorithm for Training Recurrent Networks (2016), A. Lamb et al.
Adversarially learned inference (2016), V. Dumoulin et al.
Understanding convolutional neural networks (2016), J. Koushik
Taking the human out of the loop: A review of Bayesian optimization (2016), B. Shahriari et al.
Adaptive computation time for recurrent neural networks (2016), A. Graves
Densely connected convolutional networks (2016), G. Huang et al.
Continuous deep Q-learning with model-based acceleration (2016), S. Gu et al.
A thorough examination of the CNN/Daily Mail reading comprehension task (2016), D. Chen et al.
Achieving open vocabulary neural machine translation with hybrid word-character models (2016), M. Luong and C. Manning
Very Deep Convolutional Networks for Natural Language Processing (2016), A. Conneau et al.
Bag of tricks for efficient text classification (2016), A. Joulin et al.
Efficient piecewise training of deep structured models for semantic segmentation (2016), G. Lin et al.
Learning to compose neural networks for question answering (2016), J. Andreas et al.
Perceptual losses for real-time style transfer and super-resolution (2016), J. Johnson et al.
Reading text in the wild with convolutional neural networks (2016), M. Jaderberg et al.
What makes for effective detection proposals? (2016), J. Hosang et al.
Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks (2016), S. Bell et al.
Instance-aware semantic segmentation via multi-task network cascades (2016), J. Dai et al.
Conditional image generation with PixelCNN decoders (2016), A. van den Oord et al.
Deep networks with stochastic depth (2016), G. Huang et al.

Due to WeChat's character limit, for the papers from 2015 and earlier please see: https://github.com/terryum/awesome-deep-learning-papers/blob/master/README.md
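
To make the DNN-HMM hybrid from the speech recognition abstract above concrete, here is a minimal sketch of the inference step: a feed-forward network maps a context window of acoustic frames to posterior probabilities over HMM states. The layer sizes, single hidden layer, and random weights are illustrative assumptions, not values from the paper (the actual systems use several pretrained hidden layers and thousands of context-dependent states).

import numpy as np

# Sizes below are illustrative assumptions, not values from the paper.
N_FRAMES, N_COEFFS, N_STATES = 11, 40, 2000  # context window, coeffs per frame, HMM states
HIDDEN = 512

rng = np.random.default_rng(0)
# Random weights stand in for trained (and, in the paper, pretrained) parameters.
W1 = rng.normal(0.0, 0.01, (N_FRAMES * N_COEFFS, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.01, (HIDDEN, N_STATES))
b2 = np.zeros(N_STATES)

def state_posteriors(frames):
    """Map a (N_FRAMES, N_COEFFS) window of acoustic coefficients to
    posterior probabilities over HMM states."""
    x = frames.reshape(-1)               # concatenate the context window
    h = np.maximum(0.0, x @ W1 + b1)     # one hidden layer (the papers use several)
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max())    # numerically stable softmax
    return e / e.sum()

window = rng.normal(size=(N_FRAMES, N_COEFFS))
p = state_posteriors(window)
print(p.shape, p.sum())                  # (2000,), sums to 1

In a full recognizer these posteriors would be converted to scaled likelihoods and fed to an HMM decoder in place of the GMM scores.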
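
Likewise, for the AlphaGo abstract above: the paper's search evaluates a leaf position by mixing the value network's prediction with the outcome of a fast rollout, V(s) = (1 - lambda) * v(s) + lambda * z with lambda = 0.5. The sketch below shows only that mixing step; the value network and rollout policy here are random stand-ins (assumptions), not the trained models.

import random

LAMBDA = 0.5  # mixing weight from the paper: V(s) = (1 - lambda) * v(s) + lambda * z

def value_network(state):
    # Stand-in (assumption) for the trained value network: returns v(s) in [-1, 1].
    return random.Random(state).uniform(-1.0, 1.0)

def fast_rollout(state):
    # Stand-in (assumption) for a fast rollout policy played to the end of
    # the game: returns the outcome z as +1 (win) or -1 (loss).
    return random.choice([1, -1])

def evaluate_leaf(state):
    v = value_network(state)   # position evaluation by the value network
    z = fast_rollout(state)    # outcome of one simulated game
    return (1 - LAMBDA) * v + LAMBDA * z

print(evaluate_leaf("example-position"))

In the actual program, these leaf values are backed up through the Monte Carlo search tree, and the policy network biases which moves the search explores.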