Towards Adaptive Learning with Improved Convergence of Deep Belief Networks on Graphics Processing Units



In this paper we focus on two complementary approaches to significantly decrease the pre-training time of a Deep Belief Network (DBN). First, we propose an adaptive step size technique to enhance the convergence of the Contrastive Divergence (CD) algorithm, thereby reducing the number of epochs needed to train the Restricted Boltzmann Machines (RBMs) that support the DBN infrastructure. Second, we present a highly scalable Graphics Processing Unit (GPU) parallel implementation of the CD-k algorithm, which notably boosts the training speed. Additionally, extensive experiments are conducted on the MNIST and HHreco databases. The results suggest that the maximum useful depth of a DBN is related to the number and quality of the training samples. Moreover, it was found that the lower-level layer plays a fundamental role in building successful DBN models. Furthermore, the results contradict the preconceived idea that all the layers should be pre-trained. Finally, it is shown that by incorporating Multiple Back-Propagation (MBP) layers, the DBN's generalization capability is remarkably improved.
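To make the first idea concrete, the following is a minimal CPU-only sketch of CD-k training for a small binary RBM with a separate step size per weight. The abstract does not state the paper's update rule, so the sketch substitutes a generic sign-agreement heuristic: a step size grows when the gradient estimate keeps its sign between epochs and shrinks when the sign flips. All function names, the growth and shrink factors, and the clamping bounds are illustrative assumptions, and the biases are held at zero for brevity. It is not the GPU implementation described above.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_hidden(v, W, c, rng):
    # P(h_j = 1 | v) and one binary sample per hidden unit.
    probs = [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(c))]
    return probs, [1.0 if rng.random() < p else 0.0 for p in probs]


def sample_visible(h, W, b, rng):
    # P(v_i = 1 | h) and one binary sample per visible unit.
    probs = [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b))]
    return probs, [1.0 if rng.random() < p else 0.0 for p in probs]


def cd_k_gradient(v0, W, b, c, k, rng):
    # CD-k estimate for one sample: <v h>_data - <v h>_after_k_Gibbs_steps.
    h0_probs, h = sample_hidden(v0, W, c, rng)
    v = v0
    for _ in range(k):
        _, v = sample_visible(h, W, b, rng)
        hk_probs, h = sample_hidden(v, W, c, rng)
    return [[v0[i] * h0_probs[j] - v[i] * hk_probs[j] for j in range(len(c))]
            for i in range(len(b))]


def adapt_step_sizes(eta, grad, prev_grad, up=1.2, down=0.8, lo=1e-4, hi=1.0):
    # Assumed heuristic (not taken from the paper): grow a step size when the
    # gradient sign agrees with the previous epoch, shrink it when it flips.
    for i, row in enumerate(grad):
        for j, g in enumerate(row):
            agreement = g * prev_grad[i][j]
            if agreement > 0:
                eta[i][j] = min(hi, eta[i][j] * up)
            elif agreement < 0:
                eta[i][j] = max(lo, eta[i][j] * down)


def train_rbm(data, n_hidden, epochs=20, k=1, eta0=0.1, seed=0):
    rng = random.Random(seed)
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
    b = [0.0] * n_visible  # visible biases, kept fixed in this sketch
    c = [0.0] * n_hidden   # hidden biases, kept fixed in this sketch
    eta = [[eta0] * n_hidden for _ in range(n_visible)]
    prev = [[0.0] * n_hidden for _ in range(n_visible)]
    for _ in range(epochs):
        grad = [[0.0] * n_hidden for _ in range(n_visible)]
        for v0 in data:
            g = cd_k_gradient(v0, W, b, c, k, rng)
            for i in range(n_visible):
                for j in range(n_hidden):
                    grad[i][j] += g[i][j] / len(data)
        adapt_step_sizes(eta, grad, prev)
        for i in range(n_visible):
            for j in range(n_hidden):
                W[i][j] += eta[i][j] * grad[i][j]
        prev = grad
    return W, eta


if __name__ == "__main__":
    toy = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]] * 5
    weights, steps = train_rbm(toy, n_hidden=2, epochs=10, k=1)
    print(steps)
```

Keeping one step size per weight lets connections whose gradient direction is stable move faster, while oscillating ones are damped. This is the general motivation for adaptive schemes of this kind.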


Deep Learning, Deep Belief Networks, Restricted Boltzmann Machines, Contrastive Divergence, Adaptive Step Size, GPU Computing


Machine Learning, GPU computing


Pattern Recognition, Elsevier, Vol. 47, #1, pp. 114-127, December 2014


Cited by

Year 2016 : 1 citation

 Luo, J., & Gao, H. (2016). Deep Belief Networks for Fingerprinting Indoor Localization Using Ultrawideband Technology. International Journal of Distributed Sensor Networks, 2016.

Year 2015 : 7 citations

 Xu, Q., Jiang, S., Huang, W., Duan, L., & Xu, S. (2015). Multi-feature fusion based spatial pyramid deep neural networks image classification. Computer Modelling & New Technologies, 17, 207-212.

 Li, T., Dou, Y., Jiang, J., Wang, Y., & Lv, Q. (2015, July). Optimized deep belief networks on CUDA GPUs. In Neural Networks (IJCNN), 2015 International Joint Conference on (pp. 1-8). IEEE.

 Li, Z. Z., Zhong, Z. Y., & Jin, L. W. (2015). Identifying Best Hyperparameters for Deep Architectures Using Random Forests. In Learning and Intelligent Optimization (pp. 29-42). Springer International Publishing.

 Lv, Q., Dou, Y., Niu, X., Xu, J., Xu, J., & Xia, F. (2015). Urban Land Use and Land Cover Classification Using Remotely Sensed SAR Data through Deep Belief Networks. Journal of Sensors, 2015.

 Qiu, J., Liang, W., Zhang, L., Yu, X., & Zhang, M. (2015). The early-warning model of equipment chain in gas pipeline based on DNN-HMM. Journal of Natural Gas Science and Engineering, 27, 1710-1722.

 Wlodarczak, P., Soar, J., & Ally, M. (2015, October). Multimedia data mining using deep learning. In Digital Information Processing and Communications (ICDIPC), 2015 Fifth International Conference on (pp. 190-196). IEEE.

 Gao Qiang, Yang Wu, & Li Qian. (2015). DBN image classification based on the spatial information quickly train the model. Journal of System Simulation, (3), 549-558.

Year 2014 : 2 citations

 Fang, H., & Hu, C. (2014, July). Recognizing human activity in smart home using deep learning algorithm. In Control Conference (CCC), 2014 33rd Chinese (pp. 4716-4720). IEEE.

 Parada, P. Peso, et al. (2014). A quantitative comparison of blind C50 estimators. In Acoustic Signal Enhancement (IWAENC), 2014 14th International Workshop on. IEEE.