Restricted Boltzmann Machines and Deep Belief Networks on Multi-Core Processors



In contrast with shallow models, deep learning architectures draw on biological inspiration, a challenge pursued since the inception of the idea of simulating the brain. In particular, their many hierarchical levels of composition invite parallel implementations that make training acceptably fast. When it comes to performance, Graphics Processing Units (GPUs) have carved out their own place in machine learning. In this paper, we present an approach that relies mainly on three kernels for implementing both the Restricted Boltzmann Machine (RBM) and Deep Belief Network (DBN) algorithms. Instead of considering the neuron as the smallest unit of computation, each thread represents the connection between two neurons (one visible and one hidden). Although this may seem counterintuitive at first, the rationale is to view a connection as performing a simple function that multiplies the clamped input by its weight. Thus, we maximize the GPU workload and avoid idle cores. Moreover, we placed great emphasis on designing the kernels to avoid uncoalesced memory accesses and to take advantage of shared memory, reducing global memory traffic. Additionally, our approach uses a step adaptive learning rate procedure which accelerates convergence. The approach yields very good speedups (up to 46x) over a straightforward implementation when both GPU and CPU implementations are tested on the MNIST database.
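The paper's three kernels are not reproduced here, but the thread-per-connection idea from the abstract can be sketched as a single CUDA kernel. In this illustrative sketch (all names, the memory layout, and the launch configuration are assumptions, not the authors' code), each block computes the pre-activation of one hidden unit, with one thread per visible-hidden connection forming the product v_i * W_ij, followed by a shared-memory tree reduction:

```cuda
#include <math.h>

// Sketch only: one thread per connection between a visible unit i and
// the hidden unit j handled by this block. Products are accumulated in
// shared memory to cut down on global memory accesses.
// Assumes: W stored row-major as [n_visible][n_hidden], n_hidden == gridDim.x,
// blockDim.x is a power of two and >= n_visible.
__global__ void hidden_preact(const float *v, const float *W,
                              const float *b, float *h, int n_visible) {
    extern __shared__ float prod[];
    int j = blockIdx.x;   // hidden unit computed by this block
    int i = threadIdx.x;  // visible unit index for this connection

    // Each thread performs the "connection" function: input times weight.
    prod[i] = (i < n_visible) ? v[i] * W[i * gridDim.x + j] : 0.0f;
    __syncthreads();

    // Tree reduction in shared memory to sum the products.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (i < s) prod[i] += prod[i + s];
        __syncthreads();
    }

    // Thread 0 applies the bias and the logistic activation.
    if (i == 0) h[j] = 1.0f / (1.0f + expf(-(prod[0] + b[j])));
}

// Hypothetical launch for n_hidden blocks of 256 threads:
// hidden_preact<<<n_hidden, 256, 256 * sizeof(float)>>>(v, W, b, h, n_visible);
```

With this mapping every core has work proportional to the number of connections rather than the number of neurons, which is the workload-maximizing rationale the abstract describes.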


Deep Learning, GPU Computing, Restricted Boltzmann Machines, Deep Belief Networks



IEEE World Congress on Computational Intelligence (WCCI 2012), IEEE, Brisbane, Australia (DOI: 10.1007/978-3-642-32639-4_90), June 2012


Cited by

Year 2016 : 2 citations

 Patrawut Ruangkanokmas, Tiranee Achalakul and Khajonpong Akkarajitsakul. "Deep Belief Networks with Feature Selection for Sentiment Classification". 7th International Conference on Intelligent Systems, Modelling and Simulation (2016).

 Brito, R., Fong, S., Cho, K., Song, W., Wong, R., Mohammed, S., & Fiaidhi, J. (2016). GPU-enabled back-propagation artificial neural network for digit recognition in parallel. The Journal of Supercomputing, 1-19.

Year 2015 : 2 citations

 Li, T., Dou, Y., Jiang, J., Wang, Y., & Lv, Q. (2015, July). Optimized deep belief networks on CUDA GPUs. In Neural Networks (IJCNN), 2015 International Joint Conference on (pp. 1-8). IEEE.

 Satoshi Masaki and Kosin Sato. "Faster DBN learning through training-data parallelism with MPI". Information Processing Society of Japan, 77th Annual National Convention 3 (2015): 08.

Year 2014 : 2 citations

 Ahn, Byungik. "Computation of deep belief networks using special-purpose hardware architecture." Neural Networks (IJCNN), 2014 International Joint Conference on. IEEE, 2014.

 Thompson, Elizabeth A., and Timothy R. Anderson. "A CUDA implementation of the Continuous Space Language Model." The Journal of Supercomputing 68.1 (2014): 65-86.

Year 2013 : 3 citations

 Zhu, Yun; Zhang, Yanqing; Pan, Yi, "Large-scale restricted Boltzmann machines on single GPU", IEEE International Conference on Big Data, pp. 169-174, 2013

 Popović, B., Ostrogonac, S., Delić, V., Janev, M., & Stanković, I. (2013). Deep architectures for automatic emotion recognition based on lip shape. In 12th International Scientific Professional Symposium INFOTEH-JAHORINA, Jahorina, Bosnia and Herzegovina (pp. 939-943).

 Xueshao Fei, Song Yan and Dai Lirong. "Fast training method based on multi-GPU deep neural networks". Journal of Tsinghua University: Natural Science Edition 6 (2013): 745-748.