Machine Learning Big Data Framework and Analytics for Big Data Problems



Big data computing generally deals with massive, high-dimensional data such as DNA microarray data, financial data, medical imagery, satellite imagery and hyperspectral imagery. It therefore needs advanced technologies or methods that reduce computational time while extracting valuable information without information loss. In this context, Machine Learning (ML) algorithms are commonly used to learn and find useful information in large volumes of data. However, ML algorithms such as Neural Networks are computationally expensive, and the central processing unit (CPU) typically cannot cope with these requirements. High-performance hardware such as the Graphics Processing Unit (GPU) is therefore needed to produce solutions faster. GPUs provide remarkable performance gains over CPUs, and they are relatively inexpensive, widely available and scalable. Since 2006, NVIDIA has simplified the GPU programming model with the Compute Unified Device Architecture (CUDA), which supports accessible programming interfaces and industry-standard languages such as C and C++. Since then, ML algorithms running on General Purpose Graphics Processing Units (GPGPU) have been applied in various applications, including signal and image pattern classification in the biomedical area. The importance of fast analysis for detecting cancer or non-cancer is the motivation of this study. Accordingly, we propose a machine learning framework and analytics based on the Self Organizing Map (SOM) and Multiple Back Propagation (MBP) for big biomedical data classification problems. Big data such as gene expression datasets are processed on a high-performance computer and on Fermi-architecture graphics hardware. In the experiments, MBP and SOM on a Tesla GPU achieved faster computing times than the high-performance computer, with feasible results in terms of speed performance.
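To make the SOM component concrete, the following is a minimal CPU-only sketch of online SOM training in plain Python. It is not the implementation used in this study: the function names, the Gaussian neighbourhood, the linear decay schedule and all parameter values are assumptions chosen for illustration. The loops over units (distance computation and weight update) are independent per unit, which is the kind of work a GPU implementation would distribute across threads.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    best, best_d = 0, float("inf")
    for i, w in enumerate(weights):
        # Squared Euclidean distance; independent for every unit.
        d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        if d < best_d:
            best, best_d = i, d
    return best


def som_step(weights, grid, x, lr, sigma):
    """One online update: pull each unit toward x, scaled by its
    grid distance to the best matching unit (Gaussian neighbourhood)."""
    bmu = best_matching_unit(weights, x)
    br, bc = grid[bmu]
    for i, w in enumerate(weights):
        r, c = grid[i]
        h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
        for k in range(len(w)):
            w[k] += lr * h * (x[k] - w[k])
    return bmu


def train_som(data, rows, cols, epochs=10, lr0=0.5, sigma0=1.0, seed=0):
    """Train a rows x cols map; lr and sigma decay linearly (assumed schedule)."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in grid]
    for e in range(epochs):
        decay = 1.0 - e / epochs
        lr = lr0 * decay
        sigma = max(sigma0 * decay, 0.1)  # keep the neighbourhood width positive
        for x in data:
            som_step(weights, grid, x, lr, sigma)
    return weights, grid
```

As a usage example, `train_som(samples, 4, 4)` would fit a 4 x 4 map to a list of equal-length feature vectors (for instance, gene expression profiles), after which `best_matching_unit(weights, x)` assigns a sample `x` to a map unit.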


Big Data




Int. J. Advance Soft Compu. Appl (IJASCA), Vol. 6, No. 2, July 2014

Cited by

Year 2015 : 2 citations

 Wienhofen, L., Mathisen, B. M., & Roman, D. (2015). Empirical Big Data Research: A Systematic Literature Mapping. arXiv preprint arXiv:1509.03045.

 Ali, A., Shamsuddin, S. M., & Ralescu, A. L. (2015). Classification with class imbalance problem: A Review. Int. J. Advance Soft Compu. Appl, 7(3).