Comparison of data mining techniques and tools for data classification



Data Mining is a field at the intersection of computer science and statistics that attempts to discover knowledge from databases in order to facilitate the decision-making process. Classification is a Data Mining task that learns from a collection of cases in order to accurately predict the target class of new cases. Several machine learning techniques can be used to perform classification. Free and open source Data Mining software tools that offer the capability of performing classification through different techniques are available on the Internet. This study compares four free and open source Data Mining tools: KNIME, Orange, RapidMiner and Weka. Our objective is to reveal the most accurate tool and technique for the classification task. Analysts may use our findings to reach a good classification result quickly. Our experimental results show that no single tool or technique always achieves the best result, but some achieve better results more often than others.


C3S2E’13 - International C* Conference on Computer Science & Software Engineering 2013


