Fast-Decision SVM Ensemble Text Classifier Using Cluster Computing



Text classification is a complex, ubiquitous task that involves processing huge amounts of data. Attention is increasingly focused on systems that can promptly and correctly categorize texts, such as web pages or emails, automatically.
We present a fast-decision Support Vector Machine (SVM) text classifier that distributes the task across a cluster environment and combines a set of classifiers through an ensemble strategy. Results on the Reuters-21578 data set show the potential for improvement in both processing time and classification performance.


Text mining; SVM; GRID Computing

ICNPSC'06, August 2006

