Abstract:
Classification of large data sets and of streaming data chunks is an interesting and challenging problem in many real-world applications such as finance, medical diagnosis, pattern recognition, and data mining. In most cases, the complete data set for building a classifier is not available in advance. This work proposes Data-throwaway Learning for Streaming Chunk data classification (DLSC), which applies a Versatile Elliptic Basis Function (VEBF) to single-class-wise computation. The proposed learning method is based on the concepts of incremental learning and one-pass-thrown-away learning. The experiments are conducted in two scenarios, defined by how the training data are given: complete training data and streaming training data. The proposed method is compared with both batch learning and incremental learning algorithms on various data sets, ranging in size from 150 to 581,012 samples and from 4 to 1,558 attributes. The experimental results show that DLSC yields the highest classification accuracy in most cases, with faster incremental learning, fewer hidden neurons, and a more flexible structure than the compared methods. The proposed method is therefore suitable for big data classification problems and for handling streaming data.