Abstract:
Classification in machine learning is a valuable tool for analyzing real-world datasets and extracting information from them. However, an important issue in classification is class imbalance, which significantly degrades the performance of classifiers.
In 2019, a novel approach to decision tree induction was introduced to address this problem: the Minority Condensation Entropy (MCE) measure, which can effectively handle imbalanced datasets. Subsequently, in 2021, a new outlier factor called the Mass-ratio-variance Outlier Factor (MOF) was presented, which ranks instances according to the density of the dataset.
This thesis proposes a random forest algorithm based on quartile-pattern bootstrapping that incorporates MOF and MCE to build a random forest capable of handling binary class-imbalanced datasets. Experimental results on both synthesized and real-world datasets indicate that the proposed algorithm outperforms existing algorithms in terms of precision, recall, F-measure, and geometric mean, demonstrating its effectiveness in handling imbalanced datasets and improving classification performance.