Research Specifications

Title
Addressing the class-imbalance and class-overlap problems by a metaheuristic-based under-sampling approach
Type of Research
Article
Keywords
Imbalanced classification, Imbalanced datasets, Class overlap, Class imbalance, Metaheuristic algorithms, Under-sampling.
Abstract
Imbalanced class distributions in real-world datasets severely impair the performance of classification algorithms, and the learning task becomes harder still when the imbalanced data also exhibit class overlap. This research tackles both problems with a metaheuristic-based under-sampling approach in which under-sampling is cast as an optimization problem. The approach aims to select an optimal subset of the majority samples that handles class imbalance and class overlap simultaneously while avoiding excessive elimination of majority samples, especially in overlapped regions. The quality of each candidate solution is evaluated by a classifier and optimized through an evolutionary process. Unlike most existing under-sampling methods, majority samples are not removed only from the overlapped regions; instead, classifier performance determines the regions from which majority samples are eliminated. Extensive experiments on 66 synthetic and 24 real-world datasets with different imbalance ratios and degrees of overlap, as well as on two large high-dimensional datasets, show a significant performance improvement of the proposed method over its competitors.
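The idea of treating under-sampling as an optimization problem can be illustrated with a minimal sketch. The abstract does not name the metaheuristic, the classifier, or the fitness measure, so everything concrete here is an assumption chosen for illustration: a simple genetic algorithm stands in for the metaheuristic, a nearest-centroid rule stands in for the classifier, and the geometric mean of per-class recall (G-mean) stands in for the quality measure. The function names `centroid`, `g_mean`, and `undersample` are hypothetical and do not come from the paper. Each candidate solution is a Boolean mask over the majority samples (True means "keep").

```python
import math
import random


def centroid(points):
    """Mean vector of a non-empty list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def g_mean(mask, majority, minority):
    """Assumed fitness: G-mean of a nearest-centroid classifier fitted on the
    minority class plus the majority samples kept by `mask`, scored on all data."""
    kept = [x for x, keep in zip(majority, mask) if keep]
    if not kept:  # a mask that discards every majority sample is useless
        return 0.0
    c_maj, c_min = centroid(kept), centroid(minority)

    def is_minority(x):
        return math.dist(x, c_min) < math.dist(x, c_maj)

    recall_min = sum(is_minority(x) for x in minority) / len(minority)
    recall_maj = sum(not is_minority(x) for x in majority) / len(majority)
    return math.sqrt(recall_min * recall_maj)


def undersample(majority, minority, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Illustrative genetic search for a majority-sample mask (not the paper's algorithm)."""
    rng = random.Random(seed)
    n = len(majority)

    def fitness(mask):
        return g_mean(mask, majority, minority)

    # Start from the "keep everything" mask plus random masks.
    population = [[True] * n]
    population += [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size - 1)]
    best = max(population, key=fitness)
    for _ in range(generations):
        next_pop = [best]  # elitism: the best mask always survives
        while len(next_pop) < pop_size:
            # Tournament selection of two parents, one-point crossover, bit-flip mutation.
            a = max(rng.sample(population, 3), key=fitness)
            b = max(rng.sample(population, 3), key=fitness)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            next_pop.append(child)
        population = next_pop
        best = max(population, key=fitness)
    return best


# Toy data (synthetic, for illustration only): two overlapping 2-D Gaussian classes.
data_rng = random.Random(1)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(60)]
minority = [(data_rng.gauss(1.5, 1), data_rng.gauss(1.5, 1)) for _ in range(10)]
mask = undersample(majority, minority)
print(sum(mask), "of", len(majority), "majority samples kept")
```

Because the initial population contains the mask that keeps every majority sample and the best mask is carried over unchanged each generation, the returned mask never scores worse under the assumed fitness than doing no under-sampling at all. Which samples are dropped is decided only by the classifier-based score, which loosely mirrors the abstract's point that the regions for removal are not fixed in advance.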
Researchers
(First Researcher), M. Reza Feizi-Derakhshi (Second Researcher), Mahdi Hashemzadeh (Third Researcher)