Research details

Title
Addressing the class-imbalance and class-overlap problems by a metaheuristic-based under-sampling approach
Research type: Published article
Keywords
Imbalanced classification, Imbalanced datasets, Class overlap, Class imbalance, Metaheuristic algorithms, Under-sampling.
Abstract
The problem of imbalanced class distribution in real-world datasets severely impairs the performance of classification algorithms. The learning task becomes more complicated and challenging when there is also the class-overlap problem in imbalanced data. This research tackles these problems by presenting an under-sampling approach based on a metaheuristic method in which the under-sampling problem is mapped into an optimization problem. The proposed approach aims to select an optimal subset of the majority samples to handle the imbalanced and the class-overlap problems simultaneously while avoiding the excessive elimination of majority samples, especially in overlapped regions. The quality of the generated solutions is evaluated by a classifier and optimized in an evolutionary process. Unlike most existing under-sampling methods, the majority samples are not removed only from the overlapped regions; the classifier performance determines the desired regions for eliminating the majority samples. Extensive experiments conducted on 66 synthetic and 24 real-world datasets with different imbalance ratios and overlapping degrees and two large high-dimensional datasets show a significant performance improvement from the proposed method compared to the competitors.
Researchers: Paria Soltanzadeh (first author), Mohammad-Reza Feizi-Derakhshi (second author), Mahdi Hashemzadeh (third author)
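
The abstract describes under-sampling as an optimization problem: a candidate solution selects a subset of the majority samples, a classifier scores it, and an evolutionary process improves the population. The sketch below illustrates that general idea only and is not the authors' algorithm. Everything in it is an assumption made for illustration: the plain genetic algorithm, the nearest-centroid classifier used as a stand-in, the G-mean fitness, the synthetic Gaussian data, and all names (`evolve_mask`, `g_mean`, `p_mut`, and so on).

```python
import math
import random

def centroid(points):
    """Mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def g_mean(mask, majority, minority):
    """Fitness of one candidate mask (assumed metric, not from the paper).

    A nearest-centroid classifier is fitted on the kept majority samples
    plus all minority samples, then scored on ALL original samples, so
    discarding too many majority samples can lower the score.
    """
    kept = [p for p, keep in zip(majority, mask) if keep]
    if not kept:
        return 0.0
    c_maj, c_min = centroid(kept), centroid(minority)

    def is_minority(p):
        return math.dist(p, c_min) < math.dist(p, c_maj)

    tpr = sum(is_minority(p) for p in minority) / len(minority)
    tnr = sum(not is_minority(p) for p in majority) / len(majority)
    return math.sqrt(tpr * tnr)

def evolve_mask(majority, minority, pop_size=20, generations=30,
                p_mut=0.05, seed=0):
    """Search for a boolean keep/drop mask over the majority samples."""
    rng = random.Random(seed)
    n = len(majority)

    def fitness(mask):
        return g_mean(mask, majority, minority)

    # Start from "keep everything" plus random masks; with the elitist
    # selection below, the result is never worse than no under-sampling.
    population = [[True] * n]
    population += [[rng.random() < 0.5 for _ in range(n)]
                   for _ in range(pop_size - 1)]

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]      # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)              # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per position.
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Synthetic, partially overlapping classes (made-up data for the demo).
data_rng = random.Random(1)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(60)]
minority = [(data_rng.gauss(2, 1), data_rng.gauss(2, 1)) for _ in range(10)]

best = evolve_mask(majority, minority)
baseline = g_mean([True] * len(majority), majority, minority)
print("kept majority samples:", sum(best), "of", len(majority))
print("G-mean without / with under-sampling:",
      baseline, g_mean(best, majority, minority))
```

Because the fitness is computed from classifier performance alone, nothing in this sketch restricts removals to overlapped regions; which majority samples are dropped is decided entirely by the score, which loosely mirrors the point made in the abstract.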