Abstract
|
Customer churn is a major concern for firms across industries. The aim of customer churn prediction is to detect
customers with a high tendency to leave a company. Although many modeling techniques have been used in the field of churn
prediction, the performance of ensemble methods has not yet been thoroughly investigated. Therefore, in this paper, we perform a
comparative assessment of the performance of four popular ensemble methods, i.e., Bagging, Boosting, Stacking, and Voting,
based on four well-known base learners, i.e., C4.5 Decision Tree (DT), Artificial Neural Network (ANN), Support Vector Machine
(SVM), and Repeated Incremental Pruning to Produce Error Reduction (RIPPER). Furthermore, we investigate the
effectiveness of two different sampling techniques, i.e., oversampling as a representative of basic sampling techniques and
the Synthetic Minority Over-sampling Technique (SMOTE) as a representative of advanced sampling techniques. Experimental
results show that SMOTE does not increase predictive performance. In addition, the results show that the application of
ensemble learning brings a significant improvement over individual base learners in terms of three performance indicators,
i.e., AUC, sensitivity, and specificity. In particular, in our experiments, Boosting gave the best results among all
methods. Among the ensemble models evaluated, Boosting with RIPPER and Boosting with C4.5 are the two best performers. These results
indicate that ensemble methods can be a strong candidate for churn prediction tasks.
|
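
To make the contrast between the two sampling techniques concrete, the following is a minimal Python sketch under stated assumptions. It is not the implementation used in the paper, which the abstract does not describe. The function names (`random_oversample`, `smote_like`), the toy `churners` data, and the choice of `k` are all hypothetical. Basic oversampling duplicates existing minority-class rows, whereas a SMOTE-style procedure creates new rows by interpolating between a minority row and one of its nearest minority neighbours.

```python
import random


def random_oversample(minority, n_new, rng):
    """Basic oversampling: duplicate existing minority rows chosen at random."""
    return [list(rng.choice(minority)) for _ in range(n_new)]


def smote_like(minority, n_new, k, rng):
    """SMOTE-style oversampling: interpolate between a minority row and one of
    its k nearest minority neighbours (squared Euclidean distance)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by distance to the chosen base row.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


rng = random.Random(0)
# Hypothetical toy data: four churners described by two numeric features.
churners = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]

duplicates = random_oversample(churners, 6, rng)  # exact copies of real rows
synthetic = smote_like(churners, 6, 2, rng)       # new points between real rows
```

In this sketch the duplicated rows add no new information to the minority class, while the interpolated rows always lie on a segment between two real churners. Whether either strategy helps a given classifier is an empirical question of the kind the experiments address.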