Customer churn is a major problem for large companies, especially in banking and telecommunications. Telecommunication companies increasingly try to prevent churn because acquiring a new customer costs more than retaining an existing one. Companies therefore want to identify potential churners using prediction methods such as machine learning algorithms. XGBoost, Adaptive Boosting, and Gradient Boosting are widely used supervised machine learning methods. Although boosting algorithms are often reported to outperform other machine learning methods, their performance can degrade substantially when the data set is highly imbalanced. In this study, a data set in which 26.4% of customers had churned was used to evaluate boosting algorithms. The features are variables that may be related to a customer's churn decision, such as gender, online security, internet service, and online backup. First, exploratory data analysis was applied to understand the distribution of customers with respect to these features. Then, an adaptive oversampling method was used to address the class imbalance. Finally, accuracy, precision, and F1 score were calculated to evaluate the predictions of the compared algorithms, and 10-fold cross-validation was applied to validate the accuracy results.
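To illustrate why accuracy alone can mislead at a 26.4% churn rate, the following minimal sketch computes accuracy, precision, recall, and F1 for a trivial classifier that never predicts churn. It is not the study's code: the function name `classification_metrics`, the label encoding (1 = churned, 0 = retained), and the synthetic 1000-customer sample are all assumptions made for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary problem.

    Hypothetical helper for illustration; `positive` marks the churn class.
    Undefined ratios (zero denominator) are reported as 0.0.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Synthetic labels (an assumption, not the study's data):
# 1000 customers, of whom 264 churned (26.4%).
y_true = [1] * 264 + [0] * 736

# A trivial baseline that predicts "retained" for every customer.
always_retained = [0] * len(y_true)

baseline = classification_metrics(y_true, always_retained)
print(baseline)
```

Under these assumptions the baseline reaches an accuracy of 0.736 while its precision, recall, and F1 are all zero, since it never identifies a single churner. This is why precision and F1 are reported alongside accuracy when classes are imbalanced.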