Abstract – The telecommunication industry faces fierce competition, and retaining customers is a central concern. Churn is the phenomenon of a customer leaving a business; churn prediction, in this context, means predicting a customer's intention to leave. To retain customers, a company needs a good churn prediction model, which requires understanding why customers churned in the past and which factors matter most for identifying customers who are close to churning. This paper focuses primarily on feature importance and feature engineering for a churn prediction model and explains why both are important for prediction. It also describes the data pre-processing steps that played an important role in the model. For the classification phase, two ensemble models were used: Random Forest and Gradient Boosted Trees. The model uses the Cell2Cell dataset, with 3333 subscribers and 57 features. The study compares the developed model with earlier models. The implementation was done in Python and Apache Spark, which are well suited to data analysis with machine learning and data mining. To improve performance, grid-based hyperparameter optimization was applied. Prediction performance was evaluated in terms of accuracy and the confusion matrix, both before and after grid-based hyperparameter optimization. The model achieved 95% accuracy with Random Forest and 97% accuracy with Gradient Boosted Trees.
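The evaluation procedure mentioned in the abstract (an exhaustive grid over hyperparameters, scored by accuracy and a confusion matrix) can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the toy rows, the feature names `monthly_minutes` and `support_calls`, and the simple threshold rule that stands in for the Random Forest and Gradient Boosted Trees models. It is not the paper's Spark implementation.

```python
from itertools import product


def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 means churned."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    return (tp + tn) / len(y_true)


def grid_search(param_grid, evaluate):
    """Score every combination in param_grid; keep the first best one."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical toy data: (monthly_minutes, support_calls) per subscriber.
rows = [(120, 0), (300, 1), (80, 4), (60, 5),
        (250, 0), (90, 3), (400, 2), (50, 6)]
labels = [0, 0, 1, 1, 0, 1, 0, 1]  # 1 = churned


def predict(params, row):
    """Toy stand-in classifier: flag low-usage, high-complaint subscribers."""
    minutes, calls = row
    is_churner = minutes < params["max_minutes"] and calls >= params["min_calls"]
    return 1 if is_churner else 0


def evaluate(params):
    # Scored on the same toy rows for brevity; a real study would score
    # on held-out data or with cross-validation.
    return accuracy(labels, [predict(params, row) for row in rows])


param_grid = {"max_minutes": [70, 100, 200], "min_calls": [2, 3, 5]}
best_params, best_score = grid_search(param_grid, evaluate)
best_preds = [predict(best_params, row) for row in rows]
print(best_params, best_score, confusion_matrix(labels, best_preds))
```

In a Spark setting the same loop would typically be delegated to the library's own cross-validation and parameter-grid utilities, with the threshold rule replaced by the ensemble classifiers named in the abstract.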
Field: Engineering
Journal Type: International