I am currently working on a project whose objective is to find the customers who are most likely to purchase the product. It is a binary classification model (0 and 1). I have built models with both RF and XGB and calculated the gain score (the data is imbalanced). Now more than 80% of customers are covered in the top 3 deciles on the training data, but when I run the model on the validation dataset, this falls to 56-59% for both models.
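To make the metric concrete, here is a minimal sketch of how the share of buyers captured in the top 3 deciles could be computed. The function name `top_decile_capture` and the toy data are illustrative assumptions, not the actual project code; running the same function on training scores and on validation scores would give the two figures being compared above.

```python
def top_decile_capture(y_true, scores, n_deciles=3, total_deciles=10):
    """Fraction of all positives (label 1) that fall in the top `n_deciles`
    of customers when ranked by predicted score, highest first."""
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    total_positives = sum(y_true)
    if total_positives == 0:
        raise ValueError("y_true contains no positive examples")

    # Rank customer indices by score, highest score first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Number of customers that make up the top deciles (integer arithmetic).
    cutoff = (len(order) * n_deciles) // total_deciles

    captured = sum(y_true[i] for i in order[:cutoff])
    return captured / total_positives


# Toy data (hypothetical): 10 customers, 3 of them buyers.
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
rate = top_decile_capture(labels, scores)  # 2 of the 3 buyers are in the top 3
```

A large gap between the training and validation values of this number, as described above, usually points to overfitting rather than to a problem with the metric itself.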
Say I have 20 customers and, for better accuracy, I have clustered them. Now the model gives perfect results on cluster 1 customers but performs poorly on cluster 2 customers.
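One way to quantify the per-cluster difference is to rank all customers by score once and then check, for each cluster, how many of its buyers land in the overall top 30%. The sketch below assumes this is the comparison of interest; `capture_by_cluster`, the cluster labels, and the toy data are hypothetical.

```python
from collections import defaultdict


def capture_by_cluster(y_true, scores, clusters, top_percent=30):
    """For each cluster, report its size, its number of positives, and how
    many of those positives fall in the overall top `top_percent` by score."""
    n = len(scores)
    if not (len(y_true) == n == len(clusters)):
        raise ValueError("all inputs must have the same length")

    # Global ranking, highest score first; keep the indices in the top slice.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    top = set(order[: (n * top_percent) // 100])

    stats = defaultdict(lambda: {"size": 0, "positives": 0, "captured": 0})
    for i in range(n):
        s = stats[clusters[i]]
        s["size"] += 1
        if y_true[i] == 1:
            s["positives"] += 1
            if i in top:
                s["captured"] += 1

    report = {}
    for cluster, s in stats.items():
        # Capture rate is undefined for a cluster with no positives.
        rate = s["captured"] / s["positives"] if s["positives"] else None
        report[cluster] = {**s, "capture_rate": rate}
    return report


# Toy data (hypothetical): two clusters of 5 customers each.
labels = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]
scores = [0.90, 0.80, 0.70, 0.20, 0.10, 0.30, 0.25, 0.60, 0.50, 0.40]
clusters = ["c1"] * 5 + ["c2"] * 5
report = capture_by_cluster(labels, scores, clusters)
```

In this toy example every buyer in `c1` is ranked in the top 30% while none of the buyers in `c2` are, which mirrors the pattern described above. Comparing cluster sizes and positive counts in the same report also shows whether the weaker cluster simply has too few buyers to learn from.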
Any suggestions on how to tune this?