This should really work, but it doesn't, and I'm losing my mind!
This is my data:
> head(dataset_2,n=5)
  CUSTOMER_NUMBER OLD_NEW_CLIENT COMPLETION_PRCT CRASH_RISK
1       535961675     Old client            0.06         25
2       223186690     Old client            0.04         24
3       217140964     Old client            0.05         32
4       514559839     Old client            0.10         52
5        10991413     Old client            0.53         15
> str(dataset_2)
'data.frame': 90405 obs. of 4 variables:
$ CUSTOMER_NUMBER: int 535961675 223186690 217140964 514559839 10991413 506839750 15102896 34980927 578647941 804552857 ...
$ OLD_NEW_CLIENT : chr "Old client" "Old client" "Old client" "Old client" ...
$ COMPLETION_PRCT: num 0.06 0.04 0.05 0.1 0.53 0.05 0.06 0.06 1 0.09 ...
$ CRASH_RISK : num 25 24 32 52 15 38 42 42 41 78 ...
- attr(*, ".internal.selfref")=<externalptr>
I want to count clients by every combination of the other columns: one row per combination of OLD_NEW_CLIENT, COMPLETION_PRCT and CRASH_RISK, with the number of distinct clients that fall into that bucket. But when I run this code:
by_parameters <- dataset_2 %>%
  group_by(OLD_NEW_CLIENT, COMPLETION_PRCT, CRASH_RISK) %>%
  summarize(clients = n_distinct(CUSTOMER_NUMBER))
I get a single row instead of one row per combination:
> by_parameters
  clients
1   90399
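For comparison, here is a minimal sketch of the result shape I expect, on a small made-up data frame (`toy` and its values are hypothetical, not taken from dataset_2). Writing the `dplyr::` prefix on every call is only an assumption on my part, to rule out another package's `summarize` being picked up instead of dplyr's; I have not confirmed that this is what happens in my session.

```r
library(dplyr)

# Toy data: hypothetical values, NOT from dataset_2
toy <- data.frame(
  CUSTOMER_NUMBER = c(1L, 2L, 3L, 4L, 4L),
  OLD_NEW_CLIENT  = c("Old client", "Old client", "Old client",
                      "New client", "New client"),
  COMPLETION_PRCT = c(0.06, 0.06, 0.10, 0.53, 0.53),
  CRASH_RISK      = c(25, 25, 52, 15, 15),
  stringsAsFactors = FALSE
)

# Same pipeline as above, with each verb namespaced explicitly
by_parameters_toy <- toy %>%
  dplyr::group_by(OLD_NEW_CLIENT, COMPLETION_PRCT, CRASH_RISK) %>%
  dplyr::summarise(clients = dplyr::n_distinct(CUSTOMER_NUMBER)) %>%
  dplyr::ungroup()

# One row per combination of the three grouping columns
print(by_parameters_toy)
```

On this toy data I would expect three rows (one per distinct combination), which is the shape I am after for dataset_2 as well.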
Thanks for any help!