I know a similar question might have been asked in this or another forum, but I feel my requirement is different. I have a two-column data frame, shown below:
Verbatim                     LowestlevelTerm
Acute Bronchitis             Acute Bronchitis
Sinusitis Maxillaris Acuta   Acute Maxillary Sinusitis
Increase In Eosinophils      Eosinophil Count Increased
Bronchitis Acuta             Bronchitis Acute
Acute Sinusitis Maxillaris   Acute Sinusitis, Maxillary
Eosinophil Increase          Eosinophil Count Increased
Increase In Eosinophilia     Eosinophilia
I am trying to get the output below with my code, but I am not having any luck:
Verbatim                     LowestlevelTerm              Cluster id
Acute Bronchitis             Acute Bronchitis             1
Bronchitis Acuta             Bronchitis Acute             1
Sinusitis Maxillaris Acuta   Acute Maxillary Sinusitis    2
Acute Sinusitis Maxillaris   Acute Sinusitis, Maxillary   2
Increase In Eosinophils      Eosinophil Count Increased   3
Eosinophil Increase          Eosinophil Count Increased   3
Increase In Eosinophilia     Eosinophilia                 3
This is the code I am using to try to meet this requirement:
library(dplyr)

new_df <- df %>%
  group_by(LowestlevelTerm) %>%
  summarise(Clusterid = toString(ID))
Could you please let me know if there is a simple way to cluster these terms using other functions?
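To make the requirement concrete, here is a minimal base R sketch of one possible approach, not a definitive solution. It assumes that two LowestlevelTerm values belong to the same cluster when they contain the same words once case, punctuation, and word order are ignored. The helper `normalise_term` and the column name `Clusterid` are hypothetical names chosen for illustration.

```r
# Sample data copied from the tables in the question.
df <- data.frame(
  Verbatim = c("Acute Bronchitis", "Sinusitis Maxillaris Acuta",
               "Increase In Eosinophils", "Bronchitis Acuta",
               "Acute Sinusitis Maxillaris", "Eosinophil Increase",
               "Increase In Eosinophilia"),
  LowestlevelTerm = c("Acute Bronchitis", "Acute Maxillary Sinusitis",
                      "Eosinophil Count Increased", "Bronchitis Acute",
                      "Acute Sinusitis, Maxillary", "Eosinophil Count Increased",
                      "Eosinophilia"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: lower-case a term, replace punctuation with spaces,
# and sort its words so that word order and commas no longer matter.
normalise_term <- function(term) {
  words <- strsplit(gsub("[[:punct:]]", " ", tolower(term)), "[[:space:]]+")[[1]]
  paste(sort(words[nzchar(words)]), collapse = " ")
}

key <- vapply(df$LowestlevelTerm, normalise_term, character(1), USE.NAMES = FALSE)

# Number the clusters in order of first appearance of each normalised key.
df$Clusterid <- match(key, unique(key))

# Sort so that members of the same cluster sit together.
new_df <- df[order(df$Clusterid), ]
print(new_df, row.names = FALSE)
```

Under that assumption, the two bronchitis rows share a cluster, the two sinusitis rows share a cluster, and the two "Eosinophil Count Increased" rows share a cluster. "Eosinophilia" ends up in a cluster of its own, because exact matching on normalised words cannot tell that it is related to "Eosinophil Count Increased". Merging it into cluster 3 would need approximate matching (for example string distances from `adist()` fed into `hclust()` and `cutree()`, with a cut-off that has to be tuned) or a lookup table of synonyms.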