Here is my code:
library(text2vec)
ny <- read.csv2("nyt.csv", sep = "\t", header = TRUE)
ny_texte <- as.vector(ny)
iterator <- itoken(ny_texte,
                   preprocessor = tolower,
                   tokenizer = word_tokenizer,
                   progressbar = FALSE)
vocabulary <- create_vocabulary(iterator)
My .csv file contains articles from The New York Times. I would like multi-word expressions such as "new york", "south africa", and "ellis island" to appear in the vocabulary as single terms, instead of being split into separate tokens like "new", "york", etc.
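To make the goal concrete, here is a rough base-R sketch of the output I am after. The sample sentences, the phrase list, and the helper `merge_phrases` are all made up for illustration and are not part of text2vec; ideally I would not have to maintain such a list by hand.

```r
# Hypothetical sample texts and phrase list (not my real data)
texts <- c("Ellis Island is in New York.",
           "She moved from South Africa to New York.")
phrases <- c("new york", "south africa", "ellis island")

# Lowercase the text, then join each known phrase with an underscore.
# Note: this is plain substring matching, so it ignores word boundaries.
merge_phrases <- function(x, phrases) {
  x <- tolower(x)
  for (p in phrases) {
    x <- gsub(p, gsub(" ", "_", p, fixed = TRUE), x, fixed = TRUE)
  }
  x
}

merged <- merge_phrases(texts, phrases)

# Split on anything that is not a lowercase letter, digit, or underscore,
# so "new_york" survives as a single token
tokens <- strsplit(merged, "[^a-z0-9_]+")
```

With this, the first document would yield the tokens "ellis_island", "is", "in", "new_york", which is the kind of term I would like to see in the vocabulary.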
How can I do this?
Thank you.