I am using this dataset: https://www.kaggle.com/rounakbanik/the-movies-dataset
I tried the following code in R:
library(jsonlite)  # toJSON(), fromJSON(), validate()
library(magrittr)  # %>%
toJSON(movies) %>% validate()
json_to_df <- function(df, column) {
  # Keep only rows whose JSON string is longer than "[]", i.e. rows that have an entry
  column_1 <- df[nchar(as.character(df[[column]])) > 2, ]
  # Parse each JSON string into an R object
  list_1 <- lapply(column_1[[column]], fromJSON)
  # Collapse the "name" values of each entry into one comma-separated string
  values <- data.frame(unlist(lapply(list_1, function(x) paste(x$name, collapse = ","))))
  # New data frame with id, title, and the collapsed values as columns
  final_df <- cbind(column_1$id, column_1$title, values)
  names(final_df) <- c("id", "title", column)
  return(final_df)
}
# Call json_to_df() to generate a data frame for each JSON column
library("df2json")
genres_df <- json_to_df(movies, "genres")
keywords_df <- json_to_df(keywords, "keywords")
prod_cntry_df <- json_to_df(movies, "production_countries")
prod_cmpny_df <- json_to_df(movies, "production_companies")
spoken_lang_df <- json_to_df(movies, "spoken_languages")
This is the error I got: Error in json2df(movies, "genres") : unused argument ("genres")
I'm trying to run a logistic regression, so I would like to look at how these small details affect a movie's budget or production companies.
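For reference, R reports "unused argument" when a function is called with more arguments than its definition accepts. The error names json2df, while the helper defined above is json_to_df, so the failing call appears to be reaching a different function that does not take a column name. Below is a minimal sketch of the intended behaviour, calling the helper by its own name. The movies_toy data frame is a hypothetical stand-in, and it assumes each cell holds a valid JSON array of objects with a "name" field, which may not match the raw CSV.

```r
library(jsonlite)

# Hypothetical toy data; the real columns are assumed to hold JSON arrays
# of objects that each carry a "name" field.
movies_toy <- data.frame(
  id = c(1, 2, 3),
  title = c("A", "B", "C"),
  genres = c(
    '[{"id": 16, "name": "Animation"}, {"id": 35, "name": "Comedy"}]',
    '[]',
    '[{"id": 18, "name": "Drama"}]'
  ),
  stringsAsFactors = FALSE
)

json_to_df <- function(df, column) {
  # Keep rows whose string is longer than "[]", i.e. non-empty arrays
  keep <- !is.na(df[[column]]) & nchar(df[[column]]) > 2
  rows <- df[keep, , drop = FALSE]
  # fromJSON() simplifies an array of objects into a data frame
  parsed <- lapply(rows[[column]], fromJSON)
  # Collapse the "name" values of each row into one comma-separated string
  values <- vapply(parsed, function(x) paste(x$name, collapse = ","), character(1))
  out <- data.frame(id = rows$id, title = rows$title, values,
                    stringsAsFactors = FALSE)
  names(out)[3] <- column
  out
}

# Call the helper by the same name it was defined with
genres_df <- json_to_df(movies_toy, "genres")
print(genres_df)
```

If the strings in the CSV use single quotes instead of double quotes (worth checking in the raw file), fromJSON() will reject them as invalid JSON, and they would need to be converted before parsing.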