I'm trying to scrape the IMDb website using CSS selectors and XPath. Some films have missing data, so the scraped vectors end up with different lengths and cannot be combined into a data frame.
I would like to fill the empty nodes with NA. My idea is to use the content variable to count the films (it contains all the other variables) and then, for each film, write the information I need (say, the metascore) if it is present, and otherwise have R write NA.
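library(rvest)

# Text of every film block on the page
content <- function(url){
  url %>% read_html() %>%
    html_nodes(".lister-item-content") %>%
    html_text()
}

# Metascore values; films without one are skipped, so the result can be shorter than content
metascore <- function(url){
  url %>% read_html() %>%
    html_nodes("span.metascore") %>%
    html_text()
}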
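
To make the idea concrete, here is a minimal sketch of what I have in mind, not a tested solution for the real page. It relies on `html_node()` (singular) returning one result per film block, with a missing node where nothing matches, so that `html_text()` gives NA for that film. The inline HTML, the `.lister-item-header a` selector, and the `scrape_films` name are assumptions made up for illustration; only `.lister-item-content` and `span.metascore` come from my code above.

```r
library(rvest)

# Hypothetical sample markup standing in for an IMDb list page (not real data)
sample_page <- read_html('
  <div class="lister-item-content">
    <h3 class="lister-item-header"><a>Film A</a></h3>
    <span class="metascore">85</span>
  </div>
  <div class="lister-item-content">
    <h3 class="lister-item-header"><a>Film B</a></h3>
  </div>')

# Hypothetical helper: one row per film block, NA where a field is absent
scrape_films <- function(page) {
  # One node per film; every field is then looked up inside its own block
  items <- html_nodes(page, ".lister-item-content")
  data.frame(
    # Assumed title selector, for illustration only
    title = items %>% html_node(".lister-item-header a") %>% html_text(trim = TRUE),
    # html_node() keeps one slot per film, so a missing score becomes NA
    metascore = items %>% html_node("span.metascore") %>% html_text(trim = TRUE) %>% as.numeric(),
    stringsAsFactors = FALSE
  )
}

films <- scrape_films(sample_page)
print(films)
```

With a real URL, the page would presumably be read once with `read_html(url)` and passed to the same helper, so every column has exactly one entry per film.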
Do you have any suggestions?