I'm using Selenium (via RSelenium) to
- navigate to a URL,
- scroll down the page so that lazy-loaded images load, and
- extract the resulting lazy-loaded content.
The resulting HTML should contain the lazy-loaded content, but it doesn't.
Notes
- When I run the code, I don't get the lazy-loaded content, but if I simply watch the browser window while the code is executing, the resulting HTML does contain the lazy-loaded content(!)
- When I (a human) click anywhere on the webpage in the automated browser window and then re-extract the page HTML, I do see the lazy-loaded content for that part of the page.
- If I (a human) scroll up and down the page and then extract the HTML, I do get all of the lazy-loaded content. (See the diagnostic sketch right after these notes.)
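To help isolate what's happening, here's a small diagnostic I sketched out (untested; it assumes the page's lazy loader listens for window scroll events, which may not be how this particular page works). It fires a JavaScript-driven scroll plus a synthetic scroll event, then checks whether the page source grows:

# Diagnostic: does a JS-driven scroll (no keyboard/mouse input) trigger the lazy loader?
before <- nchar(remDr$getPageSource()[[1]])
remDr$executeScript("window.scrollTo(0, document.body.scrollHeight);", args = list())
remDr$executeScript("window.dispatchEvent(new Event('scroll'));", args = list())
Sys.sleep(2)  # give the lazy loader time to fetch content
after <- nchar(remDr$getPageSource()[[1]])
c(before = before, after = after)  # if 'after' is larger, the JS scroll was "seen"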
Question
When a human scrolls to the bottom of the page, the lazy-loaded data is captured in the extracted HTML, but when the same scrolling is done programmatically, the lazy-loaded data is missing. How come?
For reference, here's the R code I'm running
library(RSelenium)
library(dplyr)
library(rvest)  # for read_html()

# remDr is an already-open remoteDriver session (e.g. from rsDriver())
remDr$navigate(url)
webElem <- remDr$findElement("css", "body")
for (i in 1:50) { webElem$sendKeysToElement(list(key = "down_arrow")); Sys.sleep(0.02) } # Scroll down
webElem$click(buttonId = 0); Sys.sleep(0.02) # Left-click at the current mouse position
# The scroll/click block above is repeated 4 more times to reach the very bottom of the page
# Get the HTML
remDr$getPageSource()[[1]] %>% read_html()
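In case it's relevant: one workaround I'm considering is to drive the scrolling entirely from JavaScript instead of synthetic key presses, looping until the document height stops growing. This is only a sketch (scroll_until_stable is a name I made up, and it assumes document.body.scrollHeight grows as content loads):

library(RSelenium)
library(rvest)  # for read_html()

# Hypothetical helper: scroll to the bottom repeatedly until the page stops growing
scroll_until_stable <- function(remDr, pause = 1, max_tries = 20) {
  last_height <- remDr$executeScript("return document.body.scrollHeight;", args = list())[[1]]
  for (i in seq_len(max_tries)) {
    remDr$executeScript("window.scrollTo(0, document.body.scrollHeight);", args = list())
    Sys.sleep(pause)  # wait for lazy-loaded content to arrive
    new_height <- remDr$executeScript("return document.body.scrollHeight;", args = list())[[1]]
    if (new_height == last_height) break  # nothing new loaded; assume we're done
    last_height <- new_height
  }
}

scroll_until_stable(remDr)
page <- read_html(remDr$getPageSource()[[1]])

That said, even if this works, I'd still like to understand why the key-press approach only loads content when a human is interacting with the window.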