Channel: Active questions tagged r - Stack Overflow

Extracting lazy loaded content using Selenium?


I'm using Selenium (through RSelenium) to

  • navigate to a url, and
  • scroll down the page so that lazyloaded images load, and
  • extract the resulting lazyloaded content

The resulting HTML should contain the lazy loaded content, but it doesn't.

Notes

  • When I run the code, I don't get the lazy-loaded content, but if I simply watch the browser while the code is being executed, the resulting HTML does contain the lazy-loaded content(!)
  • When I (a human) click anywhere on the webpage in the automated browser window and then re-extract the page HTML, I do see the lazy-loaded content for that part of the page.
  • If I (a human) scroll up and down the page and then extract the HTML, I do get all the lazy-loaded content.

Question

When a human scrolls to the bottom of the page, the lazy-loaded data is captured in the extracted HTML, but when the same scrolling is done programmatically, the lazy-loaded data is missing. How come?
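One common explanation is that key presses sent to the `body` element don't always dispatch the real scroll events that a lazy loader's listener (or IntersectionObserver) reacts to, whereas a human's wheel or click does. A minimal sketch of scrolling with JavaScript instead, so genuine scroll events fire; this assumes `remDr` is an open RSelenium session, and the step count and pause are illustrative values to tune:

```r
library(RSelenium)

# Sketch: scroll the window with JavaScript so real scroll events fire,
# pausing between steps so the lazy loader has time to fetch images.
# Assumes `remDr` is an open RSelenium remoteDriver session.
scroll_to_bottom <- function(remDr, steps = 20, pause = 0.5) {
  for (i in seq_len(steps)) {
    # window.scrollBy dispatches genuine scroll events in the page
    remDr$executeScript("window.scrollBy(0, window.innerHeight);")
    Sys.sleep(pause)  # give the lazy loader time to react
  }
}
```

A longer pause than the 0.02 s in the original code also matters: the images still need time to download after the scroll event triggers them.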

For reference, here's the R code I'm running

library(RSelenium)
library(dplyr)
library(rvest)  # for read_html()

# Start a Selenium session (browser choice shown as an example)
driver <- rsDriver(browser = "firefox")
remDr <- driver$client

remDr$navigate(url)  # url defined elsewhere

webElem <- remDr$findElement("css", "body")

# Scroll down by sending repeated down-arrow key presses
for (i in 1:50) {
  webElem$sendKeysToElement(list(key = "down_arrow"))
  Sys.sleep(0.02)
}
remDr$click(buttonId = 0)  # Click on page
Sys.sleep(0.02)

# Repeats above scroll/click 4 more times to get to very bottom of page

# Get html
page_html <- remDr$getPageSource()[[1]] %>% read_html()
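To see whether the lazy-loaded images actually made it into the extracted HTML, it can help to count how many `<img>` elements carry a populated `src` versus a placeholder. A sketch using rvest; the `data-src` attribute name is an assumption, since lazy loaders differ in which attribute holds the deferred URL, so inspect the page to find the one in use:

```r
library(rvest)

# Sketch: count images whose src is populated vs. still deferred.
# "data-src" is an assumed placeholder attribute -- check the page's
# markup to see what your lazy loader actually uses.
count_loaded_images <- function(page_html) {
  imgs <- html_elements(page_html, "img")
  loaded  <- sum(!is.na(html_attr(imgs, "src")))      # real src present
  pending <- sum(!is.na(html_attr(imgs, "data-src"))) # still deferred
  c(loaded = loaded, pending = pending)
}
```

Calling this before and after scrolling gives a quick check of whether the programmatic scroll triggered the loader at all, independent of eyeballing the HTML.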
