Channel: Active questions tagged r - Stack Overflow

Scraping an HTML table and its href links in R when the page has more than one table, plus other particularities


My question is essentially the same as the one asked here: Scraping html table and its href Links in R

But the solution provided there does not work in my case, or there is something I didn't understand. In my case, the webpage has more than one table, and I don't know how to target a specific table with the solution from the other question.

For example, on this webpage, https://en.wikipedia.org/wiki/UEFA_Champions_League, how would I focus on the table "All time top scorers"? How would I get the links in the columns "Player", "Country" and "Club(s)"?
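To make the "focus on one table" part concrete, here is a minimal sketch of picking a table by its text rather than by its position. It assumes the target table carries a caption containing "top scorers"; I have not checked that against the real page, so the inline HTML below is a made-up stand-in, and on the actual article the match might have to be on a header cell or a preceding heading instead.

```r
library(rvest)

# Hypothetical page with two tables; only the second one is wanted.
page <- read_html('
<html><body>
  <table><caption>Winners</caption><tr><td>x</td></tr></table>
  <table><caption>All-time top scorers</caption>
    <tr><th>Player</th><th>Goals</th></tr>
    <tr><td>Player A</td><td>10</td></tr>
  </table>
</body></html>')

# Keep the table whose caption contains the given text (assumed marker).
scorers <- html_node(page, xpath = '//table[caption[contains(., "top scorers")]]')

# Convert just that table to a data frame.
scorers_df <- html_table(scorers)
caption_text <- html_text(html_node(scorers, "caption"))
```

Matching on text tends to survive page edits better than a positional index such as table[5], which shifts whenever a table is added or removed above it.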

I tried something like

library(rvest)

links <- read_html("https://en.wikipedia.org/wiki/UEFA_Champions_League") %>%
  html_nodes(xpath = '//*[@id="mw-content-text"]/div/table[5]') %>%
  html_nodes(xpath = '//td/a') %>%
  html_attr("href")

But it keeps giving me other links.
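One likely cause, sketched on a made-up two-table page rather than the real article: in XPath, a path starting with `//` is evaluated from the document root even when it is applied to an already selected node, whereas `.//` stays inside that node. The HTML and link targets below are hypothetical.

```r
library(rvest)

# Hypothetical two-table page standing in for the real article.
page <- read_html('
<html><body>
  <table><tr><td><a href="/wiki/Other">Other</a></td></tr></table>
  <table><tr><td><a href="/wiki/Wanted">Wanted</a></td></tr></table>
</body></html>')

# Select only the second table.
second_table <- html_nodes(page, xpath = "(//table)[2]")

# "//td/a" restarts at the document root, so links from every table come back.
absolute_links <- second_table %>%
  html_nodes(xpath = "//td/a") %>%
  html_attr("href")

# ".//td/a" is relative to the selected table, so only its own links come back.
relative_links <- second_table %>%
  html_nodes(xpath = ".//td/a") %>%
  html_attr("href")
```

Here `absolute_links` contains both hrefs while `relative_links` contains only "/wiki/Wanted", which matches the symptom of getting links from elsewhere on the page.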

Another difficulty is that some names in the table are in bold and some are not.
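The bold names probably matter because a path like `td/a` only matches links that are direct children of the cell, so a link wrapped in `<b>` is skipped. A minimal sketch of one way around it, using `td[n]//a` to reach links at any depth and `html_node()` to get one result per row. The table below is invented, `cell_link` is a hypothetical helper, and the column positions are an assumption; a real table with row headers or merged cells may need different indices, and cells with several links (as a "Club(s)" column may have) would only yield their first link here.

```r
library(rvest)

# Hypothetical table: one bold player name, one cell with no link.
page <- read_html('
<html><body><table>
  <tr><th>Player</th><th>Country</th></tr>
  <tr><td><b><a href="/wiki/Player_A">Player A</a></b></td>
      <td><a href="/wiki/Country_A">Country A</a></td></tr>
  <tr><td><a href="/wiki/Player_B">Player B</a></td>
      <td>Country B</td></tr>
</table></body></html>')

# Data rows only: rows that contain at least one <td>.
rows <- html_nodes(page, xpath = "//table//tr[td]")

# First link at any depth inside the n-th cell of each row;
# html_node() keeps one entry per row, so a cell without a link gives NA.
cell_link <- function(rows, n) {
  rows %>%
    html_node(xpath = sprintf("./td[%d]//a", n)) %>%
    html_attr("href")
}

player_links  <- cell_link(rows, 1)
country_links <- cell_link(rows, 2)
```

Because each vector has one entry per row, the links can be bound as extra columns next to the output of `html_table()` without the rows getting out of step.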

