How to fix an HTTP 403 error when web scraping in R?

I'm trying to gather statistical data on more than 3,600 Wikipedia pages for work, and I'd like to automate it with web scraping in R.

I have an issue extracting the HTML code directly in R.

download_html("xtools.wmflabs.org/articleinfo/fr.wikipedia.org/1re_Convention_nationale_acadienne")

And this is what the console tells me:

download_html("xtools.wmflabs.org/articleinfo/fr.wikipedia.org/1re_Convention_nationale_acadienne")
Error in curl::curl_download(url, file, quiet = quiet, mode = mode, handle = handle) : HTTP error 403.

What would be a possible reason this isn't working?

When I save the HTML as a file and read that file in R, everything works perfectly and I can build a data frame from the results:

library(rvest)  # html_nodes(), html_text(), read_html()

# Read the saved copy of the page first
setwd("C:\\Users\\judit\\Scraping dans R")
webpage <- read_html("HTML_1e.html")
# read_html("https://xtools.wmflabs.org/articleinfo/fr.wikipedia.org/1re_Convention_nationale_acadienne?uselang=fr")


# Statistics: extraction ----

# Stats: title
titre <- html_nodes(webpage, ".back-to-search+ a")
titre <- html_text(titre, trim=TRUE)

# Stats: page size
taille <- html_nodes(webpage, ".col-lg-5 tr:nth-child(3) td+ td")
taille <- html_text(taille, trim=TRUE)

# Stats: total edits
mod <- html_nodes(webpage, ".col-lg-5 tr:nth-child(4) td+ td")
mod <- html_text(mod, trim=TRUE)

# Stats: number of editors
red <- html_nodes(webpage, ".col-lg-5 tr:nth-child(5) td+ td")
red <- html_text(red, trim=TRUE)

# Stats: assessment
evaluation <- html_nodes(webpage, ".col-lg-5 tr:nth-child(6) td+ td")
evaluation <- html_text(evaluation, trim=TRUE)

# Stats: links to this page
liens_vers <- html_nodes(webpage, ".stat-list--group tr:nth-child(2) a")
liens_vers <- html_text(liens_vers, trim=TRUE)

# Stats: links from this page
liens_depuis <- html_nodes(webpage, ".col-lg-offset-1 .stat-list--group tr:nth-child(4) td+ td")
liens_depuis <- html_text(liens_depuis, trim=TRUE)

# Stats: words
mots <- html_nodes(webpage, ".col-lg-3 tr:nth-child(3) td+ td")
mots <- html_text(mots, trim=TRUE)

# Combine all extracted fields (including the word count) into one row
wikipedia <- data.frame(titre, taille, red, mod, evaluation, liens_vers, liens_depuis, mots)
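
Since the goal is several thousand pages, the extraction could be wrapped in a function so each page yields one row. The following is only a sketch: `extract_one()` and `scrape_stats()` are hypothetical helper names, and the selectors are simply the ones used above. Returning `NA` when a selector matches nothing avoids the "differing number of rows" error that `data.frame()` raises when one of its inputs has length zero.

```r
library(rvest)

# Hypothetical helper: trimmed text of the first node matching `css`,
# or NA when the selector matches nothing on this page.
extract_one <- function(page, css) {
  txt <- html_text(html_nodes(page, css), trim = TRUE)
  if (length(txt) == 0) NA_character_ else txt[1]
}

# Hypothetical wrapper: one-row data frame for one parsed page,
# reusing the selectors from the code above.
scrape_stats <- function(page) {
  data.frame(
    titre        = extract_one(page, ".back-to-search+ a"),
    taille       = extract_one(page, ".col-lg-5 tr:nth-child(3) td+ td"),
    mod          = extract_one(page, ".col-lg-5 tr:nth-child(4) td+ td"),
    red          = extract_one(page, ".col-lg-5 tr:nth-child(5) td+ td"),
    evaluation   = extract_one(page, ".col-lg-5 tr:nth-child(6) td+ td"),
    liens_vers   = extract_one(page, ".stat-list--group tr:nth-child(2) a"),
    liens_depuis = extract_one(page, ".col-lg-offset-1 .stat-list--group tr:nth-child(4) td+ td"),
    mots         = extract_one(page, ".col-lg-3 tr:nth-child(3) td+ td"),
    stringsAsFactors = FALSE
  )
}

# Possible usage with saved copies (the file pattern is a placeholder):
# files <- list.files(pattern = "\\.html$")
# wikipedia <- do.call(rbind, lapply(files, function(f) scrape_stats(read_html(f))))
```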

Any advice is greatly appreciated! PS: Pardon my French in the code. It's my first language.
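
One thing that might be worth trying, offered as an assumption rather than a diagnosis: a 403 means the server understood the request but refused it, and some servers refuse clients that do not identify themselves with a descriptive User-Agent header. A minimal sketch with the httr package follows; `fetch_page()` is a hypothetical helper name and the agent string is a placeholder.

```r
library(httr)
library(xml2)

# Hypothetical helper: request a page with an explicit User-Agent header
# and parse the response body as HTML. Raises an R error on any HTTP
# error status (including 403) instead of returning a document.
fetch_page <- function(url, agent) {
  resp <- GET(url, user_agent(agent))
  stop_for_status(resp)
  read_html(content(resp, as = "text", encoding = "UTF-8"))
}

# Example call (not run here; the contact address is a placeholder):
# webpage <- fetch_page(
#   "https://xtools.wmflabs.org/articleinfo/fr.wikipedia.org/1re_Convention_nationale_acadienne",
#   "stats-scraper/0.1 (contact: you@example.org)"
# )
```

If this does get past the 403, pausing between requests (for example with `Sys.sleep()`) would be a sensible precaution when looping over thousands of pages.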
