I am scraping a database of companies available in this format, where each company has its own page, identified by the number at the end of the URL (the example above is 15310; see URL). I am using rvest.
I want to extract all entries shown under "Organización". Each variable name is in bold, followed by the value in normal text. In the example above there are 16 variables to extract.
There are two problems with these websites:
- the normal text is not inside an element (whereas the variable names are). In effect, the code goes something like this (notice "Value" sits outside the label):
<div class="form-group">
<label for="variable_code">variable_name</label>
Value
</div>
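To make the problem concrete, here is a minimal sketch (using rvest's minimal_html to build a hypothetical page with the same structure; the names are placeholders, not the real site's markup) showing that the text of the div mixes the label and the loose value together:

```r
library(rvest)  # minimal_html() and the html_* helpers

# Hypothetical page reproducing the structure described above
page <- minimal_html('
  <div class="form-group">
    <label for="variable_code">variable_name</label>
    Value
  </div>')

div <- html_node(page, "div.form-group")
html_text(div)  # the label text and the loose "Value" come back as one string
```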
- not all companies have the same number of variables (16 in the example above). Some have more, others have less. Still, variable_code and variable_name are the same throughout the database.
I can think of two options to scrape the data. One is to scrape based on fixed positions. For this, I can use "nth-child"-type CSS selectors to get each variable. However, because the number of variables changes across companies, I need to save both the variable name and the value as R variables. This is shown in the code below (for one website; for more I just need to add a loop, irrelevant here):
library(xml2)
library(rvest)
library(stringr)
url <- "https://tramites.economia.gob.cl/Organizacion/Details/15310"
webpage <- read_html(url)  # read the page once

# Select the element by its position in the division; this returns both the
# variable name and its value. Ideally you want only the value, to allocate
# to a variable in a data frame.
title_html <- html_nodes(webpage, "body > div > div:nth-child(5) > div > div:nth-child(1) > div:nth-child(3)")
title <- html_text(title_html)

parts <- strsplit(title, "\r\n")[[1]]  # split the name and the value apart
variable_name <- trimws(parts[2])
value <- trimws(parts[3])
So, the above works, but it is time consuming: it saves the variable names as R variables, after which I still need to reshape the data.
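The name/value split could be factored into a small helper so the same cleanup is not repeated per variable. A sketch in base R; split_pair is a hypothetical name, and the sample string mimics what html_text returns for one form-group:

```r
# Hypothetical helper: split one scraped block into a (name, value) pair
split_pair <- function(title) {
  parts <- trimws(strsplit(title, "\r\n")[[1]])  # split on CRLF, trim whitespace
  parts <- parts[parts != ""]                    # drop empty fragments
  c(name = parts[1], value = parts[2])
}

split_pair("\r\nRazón Social\r\nACME S.A.\r\n")
# returns c(name = "Razón Social", value = "ACME S.A.")
```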
Another option is to scrape based on labels, i.e., to search for each variable name in the code and get its value. Something like:
title_html <- html_nodes(webpage, "body > div > div:nth-child(5) > div > div:nth-child(1) label[for=RazonSocial]")
The problem with this approach is that the value of each variable is free text (i.e. outside any specific element). Thus, it cannot be obtained through CSS selectors, as explained in many places (e.g. here, here, or here). Evidently, I cannot change the HTML code.
What can I do to improve the scraping process? Am I stuck with the brute-force first method, extracting everything as variables? Or can I somehow gain efficiency?
PS: one way I was thinking of is to somehow get the position where the label is found using the second method and then get the value using the first. But I doubt R has this option (like address or cell in Excel).