Channel: Active questions tagged r - Stack Overflow

Find JSON in a string without hitting the PCRE recursion limit in R (Windows)


I want to build a function that extracts JSON objects from strings in a generic way (for variable string formats) with R on Windows.

Thanks to Stack Overflow, I am currently using:

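library(magrittr)  # provides %>%

# Recursive PCRE pattern: a "{", then non-brace characters or nested matches, then "}"
allJSONS <- gregexpr(
  pattern = "\\{(?:[^{}]|(?R))*?\\}",
  perl = TRUE,
  text = jsonString
) %>%
  regmatches(x = jsonString)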
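
To illustrate what the pattern is meant to do, here is a minimal sketch on a short, invented input (the string `toy` is hypothetical and not part of my data). On small inputs like this the recursion stays shallow, so no warning is expected.

```r
# Toy input (hypothetical): two JSON objects embedded in surrounding text
toy <- 'prefix({"a":{"b":1}}) and {"c":2}'

# Same recursive pattern as above, written without the pipe
hits <- regmatches(
  toy,
  gregexpr(pattern = "\\{(?:[^{}]|(?R))*?\\}", text = toy, perl = TRUE)
)[[1]]

print(hits)
```

Each top-level brace pair should come back as one match, with the nested object kept inside its parent.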

This works very well for some strings. For others, the call fails with a warning.

Error:

For some strings I get the following warning:

Warning message:
In gregexpr(pattern = "\{(?:[^{}]|(?R))*?\}", perl = TRUE, text = jsonString) :
  recursion limit reached in PCRE for element 1
  consider increasing the C stack size for the R process

The question was answered for Linux here: "Error: C stack usage is too close to the limit". In the comments there, I was advised to ask a new question with the Windows tag.

Reproducible example:

I uploaded sample data to GitHub: https://github.com/TyGu1/findJSON/raw/master/jsonString.RData. Direct download via load(url(…)) fails for me for some reason, but downloading the file manually and calling load() on it works.

(Note that this is only sample data. I am looking for a generic solution.)

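library(magrittr)  # provides %>%

# Path to the manually downloaded file; adjust it to wherever you saved it
load("jsonString.RData")
allJSONS <- gregexpr(
  pattern = "\\{(?:[^{}]|(?R))*?\\}",
  perl = TRUE,
  text = jsonString
) %>%
  regmatches(x = jsonString)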

Proof that the string actually contains valid JSON:

library(magrittr)  
library(jsonlite)

rp <- gsub(pattern = "memmCellmemm(", fixed = TRUE, replacement = "", x = jsonString)
rp <- substring(rp, first = 1, last = nchar(rp)-1) 
json <- rp %>% fromJSON

Goal:

Build a function that extracts JSON objects from strings in a generic way (for variable string formats) with R on Windows.

I am aware that I can extract the JSON from the sample data with the code shown above:

rp <- gsub(pattern = "memmCellmemm(", fixed = TRUE, replacement = "", x = jsonString)
rp <- substring(rp, first = 1, last = nchar(rp)-1) 

but I need a more generic function, like the regex at the top, because the string formats may differ considerably across input data.
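
One regex-free direction I could imagine is sketched below. It is only an illustration under assumptions, not a tested solution for my data: the function name `extract_json_candidates` is hypothetical, it only looks for top-level `{...}` objects (not bare arrays), and it assumes double quotes inside an object delimit JSON strings. It walks the characters once and tracks brace depth in a loop, so it does not rely on PCRE recursion.

```r
# Hypothetical helper (not from any package): return every balanced
# top-level {...} span in x by tracking brace depth in a single pass.
extract_json_candidates <- function(x) {
  chars <- strsplit(x, "", fixed = TRUE)[[1]]
  depth <- 0L            # current nesting level of braces
  start <- NA_integer_   # position of the opening brace at depth 0
  in_string <- FALSE     # TRUE while inside a double-quoted JSON string
  escaped <- FALSE       # TRUE right after a backslash inside a string
  out <- character(0)

  for (i in seq_along(chars)) {
    ch <- chars[i]
    if (in_string) {
      # Braces inside string literals must not change the depth
      if (escaped) {
        escaped <- FALSE
      } else if (ch == "\\") {
        escaped <- TRUE
      } else if (ch == "\"") {
        in_string <- FALSE
      }
    } else if (ch == "\"" && depth > 0L) {
      # Only treat quotes as string delimiters once inside an object
      in_string <- TRUE
    } else if (ch == "{") {
      if (depth == 0L) start <- i
      depth <- depth + 1L
    } else if (ch == "}" && depth > 0L) {
      depth <- depth - 1L
      if (depth == 0L) out <- c(out, substring(x, start, i))
    }
  }
  out
}

# Invented input in the style of the sample data
candidates <- extract_json_candidates(
  'memmCellmemm({"a":{"b":"}"},"c":[1,2]}) tail {"d":1}'
)
print(candidates)
```

The returned spans are only candidates. Each one could then be checked with jsonlite::validate() or passed to fromJSON(), since balanced braces alone do not guarantee valid JSON.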

