I'm having trouble extracting certain matches from a character vector named classes, using the stringr
package:
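library(readr)    # read_lines()
library(stringr)  # str_flatten(), str_detect(), str_extract_all(), %>%
classes = read_lines("https://statistics.ucdavis.edu/courses/descriptions-undergrad") %>%
  str_flatten()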
Here is a short snippet of classes:
...collaborative data analysis; complete case study review and team data analysis project.
Effective: 2019 Fall Quarter.</p><h2>STA 190X—Seminar (1-2)</h2><p>Seminar—1-2 hour(s). Prerequisite(s):
STA 013 or STA 013Y or STA 032 or STA 100 or STA 103. In-depth examination of a special topic in a small
group setting. Effective: 2018 Spring Quarter.</p><h2>STA 192—Internship in Statistics (1-12)</h2>
<p>Internship—3-36 hour(s); Term Paper...
I can clearly see that the string "STA 190X" is in my vector, but I can't seem to detect or extract it:
> str_detect(classes, "STA 190X")
[1] FALSE
> str_extract_all(classes, "STA 190X")
[[1]]
character(0)
But if I copy and paste a section of the printed output directly into the function, it works:
> str_detect("</p><h2>STA 190X—Seminar (1-2)</h2>", "STA 190X")
[1] TRUE
> str_extract_all("</p><h2>STA 190X—Seminar (1-2)</h2>", "STA 190X")
[[1]]
[1] "STA 190X"
Does anyone know why this happens?
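One thing that might be worth checking, though this is only a guess and not confirmed for this page, is whether the character between "STA" and "190X" in the downloaded text is an ordinary space at all. The sketch below uses a made-up string (`sample_html`, hypothetical, not taken from the real page) that contains a non-breaking space (U+00A0) to show how such a case could be diagnosed; the same inspection could then be run on classes.

```r
library(stringr)

# Hypothetical sample: looks like "STA 190X" when printed, but the gap is a
# non-breaking space (U+00A0), not an ordinary space (U+0020).
# This string is an assumption for illustration, not the real page content.
sample_html <- "</p><h2>STA\u00a0190X</h2>"

# A literal pattern typed with an ordinary space is compared byte for byte,
# so it does not see the non-breaking space as a match.
plain_match <- str_detect(sample_html, "STA 190X")

# Match any single character in the gap, then look at the code points.
# An ordinary space is 32; a non-breaking space is 160.
hit <- str_extract(sample_html, "STA.190X")
code_points <- utf8ToInt(hit)

print(plain_match)
print(code_points)
```

If the fourth code point printed for the real data is something other than 32, a pattern such as "STA.190X" (any character in the gap) would be one way to match it. Copying from the console and pasting would likely turn the character into an ordinary space, which could explain why the pasted version works.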