I have GPS data in multiple .csv files that I have imported with the following code:
library(readr)
library(tidyverse)
# Data import from target folder
filelist <- list.files("data", pattern = "\\.csv$")  # pattern is a regular expression, not a glob
filenames <- paste(mgsub::mgsub(filelist,
                                c("_", "samples.csv", "[[:digit:]]+"),
                                c("", "", "")), sep = "")
setwd("data")
data <- sapply(filelist,
               read_csv,
               col_types = cols(Uhrzeit = col_time(format = "%H:%M:%OS"),
                                Uhrzeit_1 = col_time(format = "%H:%M:%OS")),
               simplify = FALSE)
names(data) <- filenames
colnames <- c("Aufnahmezeit",
              "Uhrzeit",
              "Herzfrequenz [S/min]",
              "Geschwindigkeit [km/h]",
              "Distanz [m]",
              "Beschleunigung [m/s²]",
              "Schrittfrequenz")
data <- lapply(data, setNames, colnames)
This returns a named list of data frames (currently 5) of roughly 70,000 rows each (one example shown below):
str(data)
List of 5
$ MaxBauer :Classes ‘spec_tbl_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 69012 obs. of 7 variables:
..$ Aufnahmezeit : 'hms' num [1:69012] 00:00:00.0 00:00:00.1 00:00:00.2 00:00:00.3 ...
.. ..- attr(*, "units")= chr "secs"
..$ Uhrzeit : 'hms' num [1:69012] 12:54:13.0 12:54:13.1 12:54:13.2 12:54:13.3 ...
.. ..- attr(*, "units")= chr "secs"
..$ Herzfrequenz [S/min] : num [1:69012] NA NA NA NA NA NA NA NA NA NA ...
..$ Geschwindigkeit [km/h]: num [1:69012] 0 0 0 0 0 0 0 0 0 0 ...
..$ Distanz [m] : num [1:69012] 0 0 0 0 0 0 0 0 0 0 ...
..$ Beschleunigung [m/s²] : num [1:69012] 0 0 0 0 0 0 0 0 0 0 ...
..$ Schrittfrequenz : num [1:69012] NA NA NA NA NA NA NA NA NA NA ...
..- attr(*, "spec")=
.. .. cols(
.. .. Uhrzeit = col_time(format = "%H:%M:%OS"),
.. .. Uhrzeit_1 = col_time(format = "%H:%M:%OS"),
.. .. `HF [S/min]` = col_double(),
.. .. `Geschwindigkeit [km/h]` = col_double(),
.. .. `Distanz [m]` = col_double(),
.. .. `Beschleunigung [m/s²]` = col_double(),
.. .. Schrittfrequenz = col_double()
.. .. )
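I would now like to subset every data frame in the list, keeping only the rows where the renamed column "Uhrzeit" is at or after a given time of day (46000 seconds after midnight, i.e. 12:46:40). I tried the following: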
lapply(data, subset(Uhrzeit >= 46000))
That returned this error:
Error in subset.default(data, Uhrzeit >= 46000) :
object 'Uhrzeit' not found
I gather that I need to create a list for the lapply function to work with, e.g. as.list(data), but couldn't get that to work either.
Any help would be greatly appreciated!
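For reference, here is a minimal sketch of the call shape that lapply() expects, using toy data rather than the real files. As far as I can tell, subset(Uhrzeit >= 46000) in my attempt is evaluated on its own before lapply() ever passes in a data frame, which would explain why "Uhrzeit" is not found. The names PlayerA, PlayerB and cutoff are made up for illustration, and "Uhrzeit" is stored here as plain seconds since midnight, on the assumption that hms values behave like numbers in a comparison.

```r
# Toy stand-in for the imported list (hypothetical names and values):
# "Uhrzeit" holds seconds since midnight, mirroring how hms stores times.
data <- list(
  PlayerA = data.frame(Uhrzeit = c(45990, 46000, 46010), Distanz = c(0, 1, 2)),
  PlayerB = data.frame(Uhrzeit = c(45000, 47000),        Distanz = c(0, 5))
)

# Assumed threshold: 46000 seconds after midnight, i.e. 12:46:40
cutoff <- 46000

# lapply() needs a function as its second argument. The anonymous function
# receives one data frame at a time, so "Uhrzeit" is looked up inside it.
subsets <- lapply(data, function(df) subset(df, Uhrzeit >= cutoff))

# Number of rows kept per data frame
sapply(subsets, nrow)
```

With the toy values above, two rows remain for PlayerA and one for PlayerB. The result is still a named list, so the same pattern should carry over to the real data without the as.list() step.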