I have a data set with a poor naming convention, and I'm struggling to find a way to automate renaming its columns. An example of the data is shown below:
x1 <- rnorm(10)
x2 <- rnorm(10)
y <- rnorm(10)
x11 <- rnorm(10)
x3 <- rnorm(10)
y1 <- rnorm(10)
x21 <- rnorm(10)
x31 <- rnorm(10)
data <- data.frame(x1, x2, y, x11, x3, y1, x21, x31)
head(data,2)
This outputs a data frame that looks like this:
x1 x2 y x11 x3 y1
1 -0.9071106 0.6852567 0.7185932 -0.1943458 1.71832739 0.1568951
2 -0.4592129 -0.3567014 -0.3137624 0.9683101 -0.15601160 0.8513820
x21 x31
1 0.6160399 -1.3877095
2 -1.0286380 -1.6583842
What I'm trying to do is change the name of each x-column to x followed by the first digit that appears beside it. For example, column x11 should just read x1, and column x21 should be just x2. Likewise, y1 should just read y. I could achieve this by manually changing each name, like this:
names(data)[startsWith(names(data), "y")] <- "y"
names(data)[startsWith(names(data), "x1")] <- "x1"
names(data)[startsWith(names(data), "x2")] <- "x2"
names(data)[startsWith(names(data), "x3")] <- "x3"
head(data,2)
Which outputs:
x1 x2 y x1 x3 y
1 -0.9071106 0.6852567 0.7185932 -0.1943458 1.7183274 0.1568951
2 -0.4592129 -0.3567014 -0.3137624 0.9683101 -0.1560116 0.8513820
x2 x3
1 0.6160399 -1.387709
2 -1.0286380 -1.658384
But I'm struggling to write a function that does this over the entire dataset. Also, I realise that this will result in multiple x1, x2 (etc.) columns, but for my purposes I need the data like this.
Any suggestions as to how I'd write this function?
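For reference, the renaming rule described above could be sketched with base R's sub() and two regular expressions. This is only a minimal sketch, not a definitive solution: shorten_names is a hypothetical helper name, and it assumes every column name is either "x" followed by digits or "y" optionally followed by digits. Note that under this rule a name such as x10 would also collapse to x1.

```r
# Hypothetical helper: shorten_names() is an illustrative name, not an existing function.
# Assumes names look like "x<digits>" or "y<digits>"; anything else is left unchanged.
shorten_names <- function(nms) {
  nms <- sub("^(x[0-9])[0-9]*$", "\\1", nms)  # keep "x" plus its first digit: x11 -> x1
  nms <- sub("^y[0-9]*$", "y", nms)           # drop any digits after "y": y1 -> y
  nms
}

# Small example data frame with the same column names as in the question
data <- data.frame(x1 = 1:2, x2 = 1:2, y = 1:2, x11 = 1:2,
                   x3 = 1:2, y1 = 1:2, x21 = 1:2, x31 = 1:2)

# Assigning through names<- keeps the duplicated names as they are
names(data) <- shorten_names(names(data))
names(data)
```

Because the helper works on a plain character vector, it can be applied to any data frame with names(df) <- shorten_names(names(df)), however many columns there are.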