I'm working on a modeling problem using multiple regression in R. I'm having trouble wrapping my head around how to build a linear model to capture a certain kind of data.
My data consists of three variables: Year, Demand (in units), and Species. The data covers a single state within a country.
Here's my dataset:
library(tidyverse)

dat <- tibble(
  Year = c(2000, 2001, 2002, 2003, 2004, 2005),
  Demand = c(14, 63, 55, 78, 34, 19),
  Species = c("Cat", "Dog", "Cat", "Dog", "Cat", "Dog")
)
I'm attempting to predict demand in the state's market for a given species like this:
m1 <- lm(Demand ~ Species, data = dat)
However, I want to add another variable to the dataset, MarketFavorability, which measures how favorable the country's total market is towards the species in question, expressed as a bump in demand for that species. The idea is to predict the state's demand from the country's overall change in demand.
Here's my question: How does one code for that in a linear model? Should I create one variable for the size of the "push" in demand expected from that favorability, and a second, categorical variable naming the species the market currently favors? That might look something like this:
dat$MarketFavorability <- c(2, 5, 8, 4, 3, 9) # Represents the national bump in demand for the species.
dat$MarketFavorabilityDirection <- c("Dog", "Cat", "Cat", "Cat", "Dog", "Cat") # Represents the bump's "direction" (species names must be quoted strings).
And then the linear model would look like this:
m2 <- lm(Demand ~ Species + MarketFavorability + MarketFavorabilityDirection, data = dat)
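For comparison, here is a rough sketch of one alternative encoding. It is only an illustration under stated assumptions, not an established approach: it assumes the national bump should count toward a row only when the bump's direction matches that row's species, and should be 0 otherwise. The column name BumpForSpecies and the model name m3 are hypothetical, and the data is rebuilt with base R so the snippet runs on its own.

```r
# Self-contained copy of the example data (base R, no tidyverse needed).
dat <- data.frame(
  Year = 2000:2005,
  Demand = c(14, 63, 55, 78, 34, 19),
  Species = c("Cat", "Dog", "Cat", "Dog", "Cat", "Dog"),
  MarketFavorability = c(2, 5, 8, 4, 3, 9),
  MarketFavorabilityDirection = c("Dog", "Cat", "Cat", "Cat", "Dog", "Cat")
)

# Hypothetical column (an assumption, not part of the original data):
# keep the national bump only when it points at the row's own species.
dat$BumpForSpecies <- ifelse(
  dat$MarketFavorabilityDirection == dat$Species,
  dat$MarketFavorability,
  0
)

# Species sets the baseline demand; BumpForSpecies adds a slope for the
# national bump when it applies to that species.
m3 <- lm(Demand ~ Species + BumpForSpecies, data = dat)
coef(m3)
```

With only six rows, and only one row where the direction matches the species, the slope on BumpForSpecies is determined by a single observation, so this only shows the mechanics of the encoding.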
Am I doing this right? Any recommendations for the best way to capture fluctuations in the market overall in my linear model? Thanks in advance.