I'm trying to write a function that plots MDS ordinations (using ggplot2) from several data frames stored in a list, but my obstacle is that those data frames have different group definitions.
I want to know how to state a condition, either before rbind or in the ggplot2 parameter definitions (scale_color_manual), so that the function chooses only the groups that match the factor levels of the data frame it is processing at the moment.
To be clearer: I have 11 data frames, each named after its geographic location. Each of them is also categorized by season of the year (observation point), e.g. data frame "Ag" has 4 levels in the factor column: "Ag_O" for fall, "Ag_I" for winter, "Ag_P" for spring and "Ag_V" for summer. In the same way, I have data from other places such as "Co", "Um", and more...
Note: nmds.pq.scores looks generic because it is produced by the first part of the function, which gets the site scores from each data frame.
After computing everything needed to plot a nice MDS (mainly the "sites" and "species" scores), I define the groups to use in the plots as:
grp.ago<-nmds.pq.scores[nmds.pq.scores$grp == "Ag_O", ][chull(nmds.pq.scores[nmds.pq.scores$grp == "Ag_O", c("NMDS1", "NMDS2")]), ]
grp.agi<-nmds.pq.scores[nmds.pq.scores$grp == "Ag_I", ][chull(nmds.pq.scores[nmds.pq.scores$grp == "Ag_I", c("NMDS1", "NMDS2")]), ]
grp.agp<-nmds.pq.scores[nmds.pq.scores$grp == "Ag_P", ][chull(nmds.pq.scores[nmds.pq.scores$grp == "Ag_P", c("NMDS1", "NMDS2")]), ]
grp.agv<-nmds.pq.scores[nmds.pq.scores$grp == "Ag_V", ][chull(nmds.pq.scores[nmds.pq.scores$grp == "Ag_V", c("NMDS1", "NMDS2")]), ]
grp.coo<-nmds.pq.scores[nmds.pq.scores$grp == "Co_O", ][chull(nmds.pq.scores[nmds.pq.scores$grp == "Co_O", c("NMDS1", "NMDS2")]), ]
grp.coi<-nmds.pq.scores[nmds.pq.scores$grp == "Co_I", ][chull(nmds.pq.scores[nmds.pq.scores$grp == "Co_I", c("NMDS1", "NMDS2")]), ]
grp.cop<-nmds.pq.scores[nmds.pq.scores$grp == "Co_P", ][chull(nmds.pq.scores[nmds.pq.scores$grp == "Co_P", c("NMDS1", "NMDS2")]), ]
grp.cov<-nmds.pq.scores[nmds.pq.scores$grp == "Co_V", ][chull(nmds.pq.scores[nmds.pq.scores$grp == "Co_V", c("NMDS1", "NMDS2")]), ]
grp.umo<-nmds.pq.scores[nmds.pq.scores$grp == "Um_O", ][chull(nmds.pq.scores[nmds.pq.scores$grp == "Um_O", c("NMDS1", "NMDS2")]), ]
grp.umi<-nmds.pq.scores[nmds.pq.scores$grp == "Um_I", ][chull(nmds.pq.scores[nmds.pq.scores$grp == "Um_I", c("NMDS1", "NMDS2")]), ]
grp.ump<-nmds.pq.scores[nmds.pq.scores$grp == "Um_P", ][chull(nmds.pq.scores[nmds.pq.scores$grp == "Um_P", c("NMDS1", "NMDS2")]), ]
grp.umv<-nmds.pq.scores[nmds.pq.scores$grp == "Um_V", ][chull(nmds.pq.scores[nmds.pq.scores$grp == "Um_V", c("NMDS1", "NMDS2")]), ]
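The twelve assignments above all repeat one pattern: subset by group, then keep the convex hull vertices. A minimal sketch of the same step that only touches the groups actually present in nmds.pq.scores could look like the following. The toy nmds.pq.scores built here is a hypothetical stand-in with the same three columns (NMDS1, NMDS2, grp), and hull_by_group is an illustrative name, not part of the original function.

```r
# Hypothetical stand-in for nmds.pq.scores: site scores plus a grouping column.
set.seed(1)
nmds.pq.scores <- data.frame(
  NMDS1 = rnorm(40),
  NMDS2 = rnorm(40),
  grp   = rep(c("Ag_O", "Ag_I", "Ag_P", "Ag_V"), each = 10)
)

# Split by the groups that occur in this data frame (drop = TRUE discards
# unused factor levels), then keep only the convex hull vertices of each group.
hull_by_group <- lapply(
  split(nmds.pq.scores, nmds.pq.scores$grp, drop = TRUE),
  function(d) d[chull(d$NMDS1, d$NMDS2), ]
)

# One data frame holding every hull, in the shape geom_polygon() expects.
hull.Group_data <- do.call(rbind, hull_by_group)
```

Because the split is driven by the grp column itself, the same lines would apply unchanged to the "Co" or "Um" data frames, with no need to name each group by hand.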
OK, now comes the hard part: how to rbind the right groups of factors to use in ggplot(), according to the data frame being processed. I mean, how do we tell the function to choose only the group definitions that match the categories in the data frame being processed?
My code continues as follows:
hull.Group_data<- rbind(grp.ago,grp.agi,grp.agp,grp.agv,grp.coo,grp.coi,grp.cop,grp.cov,grp.umo,grp.umi,grp.ump,grp.umv)
hull.Group_data
NMDS_poligplot<- ggplot() +
geom_polygon(data=hull.Group_data,aes(x=NMDS1,y=NMDS2,fill=grp,group=grp),alpha=0.30) +
geom_text_repel(data=nmds.pq.var.scores,aes(x=NMDS1,y=NMDS2,label=vars),alpha=1) +
geom_point(data=nmds.pq.scores,aes(x=NMDS1,y=NMDS2,shape=grp,colour=grp),size=2) +
scale_colour_manual(values=c("Ag_O"="yellow","Ag_I"="blue","Ag_P"="green","Ag_V"="red",
"Co_O"="yellow","Co_I"="blue","Co_P"="green","Co_V"="red",
"Um_O"="yellow","Um_I"="blue","Um_P"="green","Um_V"="red"))+
coord_equal() +
theme_bw()+
theme(axis.text.x = element_blank(),
axis.text.y = element_blank(),
axis.ticks = element_blank(),
axis.title.x = element_text(size=12),
axis.title.y = element_text(size=12),
panel.background = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
plot.background = element_blank())
So, as you can see, I rbind all the groups defined. Could that really be a problem? When I run the code it gives an error that seems to be related to scale_colour_manual, because it considers all the groups stated, and none of the data frames in the list has all of them. Instead, as said before, they are different data frames, each categorized by season: "Ag_O", "Ag_I", "Ag_P" and "Ag_V" for the "Ag" place, "Co_O", "Co_I", "Co_P" and "Co_V" for "Co", and so on.
So, in sum, is there a way to run this MDS function, which also generates pretty MDS plots grouped by factors, on a list of data frames with different group definitions? I imagine something like:
if nmds.pq.scores corresponds to Ag, then choose the Ag_O, Ag_I, Ag_P and Ag_V groups to use in the ggplot scale_color_manual argument
(obviously I need to know how to write that in R)
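One way the condition above might be written, as a sketch only: build the colour vector from the groups found in the current data frame, using the season suffix after the underscore. This assumes every group name follows the "Place_Season" pattern with the seasons O, I, P and V; the grp vector below is a hypothetical stand-in for nmds.pq.scores$grp, and season_cols and cols_present are illustrative names. Whether this removes the error described above is not confirmed, since the error message itself is not shown.

```r
# One colour per season, taken from the palette in the question.
season_cols <- c(O = "yellow", I = "blue", P = "green", V = "red")

# Hypothetical stand-in for nmds.pq.scores$grp while processing the "Co" data.
grp <- c("Co_O", "Co_I", "Co_P", "Co_V", "Co_O")

# Groups actually present in the data frame being processed.
present <- unique(as.character(grp))

# Look up each group's colour by its season suffix (the text after "_"),
# then name the colours by the full group label.
cols_present <- setNames(season_cols[sub(".*_", "", present)], present)

# In the plot, this vector would replace the hard-coded one:
#   scale_colour_manual(values = cols_present)
```

Keying the palette by season rather than by full group name means the same four colours serve all 11 locations without listing every combination.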
I uploaded My Data in case someone wants to explore what I'm doing in more detail; the whole code is there too. Thanks! I hope I have been clear enough about my situation and goals. Cheers.