How to extract the heatmap of from a set of gene interest from main heatmap?

February 20, 2020, 2:41 am

≫ Next: Create likert plot with two groups in R

≪ Previous: pivot_longer with groups of columns

I have made a heatmap based on following script.

#load the library
library(pheatmap)

#set label font
par(cex.lab=1.5) # is for y-axis
par(cex.axis=1.5) # is for x-axis

#Set the working directory
setwd("F:\\snijesh\\OneDrive-Snijesh\\GSE4170\\final\\Rcode")


#hedge
#read the expression matrix
hedge <- read.table("hedge.txt",sep="\t",header = TRUE)

#Group the data in the frame
group_df = data.frame(Groups=rep(c("AP","CP","BP"),c(6,42,28)))

#Assign the col name to the matrix
rownames(group_df) = colnames(hedge)[2:ncol(hedge)]

#Assign the rownames
rownames(hedge) = hedge[,1]

#plot the heatmap with specific split
pheatmap(hedge[,-1],cluster_cols=FALSE,
         annotation_col=group_df, gaps_col = cumsum(c(6,42,28)))

I got following heatmap from the code:

Now I want to extract and replot a subplot from the main heatmap using gene set of my interest (CD34, CD117, CD38, SHH, GLI1, GLI2, GLI3, BCL2, PTCH1, PTCH2, SMO). How can I do this?

↧

Create likert plot with two groups in R

February 20, 2020, 2:41 am

≫ Next: What's the biggest R-gotcha you've run across?

≪ Previous: How to extract the heatmap of from a set of gene interest from main heatmap?

I have issues trying to create a likert plot with two groups. I realized a survey in two communities. I now want to compare these two communities. That's my data frame. So far, I have loaded a sheet with 3 columns. One column refers to the location (community1, community2), one includes the answers regarding econ_comm (1-6) and one includes the answers regarding future_persp (1-6). I created a likert-object and a first figure.

g_likert = likert(g[1:6])
plot(g_likert, ordered = FALSE, group.order = names(g[2:3]))

...and it worked. Following is what I got so far.

I assume it's now important to create a both-object first: both<-g$Location (worked)

Now I'm starting to get trouble. The following code shows me errors:

both_likert_2 = likert(both[, c(1:3), drop=FALSE], grouping = both$location)
plot(both_likert_2, include.histogram = TRUE)

The errors are:

Error in [.data.frame (g, 1:6) : undefined columns selected
Error in [.default (both, , c(1:3), drop = FALSE) : wrong number of dimensions - Objekt 'both_likert_2' not found

I have now also attached a screenshot of my R, just to make sure. I'm struggling for quite some time now and I would be very very grateful for some help. Best, Felix

EDIT: Here's my code to reproduce it:

library(likert)
g<-read.csv2("C:/Users/felix/OneDrive/Documents/R/SurveyData2.csv", sep=";", dec=",", header=TRUE)
both<-g$Location
g <-  within(g, {
  gold_21cent <- factor(gold_21cent, levels=1:6, labels=c("Completely agree", "Agree", "Slightly agree", "Slightly disagree", "Disagree", "Completely disagree"))
  future_persp <- factor(future_persp, levels=1:6, labels=c("Completely agree", "Agree", "Slightly agree", "Slightly disagree", "Disagree", "Completely disagree"))
 jobs_comm <- factor(jobs_comm, levels=1:6, labels=c("Completely agree", "Agree", "Slightly agree", "Slightly disagree", "Disagree", "Completely disagree"))
} )
.........etc............
comm_likert = likert(g[,2:14], grouping=g[,1])
plot(comm_likert)
library(dplyr)
g %>%

rename(It offers important economic perspectives=future_persp, It provides economic prosperity to the community=econ_comm) %>% likert(grouping=Location) %>% plot()

↧

What's the biggest R-gotcha you've run across?

February 20, 2020, 2:43 am

≫ Next: dplyr + ggplot2: Plotting not working via piping

≪ Previous: Create likert plot with two groups in R

Is there a certain R-gotcha that had you really surprised one day? I think we'd all gain from sharing these.

Here's mine: in list indexing, my.list[[1]] is not my.list[1]. Learned this in the early days of R.

↧

dplyr + ggplot2: Plotting not working via piping

February 20, 2020, 2:50 am

≫ Next: Extract WORLDCLIM data using R for a single country

≪ Previous: What's the biggest R-gotcha you've run across?

I want to plot a subset of my dataframe. I am working with dplyr and ggplot2. My code only works with version 1, not version 2 via piping. What's the difference?

Version 1 (plotting is working):

data <- dataset %>% filter(type=="type1")
ggplot(data, aes(x=year, y=variable)) + geom_line()

Version 2 with piping (plotting is not working):

data %>% filter(type=="type1") %>% ggplot(data, aes(x=year, y=variable)) + geom_line()

Error:

Error in ggplot.data.frame(., data, aes(x = year,  : 
Mapping should be created with aes or aes_string

Thanks for your help!

↧

Extract WORLDCLIM data using R for a single country

February 20, 2020, 2:51 am

≫ Next: How to save csv or xlsx into S3 bucket?

≪ Previous: dplyr + ggplot2: Plotting not working via piping

I want to extract world climate data for minimum and maximum temperature for only one country India using R and save it as a data set (to use with my own data-set that contains crop yields at the district level). I have gone through several posts and can see that this can be done easily in R, however the posts that I have tried to follow are a bit different in terms of the commands or sequences and I am getting confused. (https://gis.stackexchange.com/questions/259478/worldclim-data-na-for-my-coordinates, https://gis.stackexchange.com/questions/227585/using-r-to-extract-data-from-worldclim

What I have tried to use is as follows.

library(raster)
library(sp)
r<- getData('CMIP5', var='tmin', res=10, rcp=45, model='HE', year=70)
r <- r[[c(1,12)]]
values <- extract(r,points)
df <- cbind.data.frame(coordinates(points),values)
head(df)

However, I can run only the first two lines and the line values

<- extract(r,points) gives the error Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘extract’ for signature ‘"RasterStack", "function"’

Any suggestions?

↧

How to save csv or xlsx into S3 bucket?

February 20, 2020, 2:52 am

≫ Next: nodes labels after deleting in R

≪ Previous: Extract WORLDCLIM data using R for a single country

Even though actual use case is to store the data within shiny session, I have simplified to be clear on the goal.

Could you please help me understand what changes I need to make in order to save the Csv/xlsx into S3 bucket ?

Relevant but pythonsave a csv file into s3 bucket from pypark dataframe

library(tidyverse)
library(aws.s3)

# Authetication
aws.signature::use_credentials()
s3_bucket_link = # My s3 bucket link

  df = fread("mtcars.csv")

  #put object to s3
  aws.s3::put_object(object = "mtcars.csv",
                     file = "mtcars.csv",
                     bucket = s3_bucket_link)
  # Gives 403 or 404 error 
  # But when I check my S3 bucket file isnt available !!!

↧

nodes labels after deleting in R

February 20, 2020, 2:54 am

≫ Next: R summarise_at dynamically by condition : mean for some columns, sum for others

≪ Previous: How to save csv or xlsx into S3 bucket?

I'm working with random graphs where the label of nodes are numbers from 1 to N. In my work, I'm deleting some nodes from the graph. My problem is that in R, after deleting is just renaming the nodes again from 1 to remaining N, how I can preserve the label of nodes after deleting ?? thanks a lot

↧

R summarise_at dynamically by condition : mean for some columns, sum for others

February 20, 2020, 2:54 am

≫ Next: Is there a function to creating a floor time or ceiling time for a character "time" vector formatted 00:00:00

≪ Previous: nodes labels after deleting in R

I would like that but with the conditions in the summarise_at()

edit:

edit 1

I've added the word dynamically in the title: When I use vars(c()) in the summarise_at() it's for fast and clear examples, but in fact it's for use contains(), starts_with() and matches(,, perl=TRUE), because I have 50 columns, with many sum() and some mean().

And the goal is for generate dynamic SQL with tbl()..%>% group_by() ... %>% summarise_at()...%>% collect().

edit 2

I added example with SQL generated in my second example

end edit



library(tidyverse)
(mtcars 
  %>% group_by(carb)
  %>% summarise_at(vars(c("mpg","cyl","disp")), list (~mean(.),~sum(.)))
  # I don't want this line below, I would like a conditionnal in summarise_at() because I have 50 columns in my real case
  %>% select(carb,cyl_mean,disp_mean,mpg_sum)
)
#> # A tibble: 6 x 4
#>    carb cyl_mean disp_mean mpg_sum
#>   <dbl>    <dbl>     <dbl>   <dbl>
#> 1     1     4.57      134.   177. 
#> 2     2     5.6       208.   224  
#> 3     3     8         276.    48.9
#> 4     4     7.2       309.   158. 
#> 5     6     6         145     19.7
#> 6     8     8         301     15

Created on 2020-02-19 by the reprex package (v0.3.0)

This works, but I want only sum for mpg, and only mean for cyl and disp

library(RSQLite)
library(dbplyr)
library(tidyverse)
library(DBI)

db <- dbConnect(SQLite(),":memory:")

dbCreateTable(db, "mtcars_table", mtcars)

(tbl( db, build_sql( con=db,"select * from mtcars_table" ))
  %>% group_by(carb)
  %>% summarise_at(vars(c("mpg","cyl","disp")), list (~mean(.),~sum(.)))
  %>% select(carb,cyl_mean,disp_mean,mpg_sum)                   
  %>% show_query()
)
#> <SQL>
#> Warning: Missing values are always removed in SQL.[...]  to silence this warning
#> SELECT `carb`, `cyl_mean`, `disp_mean`, `mpg_sum`
#> FROM (SELECT `carb`, AVG(`mpg`) AS `mpg_mean`, AVG(`cyl`) AS `cyl_mean`, AVG(`disp`) AS `disp_mean`, SUM(`mpg`) AS `mpg_sum`, SUM(`cyl`) AS `cyl_sum`, SUM(`disp`) AS `disp_sum`
#> FROM (select * from mtcars_table)
#> GROUP BY `carb`)
#> # Source:   lazy query [?? x 4]
#> # Database: sqlite 3.30.1 [:memory:]
#> # … with 4 variables: carb <dbl>, cyl_mean <lgl>, disp_mean <lgl>,
#> #   mpg_sum <lgl>

I tried all possibilities like that but it doesn't work or it produces error.

error

(mtcars %>% group_by(carb)%>% summarise_at(vars(c("mpg","cyl","disp")),ifelse(vars(contains(names(.),"mpg")),list(sum(.)),list(mean(.)))) )

not good, too many columns

library(tidyverse)
(mtcars %>% group_by(carb)%>% summarise_at(vars(c("mpg","cyl","disp")),ifelse ((names(.)=="mpg"), list(~sum(.)) , list(~mean(.)))))
#> # A tibble: 6 x 34
#>    carb mpg_sum cyl_sum disp_sum mpg_mean..2 cyl_mean..2 disp_mean..2
#>   <dbl>   <dbl>   <dbl>    <dbl>       <dbl>       <dbl>        <dbl>
#> 1     1   177.       32     940.        25.3        4.57         134.
#> 2     2   224        56    2082.        22.4        5.6          208.
#> 3     3    48.9      24     827.        16.3        8            276.
#> 4     4   158.       72    3088.        15.8        7.2          309.
#> 5     6    19.7       6     145         19.7        6            145 
#> 6     8    15         8     301         15          8            301 
#> # … with 27 more variables: mpg_mean..3 <dbl>, cyl_mean..3 <dbl>,
#> #   disp_mean..3 <dbl>, mpg_mean..4 <dbl>, cyl_mean..4 <dbl>,
#> #   disp_mean..4 <dbl>, mpg_mean..5 <dbl>, cyl_mean..5 <dbl>,
#> #   disp_mean..5 <dbl>, mpg_mean..6 <dbl>, cyl_mean..6 <dbl>,
#> #   disp_mean..6 <dbl>, mpg_mean..7 <dbl>, cyl_mean..7 <dbl>,
#> #   disp_mean..7 <dbl>, mpg_mean..8 <dbl>, cyl_mean..8 <dbl>,
#> #   disp_mean..8 <dbl>, mpg_mean..9 <dbl>, cyl_mean..9 <dbl>,
#> #   disp_mean..9 <dbl>, mpg_mean..10 <dbl>, cyl_mean..10 <dbl>,
#> #   disp_mean..10 <dbl>, mpg_mean..11 <dbl>, cyl_mean..11 <dbl>,
#> #   disp_mean..11 <dbl>

Some other tries and remarks

I would like conditional sum(.) or mean(.) depending of the name of the column in the summarise().

It could be good if it accepts not only primitive functions.

At then end it's for tbl()..%>% group_by() ... %>% summarise_at()...%>% collect() to generate conditional SQL with AVG() and SUM().

Mssql SQL function like ~(convert(varchar()) works for mutate_at() and similar ~AVG()works for summarise_at() but I arrive at the same point: conditional summarise_at() doesn't work depending of name of columns.

↧

Is there a function to creating a floor time or ceiling time for a character "time" vector formatted 00:00:00

February 20, 2020, 8:01 am

≫ Next: Count and summation of positive and negative number sequences

≪ Previous: R summarise_at dynamically by condition : mean for some columns, sum for others

I am working with a tibble containing call center data, and essentially what I am trying to determine is how many calls come in during a range of hours in 15 minute increments starting from 7:00 am to 7:00 pm. My data set looks like something below.

Eventually I know I can do a group by and summary, but how do I put the Contact Time into a floor or ceiling number?

Example: Anything between 08:00:00 and 08:10:00 would be 08:00:00, Anything between 08:10:01 and 08:20:00 would be 08:15:00, and Anything between 08:20:01 and 08:30:00 would be 08:30:00, so on and so fourth.

tbl <- tibble(ContactDate = "2019-11-01", ContactTime = "08:11:41", Month = "Nov", Day = "Friday", ConnectTimeSecs = "216", QueueTimeSecs = "31", TalkTimeSecs = "110")

↧

Count and summation of positive and negative number sequences

February 20, 2020, 8:02 am

≫ Next: Calculate multiple columns based on threshold

≪ Previous: Is there a function to creating a floor time or ceiling time for a character "time" vector formatted 00:00:00

I want to write a code to count and sum any positive and negative series of numbers.
Numbers are either positive or negative(no zero).
I have written codes with for loops. Is there any creative alternative?

Data

R

set.seed(100)
x <- round(rnorm(20, sd = 0.02), 3)

python

x = [-0.01, 0.003, -0.002, 0.018, 0.002, 0.006, -0.012, 0.014, -0.017, -0.007,

     0.002, 0.002, -0.004, 0.015, 0.002, -0.001, -0.008, 0.01, -0.018, 0.046]

loops

R

sign_indicator <- ifelse(x > 0, 1,-1)
number_of_sequence <- rep(NA, 20)
n <- 1
for (i in 2:20) {
  if (sign_indicator[i] == sign_indicator[i - 1]) {
    n <- n + 1
  } else{
    n <- 1
  }
  number_of_sequence[i] <- n

}
number_of_sequence[1] <- 1

#############################

summation <- rep(NA, 20)

for (i in 1:20) {
  summation[i] <- sum(x[i:(i + 1 - number_of_sequence[i])])
}

python

sign_indicator = [1 if i > 0 else -1 for i in X]

number_of_sequence = [1]
N = 1
for i in range(1, len(sign_indicator)):
    if sign_indicator[i] == sign_indicator[i - 1]:
        N += 1
    else:
        N = 1
    number_of_sequence.append(N)

#############################
summation = []

for i in range(len(X)):
    if number_of_sequence[i] == 1:          
          summation.append(X[i])

    else:
        summation.append(sum(X[(i + 1 - number_of_sequence[i]):(i + 1)]))

result

        x n_of_sequence    sum
1  -0.010             1 -0.010
2   0.003             1  0.003
3  -0.002             1 -0.002
4   0.018             1  0.018
5   0.002             2  0.020
6   0.006             3  0.026
7  -0.012             1 -0.012
8   0.014             1  0.014
9  -0.017             1 -0.017
10 -0.007             2 -0.024
11  0.002             1  0.002
12  0.002             2  0.004
13 -0.004             1 -0.004
14  0.015             1  0.015
15  0.002             2  0.017
16 -0.001             1 -0.001
17 -0.008             2 -0.009
18  0.010             1  0.010
19 -0.018             1 -0.018
20  0.046             1  0.046

↧

Calculate multiple columns based on threshold

February 20, 2020, 8:02 am

≫ Next: Error in r function from using nrow and ncol

≪ Previous: Count and summation of positive and negative number sequences

data

data=data.frame("student"=c(1,2,3,4,5),
"score1"=c(77,NA,52,99,89),
"score2"=c(95,89,79,89,73),
"score3"=c(92,52,73,64,90),
"score4"=c(84,57,78,81,66),
"score1x"=c(0,NA,0,1,1),
"score2x"=c(1,1,0,1,0),
"score3x"=c(1,0,0,0,1),
"score4x"=c(1,0,0,1,0))

I have data with student id and score1-score4 and hope to create score1x-score4x in a simple fast way. The rule is if say score1 is less than 80 then score1x is 0 otherwise it is 1. I can do this by: data$score1x=ifelse(score1<80,0,1) but am wondering is there a way to do this for all of them at the same time to create score1x-score4x more quickly?

I realize I did not ask the most appropriate question given your very helpful answers!

dataWANT=data.frame("student"=c(1,2,3,4,5),
                    "w1"=c(2,2,0,2,1),
                    "w2"=c(2,0,0,2,1),
                    "w3"=c(2,2,0,2,1),
                    "w4"=c(1,0,0,1,2))


dataHAVE=data.frame("student"=c(1,2,3,4,5),
                    "f1"=c(0,0,0,1,1),
                    "c1"=c(1,1,0,1,0),
                    "f2"=c(1,0,0,0,1),
                    "c2"=c(1,0,0,1,0),
                    "f3"=c(0,0,0,1,1),
                    "c4"=c(1,1,0,1,0),
                    "f4"=c(1,0,0,0,1),
                    "c4"=c(NA,0,0,1,0))

I have 'dataHAVE' and seek to generate 'dataWANT' The rules are:

if f1 and c1 = 0, w1 = 0
if f1 = 1 and c1 = 0, w1 = 1
if f1 = 0 and c1 = 1, w1 = 2
if f1 = 1 and c1 = 1, w1 = 2

↧

Error in r function from using nrow and ncol

February 20, 2020, 8:03 am

≫ Next: Explanation of p-value in ols_step_forward_p function of olsrr package

≪ Previous: Calculate multiple columns based on threshold

Error in diag(nrow(V) * tausq, nrow = ncol(V), ncol = ncol(V)) : 'nrow' or 'ncol' cannot be specified when 'x' is a matrix

This is the error I get when I try to run

D <- diag(nrow(V)*tausq, nrow=ncol(V), ncol=ncol(V))

which is part of a function I wrote.

It's the first line of the function and V is a matrix which is part of the argument.

What does this error mean?

↧

Explanation of p-value in ols_step_forward_p function of olsrr package

February 20, 2020, 8:05 am

≫ Next: Plotting different x-axis using facet grids

≪ Previous: Error in r function from using nrow and ncol

Consider the following example:-(taken from the e.g. of R documentation)

model <- lm(y ~ ., data = surgical)
ols_step_forward_p(model)

The documentation says that it will be employing the Forward Stepwise method to select the final model. If I am not wrong, doesn't this forward selection is based on observing the p values of the following test statistic:-

where we are considering a model with K predictors all total. RSS denotes the residual sum of squares and the subscript p and K denote the RSS of a model with p and K predictors respectively.

If I do not give the command penter in ols_step_forward_p function what default p-value is chosen?

↧

Plotting different x-axis using facet grids

February 20, 2020, 8:07 am

≫ Next: How to call `eval` with `with` function?

≪ Previous: Explanation of p-value in ols_step_forward_p function of olsrr package

I am plotting spider-diagrams for geochemical analysis.

I managed to plot different series of elements separately (major elements and rare-earth elements) but I would like to plot them together using facet_grid. The problem I get is that the x-axis is common. I would like to have two separate x-axis like I show in the imgur post : https://imgur.com/a/7YnPio1

I've written commented codes on what I achieved :

library(readxl)
library(tidyr)
library(dplyr)
library(ggplot2)


data <- read_excel("Documents/TFB/xlsx_geochimie/solfa_total_tout_ppm.xlsx", 
                   col_types = c("text", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", "numeric", 
                                 "numeric", "numeric", "numeric", "numeric"))

### Vectors containing different geochemical series

vec_maj = c("SiO2","TiO2","Al2O3","FeO","MgO","CaO","Na2O","K2O")
vec_TR = c("La","Ce","Pr","Nd","Sm","Eu","Gd","Tb","Dy","Ho","Er","Tm","Yb","Lu")
vec_tout <- as.character(c(vec_maj,vec_TR))
data.mod <- data[vec_tout]
data.mod$Ech <- data$Ech
### Wide format to long format
data.lf = data %>% select(c(vec_tout,"Ech")) %>%
  pivot_longer(-Ech,names_to="Element",values_to="Pourcentage") %>%
  mutate(Element=factor(Element,levels=unique(vec_tout)))
### Plotting the series separately

data.maj <- subset(data.lf,data.lf$Element %in% vec_maj)
View(data.maj)
data.TR <- subset(data.lf,data.lf$Element %in% vec_TR)

ggplot(data=data.maj,mapping=aes(x=Element,y=Pourcentage,colour=Ech))+
  geom_point()+geom_line(aes(group=Ech))+scale_y_log10()

ggplot(data=data.TR,mapping=aes(x=Element,y=Pourcentage,colour=Ech))+
  geom_point()+geom_line(aes(group=Ech))+scale_y_log10()

# Plotting the series together, x-axis scales does not split :-( 
data.lf$Type <- ifelse(data.lf$Element %in% vec_maj,"Major","REE")
ggplot(data=data.lf,mapping=aes(x=Element,y=Pourcentage,colour=Ech))+
  geom_point()+geom_line(aes(group=Ech))+scale_y_log10()+facet_grid(Type~.,scales="free")

You may download my dataset here : google drive

↧

How to call `eval` with `with` function?

February 20, 2020, 8:07 am

≫ Next: separate output into multiple objects in R automatically

≪ Previous: Plotting different x-axis using facet grids

Having an lm object I need to create a function based on its variables represented as character vector. I have tried to use a combination of eval and expr to create an f function that would be further used in obj and nlm optimisation of the latter.

library(tidyverse)
df <- drop_na(airquality)
model <- lm(Ozone~. - Temp, data = df, x=TRUE, y=TRUE)
base_vars <- all.vars(formula(model)[-2])
k <- length(base_vars)

f <- function(base_df, x, y, parms) {
  with(base_df, parms[1] + 
         eval(expr(paste(paste(paste0('parms[', 2:(k+1), ']'), base_vars, sep = '*'), collapse = '+'))) + 
         log(parms[k+2] * (x - parms[k+3] ^ 2)))
}
obj <- function(parms, y, x) mean((residuals(model) - f(df, x, y, parms))^2) 
fit <- with(data, nlm(obj, c(0, 0, 0, 0, 0, 0, 0), y = e, x = x))

But calling f(model$x, df$Temp, model$y, c(0, 0, 0, 0, 0, 0, 0)) results in the following error:

Error in eval(substitute(expr), data, enclos = parent.frame()) : 
  numeric 'envir' arg not of length one 
4.
eval(substitute(expr), data, enclos = parent.frame()) 
3.
with.default(base_df, parms[1] + eval(expr(paste(paste(paste0("parms[", 
    2:(k + 1), "]"), base_vars, sep = "*"), collapse = "+"))) + 
    log(parms[k + 2] * (x - parms[k + 3]^2))) 
2.
with(base_df, parms[1] + eval(expr(paste(paste(paste0("parms[", 
    2:(k + 1), "]"), base_vars, sep = "*"), collapse = "+"))) + 
    log(parms[k + 2] * (x - parms[k + 3]^2))) 
1.
f(model$x, df$Temp, model$y, c(0, 0, 0, 0, 0, 0, 0))

I believe there might be a conflict between eval environment and environment implied by with function, but can't figure out why. Any ideas how can I create custom function f for variable models?

↧

separate output into multiple objects in R automatically

February 20, 2020, 8:08 am

≫ Next: How can I get my time strings in a right format?

≪ Previous: How to call `eval` with `with` function?

I'm not sure if this is possible, but is there a way in R to run a command and have it save the output into multiple objects based on group? For instance, I wrote a code that calculates what number of employees are in a supervisory role based on their department.

library(tidyverse)
sample <- tibble(department = c("Admin", "Admin", "Office of President", "Office of President"),
                 sup_status = c("Not Supervisor", "Supervisor", "Not Supervisor", "Supervisor"),
                 n = c(918, 152, 69, 192))

But, what I really want is a vector of the percentages of supervisors by department. I can get R to produce one long vector of all percentages:

library(tidyverse)

vector_of_all_percents <- sample %>%
  group_by(department) %>%
  mutate(sum_new = sum(n)) %>%
  rowwise() %>%
  mutate(percent = n/sum_new) %>%
  select(percent) %>%
  as_vector()

vector_of_all_percents
 percent1  percent2  percent3  percent4 
0.8579439 0.1420561 0.2643678 0.7356322

My actual data has many departments. Is there a way to adjust my above code to get R produce objects by department automatically, something like this:

vector_for_admin
 percent1  percent2
0.8579439 0.1420561 

vector_for_office
percent1  percent2 
0.2643678 0.7356322

I'm not sure if the slice() or split() commands are what I need, or if this is even possible. Any guidance would be very appreciated!

↧

How can I get my time strings in a right format?

February 20, 2020, 8:09 am

≫ Next: Replace NA in a POSIXct serie by adjacent values

≪ Previous: separate output into multiple objects in R automatically

I am stuck with converting strings to times. I am aware that there are many topics on Stack regarding converting strings-to-times, however I couldn't fix this problem with the solutions.

Situation I have a file with times like this:

> dput(df$Time[1:50])
c("1744.3", "2327.54", "1718.51", "2312.3200000000002", "1414.16", 
"2046.15", "1442.5", "1912.22", "2303.2199999999998", "2146.3200000000002", 
"1459.02", "1930.15", "1856.23", "2319.15", "1451.05", "25.460000000000036", 
"1453.25", "2309.02", "2342.48", "2322.5300000000002", "2101.5", 
"2026.07", "1245.04", "1945.15", "5.4099999999998545", "1039.5", 
"1731.37", "2058.41", "2030.36", "1814.31", "1338.18", "1858.33", 
"1731.36", "2343.38", "1733.27", "2304.59", "1309.47", "1916.11", 
"1958.3", "1929.54", "1756.4", "1744.23", "1731.26", "1844.47", 
"1353.25", "1958.3", "1746.44", "1857.53", "2047.15", "2327.2199999999998", "1915"
)

In this example, the times should be like this:

"1744.3"   = 17:44:30
"2327.54"  = 23:27:54
"1718.51"  = 17:18:51
"2312.3200000000002" = 23:12:32
...
"25.460000000000036" = 00:25:46 # as you can see, the first two 00 are missing.

"1915" = 19:15:00

However, I tried multiple things (and now I am even stuck with str_replace()). Hopefully some one knows how I can transform this.

What have I tried?

format(df$Time, "%H%M.%S") # Yes I know... 

# So therefore I thought, lets replace the strings to get them in a proper format
# like HH:MM:SS. First step was to replace the "." for a ":" 

str_replace("." , ":", df$Time) # this was leading to "." (don't know why)

And that was the point that I was so frustrated that I posted it on Stack. Hope that you guys can help me.

Many thanks in advance!

↧

Replace NA in a POSIXct serie by adjacent values

February 20, 2020, 8:10 am

≫ Next: ggplot graph not displaying in shiny app when reading multiple files - cannot open connection

≪ Previous: How can I get my time strings in a right format?

I've a data frame like this (but with much more rows):

  individ_id           date_time               begin           end
1: NOS_4214433 2017-11-22 09:01:49 2017-11-21 11:54:59 2017-11-22 09:07:27
2: NOS_4214433 2017-11-22 09:06:49 2017-11-21 11:54:59 2017-11-22 09:07:27
3: NOS_4214433 2017-11-22 09:11:49                <NA>                <NA>
4: NOS_4214433 2017-11-22 09:16:49                <NA>                <NA>
5: NOS_4214433 2018-01-24 12:12:18 2018-01-24 12:08:28 2018-01-25 09:33:10

and I want to fill the NA in the begin and end columns with the first NA date_time value for the 'begin' column and the last date_time NA value for the 'end' column like this:

    individ_id           date_time               begin                 end
1: NOS_4214433 2017-11-22 09:01:49 2017-11-21 11:54:59 2017-11-22 09:07:27
2: NOS_4214433 2017-11-22 09:06:49 2017-11-21 11:54:59 2017-11-22 09:07:27
3: NOS_4214433 2017-11-22 09:11:49 2017-11-22 09:11:49 2017-11-22 09:16:49
4: NOS_4214433 2017-11-22 09:16:49 2017-11-22 09:11:49 2017-11-22 09:16:49
5: NOS_4214433 2018-01-24 12:12:18 2018-01-24 12:08:28 2018-01-25 09:33:10

All the date-time data are in the POSIX format and I want to keep it that way. Does anyone have an idea to solve that issue?

↧

ggplot graph not displaying in shiny app when reading multiple files - cannot open connection

February 20, 2020, 8:14 am

≫ Next: ShinyApp works but runApp doesnt R

≪ Previous: Replace NA in a POSIXct serie by adjacent values

I write the code below for which is a simple app for displaying curves from data files produced by a lab equipment. I need to combine all the files in a chosen folder, do a simple data treatment, and then display the superpozed curves.

I get a cannot open the connection error on the plot area when I try to load the files. I also tried to display the multiple_datatable in a datatableoutput but I get the exact same error.

Edit : Thanks for your comment, here is a simpler version with only the required code.

Some data

file 1

c("#FILE:                       ;Grofillex 1000°C 10K 5' air _ AW70ter.ngb-ss1", 
"#FORMAT:                     ;NETZSCH5                                     ", 
"#IDENTITY:                   ;STA2020-018                                  ", 
"#DECIMAL:                    ;POINT                                        ", 
"#SEPARATOR:                  ;SEMICOLON                                    ", 
"#MTYPE:                      ;DSC                                          ", 
"#INSTRUMENT:                 ;NETZSCH STA 449F1                            ", 
"#PROJECT:                    ;Grosfillex                                   ", 
"#DATE/TIME:                  ;29/01/2020 11:04:11                          ", 
"#CORR. FILE:                 ;                                             ", 
"#LABORATORY:                 ;LFM                                          ", 
"#OPERATOR:                   ;AM                                           ", 
"#REMARK:                     ;                                             ", 
"#SAMPLE:                     ;Grosfillex AW70 ter                            ", 
"#SAMPLE MASS /mg:            ;23.99                                        ", 
"#MATERIAL:                   ;                                             ", 
"#REFERENCE:                  ;Référence                                    ", 
"#REFERENCE MASS /mg:         ;0                                            ", 
"#TYPE OF CRUCIBLE:           ;DSC/TG pan Al2O3                             ", 
"#SAMPLE CRUCIBLE MASS /mg:   ;216.98                                       ", 
"#REFERENCE CRUCIBLE MASS /mg:;214                                          ", 
"#CORR. CODE:                 ;000                                          ", 
"#EXO:                        ;+1                                           ", 
"#RANGE:                      ;20....1000/-40.0....10.0K/min                ", 
"#SEGMENT:                    ;S1-4/4                                       ", 
"", "##Temp./°C;Time/min;DSC/(uV/mg);Mass/%;Gas Flow(purge1)/(ml/min);Gas Flow(protective)/(ml/min);Sensit./(uV/mW);Segment", 
"  20.72500;0.0000e+000;-1.7173e-003;100.00000;80.00000;20.00000;1.00000;1", 
"  20.74233;5.0000e-002;-1.6219e-003;100.27209;80.00000;20.00000;1.00000;1", 
"  20.74440;    0.10000;-1.5583e-003; 99.84077;80.00000;20.00000;1.00000;1", 
"  20.74786;    0.15000;-1.5583e-003; 99.67799;80.00000;20.00000;1.00000;1", 
"  20.74822;    0.20000;-1.5265e-003; 99.54200;80.00000;20.00000;1.00000;1", 
"  20.74882;    0.25000;-1.5583e-003; 99.43706;80.00000;20.00000;1.00000;1", 
"  20.74938;    0.30000;-1.4629e-003; 99.34389;80.00000;20.00000;1.00000;1", 
"  20.75073;    0.35000;-1.3357e-003; 99.27095;80.00000;20.00000;1.00000;1", 
"  20.75147;    0.40000;-1.4629e-003; 99.20383;80.00000;20.00000;1.00000;1", 
"  20.75376;    0.45000;-1.5901e-003; 99.15830;80.00000;20.00000;1.00000;1", 
"  20.75500;    0.50000;-1.6537e-003; 99.12828;80.00000;20.00000;1.00000;1", 
"  20.75571;    0.55000;-1.5901e-003; 99.11922;80.00000;20.00000;1.00000;1", 
"  20.75788;    0.60000;-1.6855e-003; 99.10786;80.00000;20.00000;1.00000;1", 
"  20.75924;    0.65000;-1.9717e-003; 99.08993;80.00000;20.00000;1.00000;1", 
"  20.76088;    0.70000;-2.1308e-003; 99.07180;80.00000;20.00000;1.00000;1", 
"  20.76088;    0.75000;-2.2580e-003; 99.05336;80.00000;20.00000;1.00000;1", 
"  20.76212;    0.80000;-2.0990e-003; 99.03449;80.00000;20.00000;1.00000;1", 
"  20.76312;    0.85000;-2.2262e-003; 99.01772;80.00000;20.00000;1.00000;1", 
"  20.76624;    0.90000;-2.2580e-003; 99.00156;80.00000;20.00000;1.00000;1", 
"  20.76641;    0.95000;-2.3534e-003; 98.98437;80.00000;20.00000;1.00000;1", 
"  20.76782;    1.00000;-2.4170e-003; 98.96738;80.00000;20.00000;1.00000;1", 
"  20.77018;    1.05000;-2.3534e-003; 98.94946;80.00000;20.00000;1.00000;1", 
"  20.77147;    1.10000;-2.4170e-003; 98.93206;80.00000;20.00000;1.00000;1"
)

file 2

c("#FILE:                       ;Grofillex 1000°C 10K 5' air _ F1B.ngb-ss1", 
"#FORMAT:                     ;NETZSCH5                                 ", 
"#IDENTITY:                   ;STA2020-005                              ", 
"#DECIMAL:                    ;POINT                                    ", 
"#SEPARATOR:                  ;SEMICOLON                                ", 
"#MTYPE:                      ;DSC                                      ", 
"#INSTRUMENT:                 ;NETZSCH STA 449F1                        ", 
"#PROJECT:                    ;Grosfillex                               ", 
"#DATE/TIME:                  ;14/01/2020 16:33:22                      ", 
"#CORR. FILE:                 ;                                         ", 
"#LABORATORY:                 ;LFM                                      ", 
"#OPERATOR:                   ;AM                                       ", 
"#REMARK:                     ;                                         ", 
"#SAMPLE:                     ;Grosfillex F1B                           ", 
"#SAMPLE MASS /mg:            ;23.30                                    ", 
"#MATERIAL:                   ;                                         ", 
"#REFERENCE:                  ;Référence                                ", 
"#REFERENCE MASS /mg:         ;0                                        ", 
"#TYPE OF CRUCIBLE:           ;DSC/TG pan Al2O3                         ", 
"#SAMPLE CRUCIBLE MASS /mg:   ;216.31                                   ", 
"#REFERENCE CRUCIBLE MASS /mg:;214                                      ", 
"#CORR. CODE:                 ;000                                      ", 
"#EXO:                        ;+1                                       ", 
"#RANGE:                      ;20....1000/-40.0....10.0K/min            ", 
"#SEGMENT:                    ;S1-4/4                                   ", 
"", "##Temp./°C;Time/min;DSC/(uV/mg);Mass/%;Gas Flow(purge1)/(ml/min);Gas Flow(protective)/(ml/min);Sensit./(uV/mW);Segment", 
"  22.27000;0.0000e+000; 1.4735e-003;100.00000;80.00000;20.00000;1.00000;1", 
"  22.25833;5.0000e-002; 1.2770e-003;100.39850;80.00000;20.00000;1.00000;1", 
"  22.26100;    0.10000; 9.4958e-004; 99.79292;80.00000;20.00000;1.00000;1", 
"  22.26100;    0.15000; 8.8409e-004; 99.62661;80.00000;20.00000;1.00000;1", 
"  22.26267;    0.20000; 8.1860e-004; 99.50172;80.00000;20.00000;1.00000;1", 
"  22.26427;    0.25000; 6.8763e-004; 99.38573;80.00000;20.00000;1.00000;1", 
"  22.26492;    0.30000; 5.5665e-004; 99.29442;80.00000;20.00000;1.00000;1", 
"  22.26620;    0.35000; 4.9116e-004; 99.22543;80.00000;20.00000;1.00000;1", 
"  22.26606;    0.40000; 3.9293e-004; 99.14893;80.00000;20.00000;1.00000;1", 
"  22.26624;    0.45000; 2.6195e-004; 99.11513;80.00000;20.00000;1.00000;1", 
"  22.26700;    0.50000; 2.2921e-004; 99.07629;80.00000;20.00000;1.00000;1", 
"  22.26806;    0.55000;-6.5488e-005; 99.02597;80.00000;20.00000;1.00000;1", 
"  22.26859;    0.60000;-6.5488e-005; 99.01266;80.00000;20.00000;1.00000;1", 
"  22.26847;    0.65000;-3.2744e-005; 98.99828;80.00000;20.00000;1.00000;1", 
"  22.26924;    0.70000;-3.6019e-004; 98.98938;80.00000;20.00000;1.00000;1", 
"  22.27012;    0.75000;-2.9470e-004; 98.97006;80.00000;20.00000;1.00000;1", 
"  22.26982;    0.80000;-4.5842e-004; 98.94635;80.00000;20.00000;1.00000;1", 
"  22.27241;    0.85000;-7.2037e-004; 98.91567;80.00000;20.00000;1.00000;1", 
"  22.27235;    0.90000;-6.8763e-004; 98.89496;80.00000;20.00000;1.00000;1", 
"  22.27435;    0.95000;-8.8409e-004; 98.87800;80.00000;20.00000;1.00000;1", 
"  22.27600;    1.00000;-9.8233e-004; 98.86030;80.00000;20.00000;1.00000;1", 
"  22.27671;    1.05000;-9.1684e-004; 98.83444;80.00000;20.00000;1.00000;1", 
"  22.27759;    1.10000;-1.1788e-003; 98.81330;80.00000;20.00000;1.00000;1"
)

and the code

library(shiny)
library(xlsx)
library(readxl)
library(dplyr)
library(ggplot2)
library(reshape2)
library(viridis)
library(zoo)
library(lubridate)
library(shinyFiles)
library(htmltools)

ui <- fluidPage(
  navbarPage("ATG-DSC App",
             tabPanel("Multiple file comparison",
                      pageWithSidebar(
                      headerPanel(""),
                      sidebarPanel(width = 2,shinyDirButton('directory', 'Folder select', 'Please select a folder')),

                 mainPanel(fluidRow( 
                                column(12, 

                                      dataTableOutput("multiple_datatable"),
                                      uiOutput("plot.ui2"))))))))


server = function(input, output, session){

  volumes <- getVolumes()
  shinyDirChoose(input, 'directory', roots=volumes, session=session)
  path1 <- reactive({
    return(print(parseDirPath(volumes, input$directory)))
  })

filenames <- eventReactive(input$directory, {
  file.names <- dir(path1(), pattern =".txt")
  return(file.names)
  })

#data reading
dataset_multiple <- eventReactive(input$directory, {
  out.file<-""
  file.names <- dir(path1(), pattern =".txt")
  for(i in 1:length(file.names)){
    lines <- readLines(file.names[i])
    column_names <- gsub("#",'',unlist(strsplit(lines[27],";")))
    header <- lines[1:26]
    lines <- lines[seq(27,length(lines),5)]
    col <- length(column_names)
    data <- as.data.frame(matrix(unlist(strsplit(lines,";")),ncol=col,byrow=T))
    data <- data[2:nrow(data),]
    colnames(data) <- column_names
    for (i in 1:ncol(data)){
      data[,i] <- as.numeric(gsub(",",".",as.character(data[,i])))
    }
    data$sample <- unlist(strsplit(header,";"))[28]
    out.file <- rbind(out.file, data)
    for (i in 1:(ncol(out.file)-1)){
      out.file[,i] <- as.numeric(gsub(",",".",as.character(out.file[,i])))
    }
  }

  out.file$`Sensit./(uV/mW)` <- NULL 
  out.file$`Time/min` <- NULL
  out.file <- out.file[complete.cases(out.file),]
  out.file <- out.file[out.file$Segment==2,]
  out.file$Segment <- NULL
  out.file$`Gas Flow(protective)/(ml/min)` <- NULL
  out.file$`Gas Flow(purge1)/(ml/min)` <- NULL
  names(out.file) <- c("Temp", "DSC", "Mass", "Sample")
  return(out.files)
  })

output$multiple_plot <- renderPlot({

  multiple_data <- dataset_multiple()
  multiple_data_melt <- melt(multiple_data, id=c("Temp","Sample"))
  col_pal <- color_palette()

  g <- ggplot(multiple_data_melt,aes(multiple_data_melt$Temp,multiple_data_melt$value, group=sample, color=sample))
  + geom_point()
  + facet_grid(variable~.,scales = "free_y")

  print(g)
})


  output$multiple_datatable <- eventReactive(input$directory, {
    multiple_data <- dataset_multiple()
    multiple_data_melt <- melt(multiple_data, id=c("Temp","Sample"))
    return(head(multiple_data_melt))
  })


  output$plot.ui2 <- renderUI({
    plotOutput("multiple_plot")
  })
}

shinyApp(ui = ui, server = server, options = list(launch.browser=TRUE))

Can someone help ?

↧

ShinyApp works but runApp doesnt R

February 20, 2020, 8:19 am

≫ Next: Why does `substitute` work in multiple lines, but not in a single line?

≪ Previous: ggplot graph not displaying in shiny app when reading multiple files - cannot open connection

Certain widgets are not working correctly when I use runApp, but when I use shinyApp, everything works fine. When I use runApp, I am specifying a host and port.

The widgets that are not working are pickerInput from shinyWidgets.

I think it's possibly a website not secure issue, not sure though.

This is what the site looks like with shinyApp:

This is what the site looks like with runApp:

EDIT:

here is example code:

library(shiny)
library(shinyWidgets)

app = shinyApp(
  ui = fluidPage(
    DT::dataTableOutput("responses", width = 300), tags$hr(),
    textInput("name", "Name", ""),
    checkboxInput("used_shiny", "I've built a Shiny app in R before", FALSE),
    sliderInput("r_num_years", "Number of years using R",
                0, 25, 2, ticks = FALSE),
    actionButton("submit", "Submit"),
    pickerInput(
        inputId = "lvl1", 
        label = "Level 1", 
        choices = c('one', 'two', 'three'), 
        options = list(
            `actions-box` = TRUE, 
            size = 10,
            `selected-text-format` = "count > 1"
        ), 
        multiple = TRUE
    )
  ),
  server = function(input, output, session) {


    output$responses <- DT::renderDataTable({
      data.frame(a = input$submit)
    })     
  }
)

if I just type this into RStudio, it works:

app

But, if I use runApp, the pickerInput widget does not display correctly in browser.

runApp(app, host="<some ip address>")

EDIT 2: if I use runApp(app), and open in browser, the app displays correctly. However, if I change the url in browser from 127.0.0.1:port to localhost:port, it stops displaying the widget correctly.

↧