Add row to data frame with dplyr

February 20, 2020, 9:18 am

I have this sample data:

cvar <- c("2015-11-01","2015-11-02","All")
nvar1 <- c(12,10,5)
nvar2 <- c(7,5,6)
data <- cbind.data.frame(cvar,nvar1,nvar2)

I just want to add a new row to the data.frame containing the sums of nvar1 and nvar2 plus a character label, so with base R I would just use

data[nrow(data)+1,] <- c("add",sum(data[,2]),sum(data[,3]))

or something more clever with lapply, but just to show you what I'm looking for.

I would like this simple command within the pipe environment, so data %>% ... gives me the above outcome.

Appreciate any help, thank you.
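One way to get this inside a pipe, as a minimal sketch assuming dplyr is loaded: build the one-row total with summarise() and append it with bind_rows(). The label "add" and the column names come from the question; `result` is just an illustrative name. Unlike the base R line, which pushes a character vector into the new row and so turns the sums into text, this keeps nvar1 and nvar2 numeric.

```r
library(dplyr)

cvar  <- c("2015-11-01", "2015-11-02", "All")
nvar1 <- c(12, 10, 5)
nvar2 <- c(7, 5, 6)
# stringsAsFactors = FALSE keeps cvar as character so a new label can be appended
data <- data.frame(cvar, nvar1, nvar2, stringsAsFactors = FALSE)

# Inside the pipe, `.` refers to `data`; summarise() builds the one-row total
result <- data %>%
  bind_rows(summarise(., cvar = "add", nvar1 = sum(nvar1), nvar2 = sum(nvar2)))
```

The original three rows are untouched and the totals row is appended as the fourth row.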

↧

Data.Table Operation From One DataFrame to Another

February 20, 2020, 9:20 am

dataWANT=data.frame("student"=c(1,2,3,4,5),
                    "w1"=c(2,2,0,2,1),
                    "w2"=c(2,0,0,2,1),
                    "w3"=c(2,2,0,2,1),
                    "w4"=c(1,0,0,1,2))


dataHAVE=data.frame("student"=c(1,2,3,4,5),
                    "f1"=c(0,0,0,1,1),
                    "c1"=c(1,1,0,1,0),
                    "f2"=c(1,0,0,0,1),
                    "c2"=c(1,0,0,1,0),
                    "f3"=c(0,0,0,1,1),
                    "c3"=c(1,1,0,1,0),
                    "f4"=c(1,0,0,0,1),
                    "c4"=c(NA,0,0,1,0))

I have 'dataHAVE' and seek to generate 'dataWANT'. The rules are:

if f1 and c1 = 0, w1 = 0
if f1 = 1 and c1 = 0, w1 = 1
if f1 = 0 and c1 = 1, w1 = 2
if f1 = 1 and c1 = 1, w1 = 2

Basically I am wondering: first, how can I generate these variables, and second, how can I run a data.table operation on dataHAVE while putting the new variables in dataWANT?
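A possible data.table sketch of the rules above, under two assumptions that are not stated in the question: the column that follows f3 is treated as c3, and an NA counts as 0. The helper name `w_rule` is made up for illustration. The four rules reduce to "c = 1 gives 2, otherwise f = 1 gives 1, otherwise 0", and the same rule is assumed to apply to every f/c pair.

```r
library(data.table)

dataHAVE <- data.table(
  student = 1:5,
  f1 = c(0, 0, 0, 1, 1), c1 = c(1, 1, 0, 1, 0),
  f2 = c(1, 0, 0, 0, 1), c2 = c(1, 0, 0, 1, 0),
  f3 = c(0, 0, 0, 1, 1), c3 = c(1, 1, 0, 1, 0),  # assumed name c3
  f4 = c(1, 0, 0, 0, 1), c4 = c(NA, 0, 0, 1, 0)
)

# c == 1 gives 2, otherwise f == 1 gives 1, otherwise 0.
# Assumption: a missing value counts as 0.
w_rule <- function(f, cc) {
  f[is.na(f)] <- 0
  cc[is.na(cc)] <- 0
  ifelse(cc == 1, 2, ifelse(f == 1, 1, 0))
}

# Start the new table from the id column, then add one w column per f/c pair
dataWANT <- dataHAVE[, .(student)]
for (k in 1:4) {
  dataWANT[, (paste0("w", k)) := w_rule(dataHAVE[[paste0("f", k)]],
                                        dataHAVE[[paste0("c", k)]])]
}
```

Under these assumptions w1 to w3 reproduce the posted dataWANT. w4 comes out as 1, 0, 0, 2, 1, which differs from the posted column for students 4 and 5, so either the rule or the expected values for that pair may need a second look.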

↧

R dplyr create multiple columns efficiently with condition

February 20, 2020, 9:21 am

Let's say I have this tibble :

 tb <- tribble(
  ~siren_ent, ~region_etab,
  "a",   "11",
  "b",   "32",
  "c",   "76"
)

and I would like to add 3 new columns like this :

result <- tribble(
  ~siren_ent, ~region_etab, ~reg11, ~reg32, ~reg76,
  "a",   "11", 1,0,0,
  "b",   "32", 0,1,0,
  "c",   "76", 0,0,1
)

It works with these lines, but it does not scale to a lot of columns...

tb %>% 
  mutate(
    reg11=if_else(region_etab=="11",1,0),
    reg32=if_else(region_etab=="32",1,0),
    reg76=if_else(region_etab=="76",1,0)
  )

Any advice to do it with dplyr and maybe a function(x) ? Many thanks !
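One possible approach, sketched under the assumption that a column is wanted for every distinct value of region_etab: build the indicator vectors in a loop over the distinct codes and bind them on in one step. The names `regions` and `indicators` are illustrative only.

```r
library(dplyr)
library(tibble)

tb <- tribble(
  ~siren_ent, ~region_etab,
  "a", "11",
  "b", "32",
  "c", "76"
)

regions <- unique(tb$region_etab)

# One 0/1 indicator vector per region code, named reg<code>
indicators <- lapply(regions, function(r) as.numeric(tb$region_etab == r))
names(indicators) <- paste0("reg", regions)

result <- tb %>% bind_cols(as_tibble(indicators))
```

This avoids typing one if_else() per region; restricting `regions` to a hand-picked vector of codes would limit the output to those columns.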

↧

R: Loop Function Across Columns and Create New (named) Columns for Each Function Output

February 20, 2020, 9:26 am

I am trying to:

1. generate Moran's I estimates and p-values for the variables in columns 11:27 of the @data portion of a SpatialPointsDataFrame, for a given year of data (below is 2016)
2. repeat (1) for all years of data and bind the results together, giving a complete SpatialPointsDataFrame with 2 columns containing these Moran's measures, named "LM_(variable)" and "LM(p)_(variable)", for each of the original variables.

I have defined a function to perform the first Moran's estimate for a single year subset but can't figure out how to add these as appropriately named columns to the dataframe OR loop these. See (incorrect) code below:

OA.2016 <- OA.Merge.1[OA.Merge.1$Year==2016, ]

LM.1 <- function(i){

  LM.i <- localmoran(x = i, listw = nb2listw(neighbours.queen.2016, style = "W", zero.policy=TRUE), zero.policy=TRUE)
  colnames(LM.i) <- c("Estimate","Standard Error","Variance","Z-Score","P-Value")

  LM.i <- LM.i[, c(1)]
  LM.i <- as.data.frame(LM.i)

  OA.2016@data <- cbind(OA.2016@data,LM.i)

}

I then try to apply this (and a similar function for the p-value) in a loop:

for(i in OA.2016@data[11:27]){
  OA.2016[[paste("LM_", i,sep="")]]<-LM.1(OA.2016[[i]])
  OA.2016[[paste("LM(p)_", i,sep="")]]<-LM.2(OA.2016[[i]])
}

Ideally all stages could be combined so that the input is a SpatialPolygonsDataFrame with unit Year//Census-Tract and the output is a SpatialPolygonsDataFrame with the same unit and rows, but many additional named columns containing correct Moran's estimates, calculated ONLY for Census Tracts in the specific year (hence the need to specify the variables on a year subset, then loop over years and re-combine).

ANY help on this would be amazing! Thanks.
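Two things in the code above look like they work against the goal: the for loop iterates over the column values rather than the column names, so there is no name to paste into "LM_", and the assignment to OA.2016@data inside LM.1 only changes a local copy. The sketch below shows the looping and naming pattern on a plain data frame. It deliberately leaves spdep out: `local_stat` is a hypothetical stand-in for the localmoran() call, and the toy data is made up.

```r
# Hypothetical stand-in for the localmoran() call: returns one estimate and
# one p-value per row. Replace the body with the real computation.
local_stat <- function(x) {
  list(estimate = x - mean(x), p_value = rep(NA_real_, length(x)))
}

# Add "LM_<var>" and "LM(p)_<var>" columns for each variable name in `vars`
add_stat_columns <- function(df, vars, fun) {
  for (v in vars) {            # v is a column NAME, not the column's values
    res <- fun(df[[v]])
    df[[paste0("LM_", v)]]    <- res$estimate
    df[[paste0("LM(p)_", v)]] <- res$p_value
  }
  df                           # return the modified copy
}

# Toy data (made up for illustration): two years, two variables
toy <- data.frame(Year = rep(c(2015, 2016), each = 3),
                  income = c(1, 2, 3, 4, 5, 6),
                  density = c(6, 5, 4, 3, 2, 1))

# Compute within each year separately, then stack the years back together
per_year <- lapply(split(toy, toy$Year), function(d) {
  add_stat_columns(d, c("income", "density"), local_stat)
})
combined <- do.call(rbind, per_year)
```

For the spatial object, the same idea would apply to the @data slot of each year subset, with the neighbour list rebuilt per year; that part is not shown here.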

↧

specify order of variables in position dodge

February 20, 2020, 9:27 am

I honestly don't know why this is being so hard.

I'm creating a simple scatter plot. The x axis is a continuous variable, and at every tick in x I need to plot four points with error bars. I'm using position dodge and everything works fine.

Each point has a different color, size and shape as governed by three further variables: color and shape are governed by factors, size by a continuous variable.

By default, the four points reflect the order of the levels in the color variable (red always left, then green, then blue) but I would like them to reflect the order of the size variable (the continuous one), smallest left and largest right. How do I specify that size should be prioritised when ordering points in position dodge? I tried using reverse ordering but then the points are ordered first according to the shape legend.

I could change the mapping between variable and aesthetics (all variables are fundamentally continuous and could be used with size) but I think it'd be useful to know how to specify the order in which multiple variables should be considered when dodging points.

↧

How to retrieve emails by date using RDCOMClient

February 20, 2020, 9:30 am

I am trying to retrieve only the emails received "today" from a particular folder in my Outlook inbox. How would I be able to do this? The code below lets me extract all emails from the folder, but I am only interested in emails that were received today. What would I add to my code?

library(RDCOMClient)

folderName <-  "Folder2"

## create outlook object
OutApp <- COMCreate("Outlook.Application")
outlookNameSpace <-  OutApp$GetNameSpace("MAPI")

folder <- outlookNameSpace$GetDefaultFolder(6)
fld <-  folder$folders(folderName)
cnt <-  fld$Items()$Count()

emails <- fld$items
resp <-  data.frame(sno = 1:cnt,Text = "",stringsAsFactors=FALSE)

for(i in seq(cnt)){
  d <-  as.data.frame(emails(i)$Body(), stringsAsFactors=FALSE)
  resp$Text[i] = d[1]
  resp$Sender[i] = emails(i)[['SenderName']]
  resp$To[i] = emails(i)[['To']]
  resp$sub[i] = emails(i)[['subject']]
}

↧

Saving a dataset through a function, but it is not loaded afterwards

February 20, 2020, 9:30 am

I have created a function that reads some text data from a certain file, does some manipulation (omitted here) and then saves each modified dataframe in this list as .RData. I have checked that the function does its job. However, when loading the output again into RStudio, the load command runs without errors, but there is no new object in my environment.

Any possible fixes?

f <- function(directory_input, directory_output, par1, par2){
library(tidyverse)
library(readxl)
if(dir.exists(directory_output) == F) {
        dir.create(directory_output)
      }
      key <- data.frame(par= as.character(paste0(0, par1, par2)))
      paths <- key %>%  mutate(
        filepath_in = file.path(directory_input, paste0(par, '.txt'), sep = ''),
        filepath_out = file.path(directory_output, paste0(par, '.RData'), sep = '')
      )

    filepath_in <- paths$filepath_in
    filepath_out <- paths$filepath_out

    DF <- filepath_in %>% map( ~ .x %>% read.delim2(., encoding = 'Latin-1', nrows = 1000))
    map2(DF, filepath_out, ~ .x %>% save(file = .y))

}
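A likely cause, stated as an assumption because the post does not show the load step: `.x %>% save(file = .y)` calls save() on the pipe placeholder, so the object is stored under the name `.`. load() then restores an object called `.`, and names starting with a dot are hidden from ls() and, by default, from the RStudio environment pane. The sketch below shows two base R ways around this; `save_as` and `my_data` are illustrative names, and temporary files stand in for the real output paths.

```r
df <- data.frame(x = 1:3)
path_rdata <- tempfile(fileext = ".RData")
path_rds   <- tempfile(fileext = ".rds")

# Option 1: save under an explicit, visible name via save(list = ...)
save_as <- function(obj, name, file) {
  env <- new.env()
  assign(name, obj, envir = env)
  save(list = name, envir = env, file = file)
}
save_as(df, "my_data", path_rdata)

target <- new.env()
loaded_names <- load(path_rdata, envir = target)  # load() returns the restored names

# Option 2: saveRDS()/readRDS() store a single object without a name,
# so the caller chooses the name when reading it back
saveRDS(df, path_rds)
restored <- readRDS(path_rds)
```

Inside the function from the question, option 2 would amount to replacing the save() call in map2() with saveRDS(.x, .y) and reading each file back with readRDS().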

↧

Order list of strings by last number in string in R

February 20, 2020, 9:32 am

I have the following list:

datalist <- c("20191107_1545_28.xlsx","20191108_1520_95.xlsx","20191108_1104_99.xlsx","20200127_1505_28.xlsx", "20200124_1505_41B.xlsx", "20200122_1505_1.xlsx", "20191102_1520_102.xlsx")

which I want to order by the last number, and then by the first number (the date), so that it looks like:

"20200122_1505_1.xlsx" "20191107_1545_28.xlsx" "20200127_1505_28.xlsx" "20200124_1505_41B.xlsx" "20191108_1520_95.xlsx" "20191104_1106_99.xlsx" "20191102_1520_102.xlsx"

I have been playing around with StrReverse, so I could then just order it normally, but unfortunately, it of course also reverses the number. I tried to split the string first:

split=str_split(datalist, "_")

but I don't know how to continue. The number that I want to order by could be 1, 2 or 3 digits and could also contain a B (like in the example). Does anyone know how to fix this? Thanks in advance!
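A base R sketch of one way to do this, assuming every name has the form date_time_number.xlsx and that a trailing letter such as B should be ignored when sorting: pull out the two sort keys with sub() and pass both to order(). The stray double quotes around the third entry of the posted vector are dropped here.

```r
datalist <- c("20191107_1545_28.xlsx", "20191108_1520_95.xlsx",
              "20191108_1104_99.xlsx", "20200127_1505_28.xlsx",
              "20200124_1505_41B.xlsx", "20200122_1505_1.xlsx",
              "20191102_1520_102.xlsx")

# Digits after the last underscore, ignoring any trailing letters
last_num <- as.integer(sub("^.*_([0-9]+)[A-Za-z]*\\.xlsx$", "\\1", datalist))

# Everything before the first underscore is the date
date_num <- as.integer(sub("_.*$", "", datalist))

# Sort by the last number first, then by date to break ties
sorted <- datalist[order(last_num, date_num)]
```

The two files ending in 28 are the only tie in this example, and the date key puts the 2019 file before the 2020 one.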

↧

kableExtra won't compile with full_width and XeLaTeX

February 20, 2020, 9:34 am

Having full_width = T in my kable function results in the error:

    ! You can't use `\relax' after \the.
\tabu@elapsedtime ...optime {\the \pdfelapsedtime 
                                                  }\tabu@message {(tabu)\tab...

But removing - \usepackage{fontspec} and latex_engine: xelatex from the YAML allows it to work.

Reproducible code:

---
title: "For Stackoverflow"
output:
  pdf_document:
    latex_engine: xelatex
    keep_tex: true
header-includes:
- \usepackage{booktabs}
- \usepackage{longtable}
- \usepackage{array}
- \usepackage{multirow}
- \usepackage{wrapfig}
- \usepackage{float}
- \usepackage{colortbl}
- \usepackage{pdflscape}
- \usepackage{tabu}
- \usepackage{threeparttable}
- \usepackage{threeparttablex}
- \usepackage[normalem]{ulem}
- \usepackage{makecell}
- \usepackage{xcolor}
- \usepackage{fontspec}
---


```{r message=FALSE, warning=FALSE}

library(kableExtra)

data <- data.frame('Column 1'=c('1','2','3','4','5'), 'Column 2'=c('a','b','c', 'd', 'e'))

kable(data, 'latex') %>%
  kable_styling(full_width = T)

```

Here is the tex file:

\begin{document}
\maketitle

\begin{Shaded}
\begin{Highlighting}[]
\KeywordTok{library}\NormalTok{(kableExtra)}

\NormalTok{data <-}\StringTok{ }\KeywordTok{data.frame}\NormalTok{(}\StringTok{'Column 1'}\NormalTok{=}\KeywordTok{c}\NormalTok{(}\StringTok{'1'}\NormalTok{,}\StringTok{'2'}\NormalTok{,}\StringTok{'3'}\NormalTok{,}\StringTok{'4'}\NormalTok{,}\StringTok{'5'}\NormalTok{), }\StringTok{'Column 2'}\NormalTok{=}\KeywordTok{c}\NormalTok{(}\StringTok{'a'}\NormalTok{,}\StringTok{'b'}\NormalTok{,}\StringTok{'c'}\NormalTok{, }\StringTok{'d'}\NormalTok{, }\StringTok{'e'}\NormalTok{))}

\KeywordTok{kable}\NormalTok{(data, }\StringTok{'latex'}\NormalTok{) }\OperatorTok{%>%}
\StringTok{  }\KeywordTok{kable_styling}\NormalTok{(}\DataTypeTok{full_width =}\NormalTok{ T)}
\end{Highlighting}
\end{Shaded}

\begin{tabu} to \linewidth {>{\raggedright}X>{\raggedright}X}


\hline
Column.1 & Column.2\\
\hline
1 & a\\
\hline
2 & b\\
\hline
3 & c\\
\hline
4 & d\\
\hline
5 & e\\
\hline
\end{tabu}



\end{document}

It looks like the main difference between the tex files of full_width=T and full_width=F is that when true, tabu is used; while when false, tabular is used.

Here is my session info:

- Session info -------------------------------------------------------------------------------------------------
 setting  value                       
 version  R version 3.6.0 (2019-04-26)
 os       Windows 7 x64 SP 1          
 system   x86_64, mingw32             
 ui       RStudio                     
 language (EN)                                    
 date     2020-02-20                  

- Packages -----------------------------------------------------------------------------------------------------
 ! package              * version    date       lib source                               
   assertthat             0.2.1      2019-03-21 [1] CRAN (R 3.6.0)                       
   backports              1.1.4      2019-04-10 [1] CRAN (R 3.6.0)                       
   boot                   1.3-22     2019-04-02 [1] CRAN (R 3.6.0)                       
   broom                  0.5.2      2019-04-07 [1] CRAN (R 3.6.0)                       
   callr                  3.2.0      2019-03-15 [1] CRAN (R 3.6.0)                       
   cellranger             1.1.0      2016-07-27 [1] CRAN (R 3.6.0)                       
   cli                    1.1.0      2019-03-19 [1] CRAN (R 3.6.0)                       
   codetools              0.2-16     2018-12-24 [1] CRAN (R 3.6.0)                       
   colorspace             1.4-1      2019-03-18 [1] CRAN (R 3.6.0)                       
   crayon                 1.3.4      2017-09-16 [1] CRAN (R 3.6.0)                       
   curl                   3.3        2019-01-10 [1] CRAN (R 3.6.0)                       
   data.table             1.12.2     2019-04-07 [1] CRAN (R 3.6.0)                       
   DBI                    1.0.0      2018-05-02 [1] CRAN (R 3.6.0)                       
   desc                   1.2.0      2018-05-01 [1] CRAN (R 3.6.0)                       
   devtools               2.2.1      2019-09-24 [1] CRAN (R 3.6.0)                       
   digest                 0.6.19     2019-05-20 [1] CRAN (R 3.6.0)                       
   dplyr                * 0.8.1      2019-05-14 [1] CRAN (R 3.6.0)                       
   ellipsis               0.3.0      2019-09-20 [1] CRAN (R 3.6.1)                       
   evaluate               0.14       2019-05-28 [1] CRAN (R 3.6.0)                       
   forcats              * 0.4.0      2019-02-17 [1] CRAN (R 3.6.0)                       
   foreach                1.4.4      2017-12-12 [1] CRAN (R 3.6.0)                       
   formatR                1.6        2019-03-05 [1] CRAN (R 3.6.0)                       
   fs                     1.3.1      2019-05-06 [1] CRAN (R 3.6.0)                       
   futile.logger          1.4.3      2016-07-10 [1] CRAN (R 3.6.0)                       
   futile.matrix          1.2.7      2018-04-20 [1] CRAN (R 3.6.0)                       
   futile.options         1.0.1      2018-04-20 [1] CRAN (R 3.6.0)                                                    
   generics               0.0.2      2018-11-29 [1] CRAN (R 3.6.0)                       
   ggplot2              * 3.1.1      2019-04-07 [1] CRAN (R 3.6.0)                       
   glue                   1.3.1      2019-03-12 [1] CRAN (R 3.6.0)                       
   gtable                 0.3.0      2019-03-25 [1] CRAN (R 3.6.0)                       
   haven                  2.1.0      2019-02-19 [1] CRAN (R 3.6.0)                       
   hms                    0.4.2      2018-03-10 [1] CRAN (R 3.6.0)                       
   htmltools              0.3.6      2017-04-28 [1] CRAN (R 3.6.0)                       
   httr                   1.4.0      2018-12-11 [1] CRAN (R 3.6.0)                       
   iterators              1.0.10     2018-07-13 [1] CRAN (R 3.6.0)                       
   jsonlite               1.6        2018-12-07 [1] CRAN (R 3.6.0)                       
   kableExtra           * 1.1.0.9000 2019-11-05 [1] Github (haozhu233/kableExtra@a9504c0)
   knitr                * 1.25       2019-09-18 [1] CRAN (R 3.6.1)                       
   lambda.r               1.2.3      2018-05-17 [1] CRAN (R 3.6.0)                       
   lambda.tools           1.0.9      2016-05-11 [1] CRAN (R 3.6.0)                       
   lattice                0.20-38    2018-11-04 [1] CRAN (R 3.6.0)                       
   lazyeval               0.2.2      2019-03-15 [1] CRAN (R 3.6.0)                       
   lpSolve                5.6.13     2015-09-19 [1] CRAN (R 3.6.0)                       
   lubridate              1.7.4      2018-04-11 [1] CRAN (R 3.6.0)                       
   magrittr               1.5        2014-11-22 [1] CRAN (R 3.6.0)                       
   memoise                1.1.0      2017-04-21 [1] CRAN (R 3.6.0)                       
   modelr                 0.1.4      2019-02-18 [1] CRAN (R 3.6.0)                       
   munsell                0.5.0      2018-06-12 [1] CRAN (R 3.6.0)                       
   nlme                   3.1-139    2019-04-09 [1] CRAN (R 3.6.0)                       
   numDeriv               2016.8-1   2016-08-27 [1] CRAN (R 3.6.0)                       
   PerformanceAnalytics   1.5.2      2018-03-02 [1] CRAN (R 3.6.0)                       
   pillar                 1.4.0      2019-05-11 [1] CRAN (R 3.6.0)                       
   pkgbuild               1.0.3      2019-03-20 [1] CRAN (R 3.6.0)                       
   pkgconfig              2.0.2      2018-08-16 [1] CRAN (R 3.6.0)                       
   pkgload                1.0.2      2018-10-29 [1] CRAN (R 3.6.0)                       
   plyr                   1.8.4      2016-06-08 [1] CRAN (R 3.6.0)                                                 
   pracma                 2.2.5      2019-04-09 [1] CRAN (R 3.6.0)                       
   prettyunits            1.0.2      2015-07-13 [1] CRAN (R 3.6.0)                       
 D processx               3.3.1      2019-05-08 [1] CRAN (R 3.6.0)                       
   ps                     1.3.0      2018-12-21 [1] CRAN (R 3.6.0)                       
   purrr                * 0.3.2      2019-03-15 [1] CRAN (R 3.6.0)                       
   quadprog               1.5-7      2019-05-06 [1] CRAN (R 3.6.0)                       
   quantmod               0.4-14     2019-03-24 [1] CRAN (R 3.6.0)                       
   R.methodsS3            1.7.1      2016-02-16 [1] CRAN (R 3.6.0)                       
   R.oo                   1.22.0     2018-04-22 [1] CRAN (R 3.6.0)                       
   R6                     2.4.0      2019-02-14 [1] CRAN (R 3.6.0)                       
   Rcpp                   1.0.1      2019-03-17 [1] CRAN (R 3.6.0)                       
   readr                * 1.3.1      2018-12-21 [1] CRAN (R 3.6.0)                       
   readxl                 1.3.1      2019-03-13 [1] CRAN (R 3.6.0)                       
   registry               0.5-1      2019-03-05 [1] CRAN (R 3.6.0)                       
   remotes                2.1.0      2019-06-24 [1] CRAN (R 3.6.1)                       
   reshape2               1.4.3      2017-12-11 [1] CRAN (R 3.6.0)                       
   rlang                  0.3.4      2019-04-07 [1] CRAN (R 3.6.0)                       
   rmarkdown            * 1.12       2019-03-14 [1] CRAN (R 3.6.0)                       
   RMTstat                0.3        2014-11-01 [1] CRAN (R 3.6.0)                       
   RODBC                  1.3-15     2017-04-13 [1] CRAN (R 3.6.0)                       
   ROI                    0.3-2      2019-01-23 [1] CRAN (R 3.6.0)                       
   RPostgreSQL            0.6-2      2017-06-24 [1] CRAN (R 3.6.0)                       
   rprojroot              1.3-2      2018-01-03 [1] CRAN (R 3.6.0)                       
   rstudioapi             0.10       2019-03-19 [1] CRAN (R 3.6.0)                       
   rvest                  0.3.4      2019-05-15 [1] CRAN (R 3.6.0)                       
   scales                 1.0.0      2018-08-09 [1] CRAN (R 3.6.0)                       
   sessioninfo            1.1.1      2018-11-05 [1] CRAN (R 3.6.0)                       
   slam                   0.1-45     2019-02-26 [1] CRAN (R 3.6.0)                       
   stringi                1.4.3      2019-03-12 [1] CRAN (R 3.6.0)                       
   stringr              * 1.4.0      2019-02-10 [1] CRAN (R 3.6.0)                       
   tawny                  2.1.7      2018-04-20 [1] CRAN (R 3.6.0)                       
   tawny.types            1.1.5      2018-04-20 [1] CRAN (R 3.6.0)                       
   testthat             * 2.1.1      2019-04-23 [1] CRAN (R 3.6.0)                       
   tibble               * 2.1.1      2019-03-16 [1] CRAN (R 3.6.0)                       
   tidyr                * 0.8.3      2019-03-01 [1] CRAN (R 3.6.0)                       
   tidyselect             0.2.5      2018-10-11 [1] CRAN (R 3.6.0)                       
   tidyverse            * 1.2.1      2017-11-14 [1] CRAN (R 3.6.0)                       
   tinytex              * 0.17       2019-10-30 [1] CRAN (R 3.6.1)                       
   TTR                    0.23-4     2018-09-20 [1] CRAN (R 3.6.0)                       
   usethis                1.5.0      2019-04-07 [1] CRAN (R 3.6.0)                       
   viridisLite            0.3.0      2018-02-01 [1] CRAN (R 3.6.0)                       
   webshot                0.5.1.9001 2019-09-25 [1] Github (wch/webshot@4bbf4f7)         
   withr                  2.1.2      2018-03-15 [1] CRAN (R 3.6.0)                       
   xfun                   0.11       2019-11-12 [1] CRAN (R 3.6.0)                       
   XML                    3.98-1.19  2019-03-06 [1] CRAN (R 3.6.0)                       
   xml2                   1.2.0      2018-01-24 [1] CRAN (R 3.6.0)                       
   xts                    0.11-2     2018-11-05 [1] CRAN (R 3.6.0)                       
   yaml                   2.2.0      2018-07-25 [1] CRAN (R 3.6.0)                       
   zoo                  * 1.8-5      2019-03-21 [1] CRAN (R 3.6.0)

Anyone have any ideas? Thank you for any help.

↧

Multiply subset of columns by values in second data frame using match in R

February 20, 2020, 9:35 am

I have a data frame which looks like this

data <- read.table(text="
  Country A B
1 FRA     1 2
2 GER     2 1
", header=TRUE)

I have a reference data frame which looks like this

ref <- read.table(text="
  Names Values
1     A      5
2     B     10
", header=TRUE)

I want to multiply each column by the value in the row of ref that has the same name (while retaining non-numeric columns without a match).

the result should be this

> result
  Country  A  B
1 FRA      5 20
2 GER     10 10
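A minimal base R sketch using match(), as the title suggests: find the columns whose names appear in ref, look up their multipliers, and overwrite only those columns. The names `cols` and `multipliers` are illustrative.

```r
data <- read.table(text = "
  Country A B
1 FRA     1 2
2 GER     2 1
", header = TRUE, stringsAsFactors = FALSE)

ref <- read.table(text = "
  Names Values
1     A      5
2     B     10
", header = TRUE, stringsAsFactors = FALSE)

# Columns of data that have an entry in ref; Country has none and is left alone
cols <- intersect(names(data), ref$Names)

# Multiplier for each of those columns, in the same order as cols
multipliers <- ref$Values[match(cols, ref$Names)]

result <- data
result[cols] <- Map(`*`, data[cols], multipliers)
```

Columns without a match in ref, such as Country, are carried over unchanged.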

↧

Use the histogram for comparison

February 20, 2020, 9:35 am

I have a distribution f(x) = ((3/2)/(1+x)^2) and I use the inverse sampling method to simulate values according to the distribution f(x).

The steps for sampling are to find the cumulative distribution function (cdf) for f(x), then to evaluate its inverse at a uniformly random value sampled from [0,1].

I found the cdf as (3x)/(2(x+1)) and my code for sampling is

U <- runif(1000,0,1) # select a random value from the uniform [0,1]
X = ((3/2)*U/(1-(3/2)*U)) #find the inverse cdf for the selected value U

The simulated values will distribute according to the density f(x).

Now, I want to plot a histogram of the simulated data, then using function curve() , to plot over the histogram the density function f(x) which is defined over [0,2]

 f <- function(x) { ((3/2)/(1+x)^2) } #to define the density function f(x)
 hist(X, breaks= 50, freq=T,plot = TRUE)

Then I should use the function to add the curve over the plot

curve(......, add=TRUE)

but I have two problems:

1. I don't know how to define the range of the function in a very simple and basic way
2. the resulting histogram seems strange
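A sketch of how this could be checked, offered as a suggestion rather than a confirmed diagnosis. Solving u = 3x / (2(x + 1)) for x gives x = 2u / (3 - 2u), which maps [0, 1] onto [0, 2]. The expression used in the sampling code has a pole at U = 2/3 and turns negative beyond it, which may be why the histogram looks strange. For the overlay, hist() needs freq = FALSE so that the bars are on the density scale, and curve() takes the range through its from and to arguments.

```r
set.seed(1)  # only so that the example is reproducible

# Density on [0, 2]
f <- function(x) (3 / 2) / (1 + x)^2

# Inverse-cdf sampling: x = 2u / (3 - 2u)
U <- runif(1000)
X <- 2 * U / (3 - 2 * U)

# Histogram on the density scale, then the density curve over the same range
hist(X, breaks = 50, freq = FALSE, xlim = c(0, 2))
curve(f, from = 0, to = 2, add = TRUE)
```

With freq = TRUE the bars show counts, so a density curve drawn on top would sit far below them.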

↧

R Shiny server : How can i have an event that stops from sourcing my Rscript

February 20, 2020, 9:36 am

I have a shiny app that sources a specific script ("Externalscript.R") when I click on an input ("start"), listening on the app reactives ("mylist") and then showing the verbose output through ShinyJS. It works great, but that script often takes very long and blocks all subsequent actions.

What I would like to do is to have another input ("stop") that actually stops that script process altogether, so that I don't have to wait for it to finish before I can use another reactive in the app.

Here's my code:

observeEvent(input$start, {

   req(credentials()$user_auth)


   withCallingHandlers({
     shinyjs::html("texttech", "")
     source("Externalscript.R", local = list2env(mylist()))
   },
   message = function(m) {
     shinyjs::html(id = "texttech", html = paste('<div class="box-header">',m$message,'</div><br />',sep=""), add = TRUE)
   })
 })

Do you have any idea how to achieve that ?

Thanks.

↧

Calculating network statistics between attribute classes with igraph in R

February 20, 2020, 9:37 am

I am using igraph version 1.2.4.2 in R 3.5.2 to analyse network data. The vertices (nodes) have categorical attributes like “Sex” and “Age_class”, while the edges are undirected and weighted. I imported the adjacency matrix and attached the vertex attributes using the “set_vertex_attr” command. I would like to calculate network metrics such as betweenness and strength not only of the global network, but also between and within the attribute classes, i.e. betweenness of the weighted connection between female-female or male-female.

I am able to calculate the within-class network statistics by removing vertices of other attribute class, e.g.

gMM <- delete.vertices(g, V(g)[Sex != 'M'])    # making a network of only males
betweenness(gMM, directed = FALSE)    # calculating male-male only betweenness

However, this method does not work for between-class statistics. I wonder if anyone knows how to calculate between-class statistics in igraph. Thank you.

↧

Aggregate data and plot in a bar plot in R

February 20, 2020, 9:37 am

I have a data set with parameter_variations and a score. This score has four scales: like, anth, comf and ueq.

↧

Error system is computationally singular when using arellano matrix in r

February 20, 2020, 9:41 am

I'm running some regressions using plm package, but when I try to correct for heteroskedasticity and serial autocorrelation using arellano matrix, the system returns a computationally singular error. I've looked for it in other posts, where people say that it's due to multicollinearity. However, even when I exclude the variables that are highly predictable by others in the model, the error is still returned.

library(plm)     # plm(), vcovHC()
library(lmtest)  # coeftest()

mod_1_within <- plm(PIB ~ Cred + pop + prod + op + desoc + Esc_15 + RT + DT + I(DT*Gini), data = dd, effect = 'individual')
summary (mod_1_within)

arellano_matrix_within_1 <- vcovHC(mod_1_within, method = 'arellano')
coeftest(mod_1_within, arellano_matrix_within_1)

The regression runs, but when I run the arellano_matrix part it returns

Error in solve.default(crossprod(demX)) :
system is computationally singular: reciprocal condition number = 1.74385e-23

Does someone have an idea how to fix it? Thanks in advance!

↧

Generating Range Between in Dataframe column

February 20, 2020, 9:42 am

I'm generating a data frame in RStudio with 4 columns: ITA, Probability, Cumulative Probability and Range.

My code is working well:

InterArrivalInput <- list(InterArrival = c(1,2,3,4),
                          Probability = c(0.25,0.40,0.20,0.15))

countDF <- function(input) {
  Cumulative <- cumsum(input$Probability)
  Range <- Cumulative * 100

  # build and return the data frame
  data.frame(InterArrivals = input$InterArrival,
             Probability = input$Probability,
             Cumulative = Cumulative,
             Range = Range)
}

Currently it's calculating the range as, e.g., 25 for a cumulative probability of 0.25.

Cumulative | Range
0.25       | 25
0.65       | 65

How can I generate the Range column as

Cumulative | Range
0.25       | 0 - 25
0.65       | 26 - 65

I'm just starting to learn the R language and don't know if it's possible or not. Thank you.
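It is possible. A minimal sketch, assuming the lower bound of each range should be the previous upper bound plus one, with the first range starting at 0: take the cumulative percentages as upper bounds, shift them by one position to get the lower bounds, and paste the two together. `count_df`, `upper` and `lower` are illustrative names.

```r
InterArrivalInput <- list(InterArrival = c(1, 2, 3, 4),
                          Probability = c(0.25, 0.40, 0.20, 0.15))

count_df <- function(input) {
  cumulative <- cumsum(input$Probability)

  # Upper bound of each range; round() guards against floating-point noise
  upper <- round(cumulative * 100)

  # Lower bound: 0 for the first row, previous upper bound + 1 afterwards
  lower <- c(0, head(upper, -1) + 1)

  data.frame(InterArrivals = input$InterArrival,
             Probability = input$Probability,
             Cumulative = cumulative,
             Range = paste(lower, "-", upper),
             stringsAsFactors = FALSE)
}

out <- count_df(InterArrivalInput)
```

Range is now a text column, so keeping `lower` and `upper` as separate numeric columns may be more convenient if the bounds are needed for lookups later.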

↧

Degrees of freedom from multiple regression output (r)

February 20, 2020, 9:42 am

I want to report the model's F-test using APA format: (F(x,x) = XX.XX, p=XXX) (https://www.slideshare.net/plummer48/reporting-a-multiple-linear-regression-in-apa)

Using the below output from a sample of N=725: F(2, x ) = 1938, p<2.2e-16.

But where do we derive the x degrees of freedom from this R output? Is the 722 the df of error/residual? Unlike the SPSS example linked above, the two values for DF don't sum to the N.
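The output the question refers to is not reproduced here, so the sketch below uses the built-in mtcars data with two predictors to show where the two values come from. In lm output the first df is the number of predictors and the second is the residual df, N minus the number of predictors minus 1 for the intercept. With N = 725 and two predictors that is 725 - 2 - 1 = 722, so the two values sum to N - 1 rather than N.

```r
# Two-predictor regression on a built-in data set (32 rows)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# F statistic with its numerator and denominator degrees of freedom
fstat <- summary(fit)$fstatistic   # named vector: value, numdf, dendf

num_df <- fstat[["numdf"]]         # number of predictors
den_df <- fstat[["dendf"]]         # residual df, same as df.residual(fit)

# APA-style string built from the model object
apa <- sprintf("F(%d, %d) = %.2f", as.integer(num_df), as.integer(den_df),
               fstat[["value"]])
```

The same two values appear at the end of the "F-statistic" line printed by summary(fit).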

↧

How to sum numbers considered as strings?

February 20, 2020, 9:43 am

I have a column in my dataset like this:

col1
1
1, 1, 1, 1
1, 1
1, 1, 1, 1, 1
1

I am trying to sum each row in a new column like this output:

col2
1
4
2
5
1

I have tried doing:

rowSums(as.numeric(as.character(df$col1)))
Error in rowSums(as.numeric(as.character(df$col1))) : 
  'x' must be an array of at least two dimensions
In addition: Warning message:
In is.data.frame(x) : NAs introduced by coercion

I am new to R and likely missing something obvious, but I can't find any similar problems online in R to adapt to my data. Any help or advice on what functions to use would be appreciated.

Data:

structure(list(col1 = c("1", "1, 1, 1, 1", "1, 1", "1, 1, 1, 1, 1", "1")), 
row.names = c(NA, -5L), class = "data.frame")
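rowSums() expects a matrix or data frame, which is why it fails on a single column of text. A base R sketch of one alternative, assuming col1 is a character column (wrap it in as.character() first if it is a factor): split each string on the commas, convert the pieces to numbers and sum them per row.

```r
df <- structure(list(col1 = c("1", "1, 1, 1, 1", "1, 1", "1, 1, 1, 1, 1", "1")),
                row.names = c(NA, -5L), class = "data.frame")

# Split on a comma followed by optional spaces: one character vector per row
parts <- strsplit(df$col1, ",\\s*")

# Convert each row's pieces to numbers and add them up
df$col2 <- sapply(parts, function(p) sum(as.numeric(p)))
```

This also works when the numbers are not all 1, since each piece is converted before summing.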

↧

Jupyter R Notebook Error in View(): 'View()' not yet supported in the Jupyter R kernel

February 20, 2020, 9:43 am

I am trying to view and edit dataframes using JupyteR R kernel Notebooks. When I go to use very basic R dataframe display and editing commands in a JupyteR notebook, I get a "'edit()' not yet supported in the Jupyter R kernel" error.

A top Google search turns up a JupyteR IRkernel source code reference from April 2019 that says:

"we simply have currently no way to view or edit dfs:

https://github.com/IRkernel/IRkernel/issues/280"

add_to_user_searchpath('View', function(...) {
    stop(sQuote('View()'), ' not yet supported in the Jupyter R kernel') }) 
add_to_user_searchpath('edit', function(...) {
    stop(sQuote('edit()'), ' not yet supported in the Jupyter R kernel') })

Perhaps I installed something incorrectly, but I don't think so.

Here's the version information:

Server Information: You are using Jupyter notebook. The version of the notebook server is: 6.0.3 The server is running on this version of Python: Python 3.7.6 (default, Jan 8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)]
Current Kernel Information: R version 3.6.2 (2019-12-12)

Okay, so I found a workaround from January 2017: use the head() function (https://github.com/IRkernel/IRkernel/issues/280). I still need a workaround or direct support for the R edit() function for R dataframes in JupyteR.

I apologize if my original question was an eR-Rant post. I thought the R in JupyteR stood for "R" support, but the R community seems far behind the JU-PYT community in supporting R there. I spent hours trying to use basic R commands in JupyteR to display and edit dataframes, e.g., View() and edit(), data entry forms, and similar tasks that are easy with Python ipywidgets. After that frustration, it feels like the R in "JupyteR" stands for Rough, Ragged, Rigamarole. View() and edit() work okay in RStudio and RShiny. I realize that R volunteer developers have few free hours to spend and many R platforms to choose from (R, RGui, RStudio, RShiny, JupyteR, and emerging R IDEs), so they have to make wise choices about which projects to invest their time in.

RStudio is not an option for my non-technical users and RShiny is not something the IT support group really wants to take on. Jupyter is the target because it meets many more technical and business (support) requirements.

Does anyone know a fix or a workaround to get these basic View() and edit() dataframe functions to work in JupyteR?

What are the reasons why View() and edit() do not work in JupyteR? The JavaScript display functions must exchange the edited dataframe with these two R functions. What makes things non-portable for a function wrapper to run these in Jupyter versus RShiny? Is it because the update event in the JupyteR cell must fire and update only the local cell display (like Ajax), and cannot refresh the entire page? I need to learn more about how Shiny and Jupyter do their reactive callback bits.

I am now trying Jupyter Dash and Plotly open source as an alternative. It appears to be much more Jupyter-friendly, since it wraps existing react.js components. I will report back after some more testing.

↧

How can I make my optimisation with ROI less sensitive to starting values?

February 20, 2020, 9:48 am

≫ Next: string abbreviation creating dublicates

≪ Previous: Jupyter R Notebook Error in View(): 'View()' not yet supported in the Jupyter R kernel

I'm fairly new to optimisation and I'm struggling to understand a couple of things about R's ROI package. There are two issues: (1) I have a solution that seems very sensitive to start values, and I would like to remove this sensitivity. (2) Ideally, I'd like to adjust my objective function to not only solve the primary problem, but do so in the most cost-effective way (I currently simply specify a budget, and if the solution is within budget I will accept it, even if cheaper solutions are possible).

The code below should run fine, but the sensitivity to starting values is clear from the two solutions s1 (nearly perfect) and s2 (rubbish). I would only like to consider the secondary issue if there is an optimal solution within my budget, i.e., I'm not trying to balance costs and benefits if the cost is below the threshold. I appreciate that this adds a further non-linearity to the problem and may make things impossible. Any guidance welcome!

Thanks

library(ROI)
library(nloptr)
library(ROI.plugin.nloptr)

### define constants:
B <- 8000      ## total budget available
c1 <- -0.15     ## growth rate at zero spend
c2 <- 0.001     ## improvement in growth rate per unit spend

dum.dat1 <- data.frame(os = c(10, 100, 1000, 10, 200),
                      ne = c(10, 10, 200, 200, 200),
                      me = c(100, 200, 300, 300, 1000))

Nsites <- NROW(dum.dat1)   ## number of rows


## define (non-linear) function for optimisation:

eg <- function(ns = runif(Nsites*2, 0, 50)) {   
  ns.mat <- matrix(ns, ncol = Nsites)
  dat.vec <- unlist(dum.dat1)
  dat.arr <- array(c(ns, rep(dat.vec, each = NROW(ns.mat))), 
                   dim = c(NROW(ns.mat), NROW(dum.dat1), ncol(dum.dat1)+1))
  basic.fun <- function(pars) round(apply(cbind(pars[,4], pars[,3] * exp(c1 + (pars[,2] + pars[,1]) * c2)), 1, min))
  result <- -1 * colSums(apply(dat.arr, 1, basic.fun))
  return(result)
}

test.fun <- function(ns = rep(2000, 5)) {   
  ns.mat <- matrix(ns, ncol = Nsites)
  dat.vec <- unlist(dum.dat1)
  dat.arr <- array(c(ns, rep(dat.vec, each = NROW(ns.mat))), 
                   dim = c(NROW(ns.mat), NROW(dum.dat1), ncol(dum.dat1)+1))
  basic.fun <- function(pars) round(apply(cbind(pars[,4], pars[,3] * exp(c1 + (pars[,2] + pars[,1]) * c2)), 1, min))
  result <- apply(dat.arr, 1, basic.fun)
  return(result)
}

## objective function:
F_ob <- F_objective(eg, Nsites)

## constraints:
cons <- L_constraint(
  L = rbind(matrix(rep(1, Nsites),  # first constraint, sum of costs at all sites
                   ncol = Nsites)),
  dir = rep("<=", 1),  # just one constraint
  rhs = c(B))

## define upper boundaries (though probably not needed):
bound <- V_bound( ui = 1:5,  ub = rep(B, Nsites), nobj = Nsites)

nlmp <- OP(objective = F_ob,  # Not the right objective yet - need to do more complex function?
           constraint = cons,
           bounds = bound,
           maximum = FALSE)

ROI_applicable_solvers(nlmp)

## define starting parameters
start1 <- rep(B / Nsites, Nsites)
start2 <- rep(10, Nsites)

## solve it:
s1 <- ROI_solve(nlmp, solver = "nloptr.cobyla", start = start1)
s2 <- ROI_solve(nlmp, solver = "nloptr.cobyla", start = start2)

# what is optimal solution:
solution(s1)  ## almost works, but more costly than necessary 
solution(s2)  ## no good at all

## check this makes sense:
test.fun(solution(s1))   
test.fun(solution(s2))   ## not good
test.fun(c(2500, 3100, 10, 550, 1700))  ## This is pretty much the real optimal solution
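
One possible contributor to the start-value sensitivity is the round() call inside the objective: rounding makes the function piecewise constant, so a derivative-free solver that takes small steps may see no change at all and stop early. The sketch below is not a fix within ROI itself. It re-expresses the objective in plain base R, based on a reading of eg() above (the helper names site_value, obj_rounded and obj_smooth are hypothetical), checks for such a plateau near start2, and then computes, for the un-rounded version, the smallest spend at which each site just reaches its cap me.

```r
# Copy of the constants and data from the question
B  <- 8000
c1 <- -0.15
c2 <- 0.001
dum.dat1 <- data.frame(os = c(10, 100, 1000, 10, 200),
                       ne = c(10, 10, 200, 200, 200),
                       me = c(100, 200, 300, 300, 1000))

# Per-site outcome before rounding: growth capped at 'me'
# (hypothetical helper, based on a reading of basic.fun in the question)
site_value <- function(ns, dat = dum.dat1) {
  pmin(dat$me, dat$ne * exp(c1 + (dat$os + ns) * c2))
}

# Rounded objective (as in the question) and a smooth variant (an assumption)
obj_rounded <- function(ns) -sum(round(site_value(ns)))
obj_smooth  <- function(ns) -sum(site_value(ns))

# 1. Plateau check near the poor starting point
start2 <- rep(10, nrow(dum.dat1))
nudged <- start2 + 0.1
obj_rounded(start2) == obj_rounded(nudged)  # TRUE: a small step changes nothing
obj_smooth(start2)  >  obj_smooth(nudged)   # TRUE: the smooth version improves

# 2. Cheapest spend at which each site just reaches its cap:
#    solve ne * exp(c1 + (os + ns) * c2) = me for ns, floored at zero
min_spend <- with(dum.dat1, pmax(0, (log(me / ne) - c1) / c2 - os))
round(min_spend)
sum(min_spend) <= B   # TRUE: every cap is reachable within the budget
```

If this reading of the objective is right, passing a smooth objective to the solver (and rounding only when reporting results) may reduce the dependence on starting values, and min_spend gives a cheap candidate solution to compare against solution(s1). Both suggestions are assumptions to test rather than established behaviour of ROI or COBYLA.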

↧