Channel: Active questions tagged r - Stack Overflow
Viewing all 209367 articles

Is there a way to keep styling (e.g. fonts) when using widgetframe to embed a plotly chart in R?

I've created a chart in R using plotly, and I'm attempting to embed it in my Hugo blog. I was unable to embed the chart until I found the widgetframe package (using the frameWidget() function). However, I have now realized that widgetframe strips the font styling that I used within the chart: the chart looks fine on my computer, but if I look at the site from a machine where the font isn't installed, the font reverts to Times New Roman. Is there a way to keep custom styling while using frameWidget()?

For reference, the font I'm using is Lekton, which can be found on Google Fonts. I also use the font for the Hugo site itself, and it is displayed just fine on every other page of the site.


Using which and ! functions in R

x <- c("a", "b", "c", "d", "e", "f", "g")
y <- c("a", "c", "d", "z")

I am trying to compare y to x and find the index of any element of y that does not match anything in x. In this case z does not match, and I want R to return the index of z.

This is one of the things I tried and it does not work.

index <- which(y != x)
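
One possible approach, as a sketch: `y != x` compares the two vectors position by position (recycling the shorter one), so it does not test membership. `%in%` does test membership, and negating it gives the positions in y that have no match in x.

```r
x <- c("a", "b", "c", "d", "e", "f", "g")
y <- c("a", "c", "d", "z")

# TRUE for every element of y that appears somewhere in x
in_x <- y %in% x

# Positions in y with no match in x
index <- which(!in_x)

print(index)
print(y[index])
```

Here `index` holds the position of "z" within y, and `y[index]` recovers the unmatched value itself.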

ReportingTools in an Analysis of RNA-seq Data: GO analysis using GOstats gives an error

I just followed the example given on page 5 of http://bioconductor.org/packages/release/bioc/vignettes/ReportingTools/inst/doc/rnaseqAnalysis.pdf

When executing the command at the end of the page:
publish(goResults, goReport, selectedIDs=selectedIDs, annotation.db="org.Mm.eg", pvalueCutoff= 0.05)

I get the error:
Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'keys' for signature '"character"'

The earlier examples in the PDF work fine.

I am using RStudio 1.1.463 with R version 3.5.3 (2019-03-11) on macOS 10.14.6.
Any help or guidance would be appreciated.

R Shiny - saving values of function in data table after action button press

I am trying to create a way to track opening and closing of the breathing organ (i.e. a mouth) of several animals at the same time over the course of 45 minutes. The goal is to be able to calculate the total open time and frequency of opening for each animal. Basically, the idea is to have several stopwatches operating in parallel, while tracking two lists of values per animal: open time and close time.

The experiment would ideally go like this: I start the experiment and therefore the stopwatch. Every time animal 1 opens its breathing organ, I press open, and once it closes its breathing organ, I press close. The time of each, relative to the stopwatch started at the beginning of the experiment, are recorded in a dataframe for animal 1. This process repeats 10-15 times throughout 45 minutes. At the same time, another animal is opening and closing its breathing organ, and a separate dataframe for animal 2 is created using a different set of buttons. I would like to have this be possible for up to 10 animals simultaneously.

I have been able to make the stopwatches (example code below) using a watch function, as well as include action buttons that output text corresponding to the difference in system time between start time of the experiment and time of pressing the open or close buttons. However, I am unsure of how to store these values in a dataframe for each animal.

I have looked around Stack Overflow and found nothing that works, including these threads: "r Shiny action button and data table output" and "Add values to a reactive table in shiny".

Let me know if you need any more info! Thanks in advance.

library(lubridate)
library(shiny)
library(DT)

# stopwatch function ----

stop_watch = function() {
  start_time = stop_time = open_time = close_time = NULL
  start = function() start_time <<- Sys.time()
  stop = function() {
    stop_time <<- Sys.time()
    as.numeric(difftime(stop_time, start_time))
  }
  open = function() {
    open_time <<- Sys.time()
    as.numeric(difftime(open_time, start_time))
  }
  close = function() {
    close_time <<- Sys.time()
    as.numeric(difftime(close_time, start_time))
  }
  list(start=start, open=open, close=close, stop=stop)
}
watch = stop_watch()

# ui ----

ui <- fluidPage(
  titlePanel("Lymnaea stopwatch"),

  sidebarLayout(
    sidebarPanel(

      selectInput(
        "select",
        label = "Number of animals",
        choices = c(1,2,3,4,5,6,7,8,9,10),
        selected = c("1")
      )
  # action button conditionals ----      
    ),
    mainPanel(
      h4("Start/Stop Experiment:"),
      actionButton('start1',"Start"),
      actionButton('stop1', "Stop"),
      textOutput('initial1'),
      textOutput('start1'),
      textOutput('stop1'),
      textOutput('stoptime1'),

     conditionalPanel(
       h4("Animal 1"),
      condition = "input.select == '1'||input.select == '2'||input.select == '3'||input.select == '4'||input.select == '5'||input.select == '6'||input.select == '7'||input.select == '8'||input.select == '9'||input.select == '10'",
       actionButton('open1', "Open"),
       actionButton('close1', "Close"),
       textOutput('open1'),
       textOutput('opentime1'),
       textOutput('close1'),
       textOutput('closetime1'),

     )

  )
)
)

# server ----

server <- function(input, output, session) {

values <- reactiveValues()

values$df <- data.frame(colnames(c("Open", "Close")))

newEntry <- observe({
  if(input$open1 > 0) {
    newLine <- isolate(c(({watch$start()})))
    isolate(values$df <- rbind(values$df, newLine))
  }
})

output$table <- renderTable({values$df})

  # n = 1 animal  ----
  observeEvent(input$start1, {
    watch$start()
    output$initial1 <- renderText(
      "Timer started."
      )
  })

  observeEvent(input$open1, {
    watch$open()
    output$open1 <- renderText(
      "Time of opening:"
    )
    output$opentime1 <- renderText({
      watch$open()
    })
 })

  observeEvent(input$close1, {
    watch$close()
    output$close1 <- renderText({
      "Time of closing:"
    })
    output$closetime1 <- renderText({
      watch$close()
    })
  })

}

shinyApp(ui, server)
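
One way to think about the storage part, as a minimal sketch rather than a finished app: keep a single "long" event log with one row per button press, and append to it with a small helper. The helper names (`new_event_log`, `record_event`) and the column names are hypothetical. The sketch also passes `units = "secs"` to difftime(), since without it the unit can change automatically once the elapsed time grows.

```r
# Empty log with typed columns: one row per open/close event.
new_event_log <- function() {
  data.frame(animal = integer(0), event = character(0),
             seconds = numeric(0), stringsAsFactors = FALSE)
}

# Append one event; `start_time` is when the experiment started (POSIXct).
record_event <- function(event_log, animal, event, start_time, now = Sys.time()) {
  elapsed <- as.numeric(difftime(now, start_time, units = "secs"))
  rbind(event_log,
        data.frame(animal = as.integer(animal), event = event,
                   seconds = elapsed, stringsAsFactors = FALSE))
}

# Simulated presses with fixed times so the result is reproducible.
start <- as.POSIXct("2020-02-17 10:00:00", tz = "UTC")
event_log <- new_event_log()
event_log <- record_event(event_log, 1, "open",  start, now = start + 12)
event_log <- record_event(event_log, 2, "open",  start, now = start + 15)
event_log <- record_event(event_log, 1, "close", start, now = start + 20)

# Total open time for animal 1: each close minus the matching open.
animal1 <- event_log[event_log$animal == 1, ]
open_time_1 <- sum(animal1$seconds[animal1$event == "close"] -
                   animal1$seconds[animal1$event == "open"])
print(event_log)
print(open_time_1)

# Inside the Shiny server the same helper could be wired up roughly like this
# (untested outline, not run here):
#   values <- reactiveValues(log = new_event_log(), start = NULL)
#   observeEvent(input$start1, { values$start <- Sys.time() })
#   observeEvent(input$open1,  { values$log <- record_event(values$log, 1, "open",  values$start) })
#   observeEvent(input$close1, { values$log <- record_event(values$log, 1, "close", values$start) })
#   output$table <- renderTable(values$log)
```

A single log with an `animal` column avoids maintaining ten separate data frames; per-animal tables can be obtained later by subsetting.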

Add double quotation mark to first word of each line

I have an R file of results, such as below:

filename totalvar result runtime
file1  100 0 20.45
file2  400 4 4.50  
...
filen  200 1 2.00

Some of the file names contain unusual characters, so I have to put quotation marks around them. What is the easiest way to use Vim to add quotation marks around the first word of each line? Something like

filename totalvar result runtime
"file1"  100 0 20.45
"file2"  400 4 4.50  
...
"filen"  200 1 2.00
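
The question asks about Vim, where a substitution along the lines of `:2,$s/^\S\+/"&"/` should wrap the first word of every line after the header. Since the results come from R, an alternative sketch is to do the same substitution there with sub(). The sample lines are copied from the question; the pattern assumes the first word contains no whitespace.

```r
lines <- c("filename totalvar result runtime",
           "file1  100 0 20.45",
           "file2  400 4 4.50")

quoted <- lines
# Skip the header (first element) and wrap the leading run of
# non-whitespace characters in double quotes.
quoted[-1] <- sub("^(\\S+)", "\"\\1\"", lines[-1], perl = TRUE)

writeLines(quoted)
```

With a real file, `lines` would come from readLines() and the result would be written back with writeLines().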

How to compare a number in one column with 0? [duplicate]

For example, I have a column that contains 1000 numbers, and I want to compare each of them with 0: if a number is larger than 0, keep the original number, otherwise use 0 instead. The result should go into a new column.
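
A short sketch of one way to do this: pmax() takes the element-wise maximum, so comparing against 0 replaces every negative value with 0. The data frame and column names here are made up for illustration.

```r
df <- data.frame(value = c(-1, 2, 0, -3.5, 4))

# Element-wise maximum of each number and 0
df$value_floor0 <- pmax(df$value, 0)

print(df)
```

An equivalent spelling is `ifelse(df$value > 0, df$value, 0)`.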

lmerTest::step() not accounting for possible interactions among fixed effects in full model

I am attempting to use the step() function in lmerTest to generate the most plausible model based on my dataset for a particular response variable. In this case my full model looks like this:

smi.length.full <- lmer(smi_tuk ~ exp.time + expgroup + length + mortality.type + (1|exp.call) + (1|cap.location), data = df.clin.week.snake)

When I run step() on this model it generates the following

Backward reduced random-effect table:

                   Eliminated npar logLik     AIC     LRT Df Pr(>Chisq)    
<none>                           8 190.10 -364.20                          
(1 | cap.location)          1    7 189.77 -365.54  0.6584  1     0.4171    
(1 | exp.call)              0    6 178.54 -345.08 22.4557  1  2.151e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Backward reduced fixed-effect table:
Degrees of freedom method: Satterthwaite 

               Eliminated   Sum Sq  Mean Sq NumDF   DenDF F value    Pr(>F)    
expgroup                1 0.000005 0.000005     1   6.600  0.0019 0.9662526    
mortality.type          2 0.000884 0.000884     1  11.856  0.3572 0.5613209    
exp.time                0 0.028747 0.028747     1 128.312 11.5323 0.0009109 ***
length                  0 0.046541 0.046541     1  21.425 18.6708 0.0002900 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Model found:
smi_tuk ~ exp.time + length + (1 | exp.call)

This all appears fine. However, I know from building and selecting a series of other models by hand that an interaction between experimental time and experimental group should really be included in the model that step() is generating.

Does the step() function not test for interactions unless you specify that you would like it to do so for a particular pair of columns? If so, is there a way to speed up the process of designing complicated models with this method, to ensure that potentially significant interaction terms are not missed?
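
As far as I understand it, step() in lmerTest performs backward elimination, so it can only drop terms that are already in the model it is given; the full model above contains main effects only. A sketch of one way to start from a richer model is to build a formula with all two-way interactions using the `^2` formula operator. Only base R is run here; the lmer() and step() calls are left as comments because they need the original data.

```r
fixed <- c("exp.time", "expgroup", "length", "mortality.type")

# "(a + b + c + d)^2" expands to all main effects plus all two-way interactions
two_way_rhs <- paste0("(", paste(fixed, collapse = " + "), ")^2")
candidate_terms <- attr(terms(as.formula(paste("~", two_way_rhs))), "term.labels")
print(candidate_terms)

# Same random effects as in the question's full model
full_formula <- as.formula(
  paste("smi_tuk ~", two_way_rhs, "+ (1 | exp.call) + (1 | cap.location)")
)
print(full_formula)

# With the data available, the richer model could then be reduced:
#   library(lmerTest)
#   smi.length.int <- lmer(full_formula, data = df.clin.week.snake)
#   step(smi.length.int)
```

Whether a model with every two-way interaction is estimable depends on the data; it may be more sensible to add only the interactions that are plausible a priori, such as exp.time:expgroup.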

R/Crosstalk: how can one dot in plotly scatterplot refer to two points in leaflet map?

I have a tibble containing data for a scatterplot and a map which should interact via crosstalk. Each dot in the scatterplot (x=distance, y=diff_factor) refers to a pair of two locations (unique 'pair_indicator', school_1_lat, school_1_long, school_2_lat, school_2_long).

head(df_main)

#> # A tibble: 6 x 7
#>   pair_indicator school_1_lat school_2_lat school_1_long school_2_long distance
#>   <chr>                 <dbl>        <dbl>         <dbl>         <dbl>    <dbl>
#> 1 902021-902071          48.2         48.2          16.4          16.4     475.
#> 2 902031-902071          48.2         48.2          16.4          16.4     379.
#> 3 902031-902091          48.2         48.2          16.4          16.4     204.
#> 4 902081-902101          48.2         48.2          16.4          16.4     396.
#> 5 902081-902241          48.2         48.2          16.4          16.4     317.
#> 6 902101-902241          48.2         48.2          16.4          16.4     390.
#> # ... with 1 more variable: diff_factor <dbl>

The idea is that when I click on one dot in the scatter plot, both locations of the corresponding pair are highlighted in the leaflet map via crosstalk. Unfortunately, the code below doesn't work: selecting a dot in the scatter plot has no effect. Interestingly, if I modify the map to show only one location (omitting the other part of the pair), the selection works as intended.

Any idea how I can highlight both locations? Many thanks.

library(crosstalk)
library(plotly)
library(leaflet)
library(leaflet.extras)

df_main <- structure(list(pair_indicator = c(
  "902021-902071", "902031-902071",
  "902031-902091", "902081-902101", "902081-902241", "902101-902241"
), school_1_lat = c(
  48.2149026, 48.2221631, 48.2221631, 48.2205238,
  48.2205238, 48.223532
), school_2_lat = c(
  48.2186668, 48.2186668,
  48.22407, 48.223532, 48.2231059, 48.2231059
), school_1_long = c(
  16.3876769,
  16.3848147, 16.3848147, 16.4049726, 16.4049726, 16.402895
), school_2_long = c(
  16.3853998,
  16.3853998, 16.3847968, 16.402895, 16.4063758, 16.4063758
), distance = c(
  475.2529684845,
  379.0377208616, 203.6641400739, 395.8339316511, 316.9061306841,
  390.1434258679
), diff_factor = c(
  1.96024763767, 2.27963849016,
  2.27354570637, 1.3, 1.57575757576, 1.21212121212
)), class = c(
  "tbl_df",
  "tbl", "data.frame"
), row.names = c(NA, -6L))

shared_df_main <- SharedData$new(df_main,
  # key=~pair_indicator,
  group = "pair_indicator"
) # creates shared object

# > scatter plotly --------------------------------------------------------
library(plotly)
plot_distance_diff <-
  plot_ly(
    data = shared_df_main,
    mode = "markers",
    type = "scatter"
  ) %>%
  add_trace(
    marker = list(color = "blue"),
    x = ~distance,
    y = ~diff_factor,
    showlegend = F
  )

# > map -------------------------------------------------------------------
# >> leaflet --------------------------------------------------------------
cross_map <- leaflet(
  data = shared_df_main,
  width = "100%",
  height = 400
) %>%
  addCircles(
    lat = ~school_1_lat,
    lng = ~school_1_long,
    # layerId = ~pair_indicator,
    color = "#008000"
  ) %>%
  addCircles(
    lat = ~school_2_lat,
    lng = ~school_2_long,
    # layerId = ~pair_indicator,
    color = "#FF0000"
  ) %>%
  addProviderTiles(providers$Stamen.TonerLite)

# > combine graphs -----------------------------------------------------------
crosstalk::bscols(plot_distance_diff, cross_map)

# Works; using only one location -------------------------------------------------------------------
cross_map_1 <- leaflet(
  data = shared_df_main,
  width = "100%",
  height = 400
) %>%
  addCircles(
    lat = ~school_1_lat,
    lng = ~school_1_long,
    # layerId = ~pair_indicator,
    color = "#008000"
  ) %>%
  addProviderTiles(providers$Stamen.TonerLite)

# > combine graphs -----------------------------------------------------------
crosstalk::bscols(plot_distance_diff, cross_map_1)




How to make two legends using ggplot2?

I am rather new to plotting graphs in R. I'm trying to create two legends similar to the picture below. The picture shows data from the previous year; I'm updating it but need to rewrite the code. So far my plot looks identical to the bottom one (with updated data), except that the horizontal lines are not showing up in the legend. I need two legends as shown below, but I only have the Year part set up, and I am confused about how to change the name, line type and color of the horizontal lines while putting them in a legend separate from the Year group.

Example of my data:

# Groups:   Year [6]
Year  Month  Temp
<fct> <ord> <dbl>
1 2014  Mar    14.9
2 2014  Apr    16.6
3 2014  May    20.5
4 2014  Jun    22.0
5 2014  Jul    23.9
6 2014  Aug    24.3
7 2014  Sep    24.4
8 2014  Oct    22.1
9 2014  Dec    13.6
10 2015  Jan    11.5
# ... with 46 more rows

My code so far:

 ggplot(wq4, aes(x=Month,y=Temp, color = Year)) +
 geom_line(aes(color = Year, group = Year)) + 
 labs(y = "Average Temperature (C)") +
 geom_hline(aes(yintercept = 5), color = "red", linetype="dashed") +
 geom_hline(aes(yintercept = 15), color = "dark green", linetype="dashed") 


Analyzing homogeneity of variances in R with Residuals vs Fitted plot

I am fairly new to R and I have just performed a nested ANOVA on my data. I am trying to plot residuals versus fitted values for this model. Below are my code and the resulting plot. Is this the proper way to do this and, if so, can I assume equal variances for my two sites based on my results?

Code:

plot(Q1_data.lme,main="Residuals versus Fitted Values Plot")

Picture of residual versus fitted plot

R dplyr function putting mutate, top_frac and ifelse together

I'm looking for a way to mutate a new column that flags the top and bottom 20% of values using dplyr.

Here is my code; it isn't working for me.

DF1 <- DF %>%
  group_by(Timepoint) %>%
  filter (!is.na (log2_Concentration)) %>%
  arrange (desc(log2_Concentration)) %>%
  mutate (top_bottom=ifelse (log2_Concentration=top_frac(.2), "TOP20PERC",
          ifelse (log2_Concentration=top_frac(-.2), "BOTTOM20PERC", "MID")))

My hope is to assign, per timepoint, the top 20%, the bottom 20% and the rest as MID, so I can color these points in my ggplot.

Thanks a lot gurus!
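
One possible way to get the labels, sketched in base R: compute the 20% and 80% quantiles within each timepoint and compare every value against them. The helper name `label_top_bottom` and the toy data are made up; only the column names follow the question. Ties at the cut-offs are labelled TOP or BOTTOM, which is an arbitrary choice.

```r
# Label values above the (1 - frac) quantile or below the frac quantile.
label_top_bottom <- function(x, frac = 0.2) {
  lo <- unname(quantile(x, frac, na.rm = TRUE))
  hi <- unname(quantile(x, 1 - frac, na.rm = TRUE))
  ifelse(x >= hi, "TOP20PERC",
         ifelse(x <= lo, "BOTTOM20PERC", "MID"))
}

# Toy data: two timepoints with ten measurements each
DF <- data.frame(Timepoint = rep(c("T1", "T2"), each = 10),
                 log2_Concentration = c(1:10, seq(10, 100, by = 10)))
DF <- DF[!is.na(DF$log2_Concentration), ]

# Apply the helper separately within each timepoint
DF$top_bottom <- unsplit(
  lapply(split(DF$log2_Concentration, DF$Timepoint), label_top_bottom),
  DF$Timepoint
)

print(table(DF$Timepoint, DF$top_bottom))

# In a dplyr pipeline the same helper could be used as:
#   DF %>% group_by(Timepoint) %>%
#     mutate(top_bottom = label_top_bottom(log2_Concentration))
```

The original attempt cannot work as written, because `=` inside ifelse() is an argument assignment rather than the comparison `==`, and top_frac() returns rows of a data frame rather than a threshold value.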

R: getting hex colors from numeric values - how to define midpoint in gradient scale

I have a numeric vector and I would like to convert it to hex color codes. The colors should follow a gradient distribution from its possible minimum (red; 0), via a mid value which I define (the mean, black), to its possible max (green; 1).

With ggplot I would use the scale_*_gradientn function. But now I need the actual hex values, and I am struggling to calculate them.

library(tidyverse)
#> Warning: package 'dplyr' was built under R version 3.6.2

data <- data.frame("a"=runif(100),
                   "b"=runif(100))


# ggplot example ----------------------------------------------------------

data <- data.frame("a"=runif(100),
                   "b"=runif(100))

mean_a <- mean(data$a)

ggplot(data)+
  geom_point(aes(x=a,
                 y=b,
                 color=a),
             stat="identity")+
  scale_color_gradientn(colors=c("red","black","green"),
                        values=c(0, mean_a, 1))+
  theme(legend.position = NULL)

Mapping the scale_color_gradientn function is apparently not the way forward:

data %>% 
  mutate(color_values=map(a, scale_color_gradientn, 
                          colors=c("red","black","green"),
                          values=c(0, mean_a, 1))) %>% 
  head()
#>           a         b                      color_values
#> 1 0.2863037 0.9902960 <environment: 0x000000001d002f30>
#> 2 0.6169960 0.9527580 <environment: 0x000000001d038798>
#> 3 0.3126825 0.8807853 <environment: 0x000000001d06e098>
#> 4 0.5464037 0.2307841 <environment: 0x000000001d0a39a8>
#> 5 0.5162976 0.8147066 <environment: 0x000000001d0d92a8>
#> 6 0.7519632 0.6821084 <environment: 0x000000001d10cc98>

Created on 2020-02-17 by the reprex package (v0.3.0)

I came across this SO entry on the colorRamp function; however, it does not seem to provide any option to define a manual 'mid' point.

I also came across this post on the colorspace package, which allows for the definition of a midpoint. However, again, I fail to apply it outside of ggplot.

Grateful for any hint!
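
A sketch of one way to get hex codes with base R only: colorRamp() places its colors at equal spacing on [0, 1], so with three colors the middle one sits at 0.5. Rescaling the data piecewise, so that the chosen midpoint lands on 0.5, moves the black point to wherever it is wanted. The function name `value_to_hex` is hypothetical, and values outside [low, high] are not handled.

```r
value_to_hex <- function(x, mid, low = 0, high = 1,
                         colors = c("red", "black", "green")) {
  stopifnot(low < mid, mid < high)
  # Map [low, mid] onto [0, 0.5] and [mid, high] onto [0.5, 1]
  pos <- ifelse(x <= mid,
                0.5 * (x - low) / (mid - low),
                0.5 + 0.5 * (x - mid) / (high - mid))
  ramp <- colorRamp(colors)          # returns an n x 3 matrix of RGB values (0-255)
  rgb(ramp(pos), maxColorValue = 255)
}

set.seed(1)
a <- runif(5)
hex <- value_to_hex(a, mid = mean(a))
print(data.frame(a = a, hex = hex))

# The anchors behave as expected: low is red, mid is black, high is green
anchors <- value_to_hex(c(0, 0.3, 1), mid = 0.3)
print(anchors)
```

Note that colorRamp() interpolates in RGB space by default, so the intermediate shades may differ slightly from what a ggplot2 gradient scale shows.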

How can I create a timeline of multiple subjects without using dates in R?

I have a dataset of subjects that I want to compare based on Days After Treatment rather than the actual dates of their treatments and follow ups.

My thought process is that it would be way easier to visualize if all the subjects would start at the same point and end at the same point rather than be spread across multiple years due to subjects starting at different times.

Is there a way to do this in R? I've looked at vistime, which looks promising except that start/end are supposed to be dates.

Here's an example of what my data looks like:

df <- data.frame(Patient = c(1,1,1,2,2,2,3,3,3),
                 Response = c("PR", "CR", "CR", "SD", "SD", "PD", "PR", "PR", "CR"),
                 Start = rep(c("Day 30", "Day 90", "Day 180")),
                 End = rep(c("Day 90", "Day 180", "Day 270")))

Example Data
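
One option that avoids dates altogether, as a rough sketch: strip the "Day " prefix to get numeric days, then draw one horizontal segment per response interval with base graphics. The column names follow the example data; the helper `plot_timeline`, the added `start_day`/`end_day` columns and the color choices are made up.

```r
df <- data.frame(Patient = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 Response = c("PR", "CR", "CR", "SD", "SD", "PD", "PR", "PR", "CR"),
                 Start = rep(c("Day 30", "Day 90", "Day 180"), times = 3),
                 End = rep(c("Day 90", "Day 180", "Day 270"), times = 3),
                 stringsAsFactors = FALSE)

# "Day 30" -> 30: drop the prefix and convert to numeric
df$start_day <- as.numeric(sub("^Day ", "", df$Start))
df$end_day   <- as.numeric(sub("^Day ", "", df$End))

# One row per patient on the y axis, days after treatment on the x axis
plot_timeline <- function(d) {
  response_cols <- c(CR = "darkgreen", PR = "steelblue", SD = "orange", PD = "red")
  plot(NULL, xlim = range(c(d$start_day, d$end_day)),
       ylim = c(0.5, max(d$Patient) + 0.5),
       xlab = "Days after treatment", ylab = "Patient", yaxt = "n")
  axis(2, at = sort(unique(d$Patient)))
  segments(d$start_day, d$Patient, d$end_day, d$Patient,
           col = response_cols[as.character(d$Response)], lwd = 8, lend = "butt")
  legend("topright", legend = names(response_cols), col = response_cols,
         lwd = 8, bty = "n")
}

print(df[, c("Patient", "Response", "start_day", "end_day")])
# plot_timeline(df)   # uncomment to draw the chart
```

The same numeric columns could also be passed to ggplot2's geom_segment() if a ggplot version is preferred.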

Error installing RCurl and XML packages on Windows

I am trying to install the RCurl and XML packages and getting an error on Windows. I have tried R 2.15.0 and 2.15.1, cran.r-project.org and www.omegahat.org/R, and binary and source. Any suggestions? Thanks.

install.packages('RCurl',repos='http://www.omegahat.org/R', type='source')
Installing package(s) into ‘C:/R/R-2.15.0/library’
(as ‘lib’ is unspecified)
trying URL 'http://www.omegahat.org/R/src/contrib/RCurl_1.95-1.tar.gz'
Content type 'application/x-gzip' length 862526 bytes (842 Kb)
opened URL
downloaded 842 Kb

* installing *source* package 'RCurl' ...
Please set LIB_CURL
cygwin warning:
  MS-DOS style path detected: C:/R/R-2.15.0/library/RCurl/libs
  Preferred POSIX equivalent is: /cygdrive/c/R/R-2.15.0/library/RCurl/libs
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
** libs
Warning: this package has a non-empty 'configure.win' file,
so building only the main architecture

cygwin warning:
  MS-DOS style path detected: C:/R/R-215~1.0/etc/i386/Makeconf
  Preferred POSIX equivalent is: /cygdrive/c/R/R-215~1.0/etc/i386/Makeconf
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
gcc  -I"C:/R/R-215~1.0/include" -DNDEBUG -Wall -I/include -DHAVE_LIBIDN_FIELD=1 -DHAVE_CURLOPT_URL=1 -DHAVE_CURLINFO_EFFECTIVE_URL=1 -DHAVE_CURLINFO_RESPONSE_CODE=1 -DHAVE_CURLINFO_TOTAL_TIME=1 -DHAVE_CURLINFO_NAMELOOKUP_TIME=1 -DHAVE_CURLINFO_CONNECT_TIME=1 -DHAVE_CURLINFO_PRETRANSFER_TIME=1 -DHAVE_CURLINFO_SIZE_UPLOAD=1 -DHAVE_CURLINFO_SIZE_DOWNLOAD=1 -DHAVE_CURLINFO_SPEED_DOWNLOAD=1 -DHAVE_CURLINFO_SPEED_UPLOAD=1 -DHAVE_CURLINFO_HEADER_SIZE=1 -DHAVE_CURLINFO_REQUEST_SIZE=1 -DHAVE_CURLINFO_SSL_VERIFYRESULT=1 -DHAVE_CURLINFO_FILETIME=1 -DHAVE_CURLINFO_CONTENT_LENGTH_DOWNLOAD=1 -DHAVE_CURLINFO_CONTENT_LENGTH_UPLOAD=1 -DHAVE_CURLINFO_STARTTRANSFER_TIME=1 -DHAVE_CURLINFO_CONTENT_TYPE=1 -DHAVE_CURLINFO_REDIRECT_TIME=1 -DHAVE_CURLINFO_REDIRECT_COUNT=1 -DHAVE_CURLINFO_PRIVATE=1 -DHAVE_CURLINFO_HTTP_CONNECTCODE=1 -DHAVE_CURLINFO_HTTPAUTH_AVAIL=1 -DHAVE_CURLINFO_PROXYAUTH_AVAIL=1 -DHAVE_CURLINFO_OS_ERRNO=1 -DHAVE_CURLINFO_NUM_CONNECTS=1 -DHAVE_CURLINFO_SSL_ENGINES=1 -DHAVE_CURLINFO_COOKIELIST=1 -DHAVE_CURLINFO_LASTSOCKET=1 -DHAVE_CURLINFO_FTP_ENTRY_PATH=1 -DHAVE_CURLINFO_REDIRECT_URL=1 -DHAVE_CURLINFO_PRIMARY_IP=1 -DHAVE_CURLINFO_APPCONNECT_TIME=1 -DHAVE_CURLINFO_CERTINFO=1 -DHAVE_CURLINFO_CONDITION_UNMET=1 -DHAVE_CURLOPT_KEYPASSWD=1 -DHAVE_CURLOPT_DIRLISTONLY=1 -DHAVE_CURLOPT_APPEND=1 -DHAVE_CURLOPT_KRBLEVEL=1 -DHAVE_CURLOPT_USE_SSL=1 -DHAVE_CURLOPT_TIMEOUT_MS=1 -DHAVE_CURLOPT_CONNECTTIMEOUT_MS=1 -DHAVE_CURLOPT_HTTP_TRANSFER_DECODING=1 -DHAVE_CURLOPT_HTTP_CONTENT_DECODING=1 -DHAVE_CURLOPT_NEW_FILE_PERMS=1 -DHAVE_CURLOPT_NEW_DIRECTORY_PERMS=1 -DHAVE_CURLOPT_POSTREDIR=1 -DHAVE_CURLOPT_OPENSOCKETFUNCTION=1 -DHAVE_CURLOPT_OPENSOCKETDATA=1 -DHAVE_CURLOPT_COPYPOSTFIELDS=1 -DHAVE_CURLOPT_PROXY_TRANSFER_MODE=1 -DHAVE_CURLOPT_SEEKFUNCTION=1 -DHAVE_CURLOPT_SEEKDATA=1 -DHAVE_CURLOPT_CRLFILE=1 -DHAVE_CURLOPT_ISSUERCERT=1 -DHAVE_CURLOPT_ADDRESS_SCOPE=1 -DHAVE_CURLOPT_CERTINFO=1 -DHAVE_CURLOPT_USERNAME=1 -DHAVE_CURLOPT_PASSWORD=1 -DHAVE_CURLOPT_PROXYUSERNAME=1 
-DHAVE_CURLOPT_PROXYPASSWORD=1 -DHAVE_CURLOPT_SSH_HOST_PUBLIC_KEY_MD5=1 -DHAVE_CURLOPT_NOPROXY=1 -DHAVE_CURLOPT_TFTP_BLKSIZE=1 -DHAVE_CURLOPT_SOCKS5_GSSAPI_SERVICE=1 -DHAVE_CURLOPT_SOCKS5_GSSAPI_NEC=1 -DHAVE_CURLOPT_PROTOCOLS=1 -DHAVE_CURLOPT_REDIR_PROTOCOLS=1 -DHAVE_CURLOPT_SSH_AUTH_TYPES=1 -DHAVE_CURLOPT_SSH_PUBLIC_KEYFILE=1 -DHAVE_CURLOPT_SSH_PRIVATE_KEYFILE=1 -DHAVE_CURLOPT_FTP_SSL_CCC=1 -DHAVE_CURLOPT_COOKIELIST=1 -DHAVE_CURLOPT_IGNORE_CONTENT_LENGTH=1 -DHAVE_CURLOPT_FTP_SKIP_PASV_IP=1 -DHAVE_CURLOPT_FTP_FILEMETHOD=1 -DHAVE_CURLOPT_LOCALPORT=1 -DHAVE_CURLOPT_LOCALPORTRANGE=1 -DHAVE_CURLOPT_CONNECT_ONLY=1 -DHAVE_CURLOPT_CONV_FROM_NETWORK_FUNCTION=1 -DHAVE_CURLOPT_CONV_TO_NETWORK_FUNCTION=1 -DHAVE_CURLOPT_CONV_FROM_UTF8_FUNCTION=1 -DHAVE_CURLOPT_MAX_SEND_SPEED_LARGE=1 -DHAVE_CURLOPT_MAX_RECV_SPEED_LARGE=1 -DHAVE_CURLOPT_FTP_ALTERNATIVE_TO_USER=1 -DHAVE_CURLOPT_SOCKOPTFUNCTION=1 
-DHAVE_CURLOPT_SOCKOPTDATA=1 -DHAVE_CURLOPT_SSL_SESSIONID_CACHE=1         -O3 -Wall  -std=gnu99 -mtune=core2 -c base64.c -o base64.o
In file included from base64.c:1:0:
Rcurl.h:4:23: fatal error: curl/curl.h: No such file or directory
compilation terminated.
make: *** [base64.o] Error 1
ERROR: compilation failed for package 'RCurl'
* removing 'C:/R/R-2.15.0/library/RCurl'
Warning in q("no", status = 1, runLast = FALSE) :
  cannot get info on 'C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\Rtmp2fO9aA/R.INSTALL4245e223b81/RCurl', reason 'Access is denied'
Warning in install.packages :
  running command 'C:/R/R-215~1.0/bin/i386/R CMD INSTALL -l "C:/R/R-2.15.0/library"   C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\RtmpclrXFX/downloaded_packages/RCurl_1.95-1.tar.gz' had status 1
Warning in install.packages :
  installation of package ‘RCurl’ had non-zero exit status

The downloaded source packages are in

C:\Documents and Settings\Administrator\Local Settings\Temp\RtmpclrXFX\downloaded_packages

convert a character column into numeric in R

I wonder why my conversion of the "t5" column was not successful.

The "t5" column is all character values. I want to convert it into a numeric column named "t5.num" in the tibble, leaving non-numeric values as NA.

My code is below: first I build the new column name, then I try to mutate the column, but it did not work.

d <- tibble(id = c(3, 7, 1, 10,100), t5 = c("10", "<1", "NA", "8","78"))
convert_column <- function(data, col_name) {
    new_col_name <- paste0(rlang::enquo(col_name),".num")
    data %>%
        mutate(new_col_name = as.numeric(!!col_name))
     }
d %>% convert_column("t5")

Can someone point out what is wrong with my code? thanks for your help!
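
A sketch of what seems to go wrong and one way around it: because the column name is passed as a string, enquo() is not needed; `new_col_name = ...` inside mutate() creates a column literally called new_col_name; and `!!col_name` unquotes the string "t5" itself, so as.numeric() is applied to the text "t5" rather than to the column. The base R version below sidesteps tidy evaluation; a plain data frame stands in for the tibble so the example has no package dependencies.

```r
convert_column <- function(data, col_name) {
  new_col_name <- paste0(col_name, ".num")
  # Non-numeric strings such as "<1" become NA; the coercion warning is silenced
  data[[new_col_name]] <- suppressWarnings(as.numeric(data[[col_name]]))
  data
}

d <- data.frame(id = c(3, 7, 1, 10, 100),
                t5 = c("10", "<1", "NA", "8", "78"),
                stringsAsFactors = FALSE)

result <- convert_column(d, "t5")
print(result)

# A dplyr spelling of the same idea, using := for a dynamic column name:
#   mutate(data, !!new_col_name := suppressWarnings(as.numeric(.data[[col_name]])))
```

The function works the same way on a tibble, since `[[<-` is supported there as well.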


probability and classification in svm function of e1071 package in R

I'm using SVM in e1071 package for binary classification. I'm using both the probability attribute, and the SVM predict classification to compare the results. What I'm puzzled by is that the predicted classification (0 or 1) of the predict function doesn't seem congruous with the actual probabilities listed in the attribute. For some very high probabilities for level 1, the SVM classification is level 0, and for some low probabilities for level 1, the SVM classification is level 1.

Here is some sample code and the results:

svm_model <- svm(as.factor(CHURNED) ~ .
                  , scale = FALSE
                  , data = train
                  , cost = 1
                  , gamma = 0.1
                  , kernel = "radial"
                  , probability = TRUE

    )
 test$Pred_Class <- predict(svm_model, test, probability = TRUE)
 test$Pred_Prob <- attr(test$Pred_Class, "probabilities")[,1]

Here are the results (rows have been reordered to show various examples):

CHURNED: is response variable that is being predicted

Pred_class: is the predicted class by SVM

Pred_Prob: is the predicted probability, based on which SVM makes classification?

CHURNED Pred_Class  Pred_Prob
    1   0   0.03968526    # --> makes sense
    1   0   0.03968526
    1   0   0.07033222
    1   0   0.11711195
    1   0   0.12477983
    1   0   0.12827296
    1   0   0.12829345
    1   0   0.12829345
    1   0   0.12829345
    1   0   0.12829444
    1   0   0.12829927
    1   0   0.12829927
    1   0   0.12831169
    1   0   0.12831169
    1   0   0.12831428
    1   1   0.13053475   # --> doesn't make sense. Prob is less than 0.5
    1   1   0.13053475
    1   1   0.13053475
    1   1   0.1305348
    1   1   0.1305348
    1   1   0.1305348
    1   1   0.1690807
    1   1   0.2206993
    1   1   0.2321171
    0   0   0.998289      # --> doesn't make sense. Prob is almost 1!
    0   0   0.9982887
    0   0   0.993133
    0   0   0.9898889
    1   0   0.9849951
    0   0   0.9849951
    1   0   0.546427
    0   0   0.5440994    # --> doesn't make sense. Prob is more than 0.5
    0   0   0.5437889
    1   0   0.5417848
    0   0   0.5284112
    0   0   0.5252177
    0   1   0.5180776   # --> makes sense but is not consistent with above example
    0   1   0.5180704
    1   1   0.5180436
    1   1   0.5180436
    0   1   0.518043

This result doesn't make sense to me at all. The predicted class and predicted probabilities don't match. I've checked to make sure that I'm referencing the right column from the "probabilities" attribute matrix:

 test$Pred_Class
  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [98] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
attr(,"probabilities")
             1         0
6442 0.2369796 0.7630204
6443 0.2520246 0.7479754
6513 0.2322581 0.7677419
6801 0.2309437 0.7690563
6802 0.2244768 0.7755232
6954 0.2322450 0.7677550
6968 0.2537544 0.7462456
6989 0.2352477 0.7647523
7072 0.2322308 0.7677692
...
...
...

Maybe I am interpreting the probability incorrectly?

Use apply() in R to find the median price per county in the housing data frame

I need to use sapply() or lapply() to calculate the median price for each county in the housing data frame. County and price are both columns, but I don't know how to group by county name.

This has probably been asked somewhere, but I've been searching for hours and can't find anything. Sorry, TIA
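
A minimal sketch of the usual pattern: split() breaks the price column into one vector per county, and sapply() then applies median() to each piece. The toy data and the lower-case column names `county` and `price` are assumptions; the real housing data frame may use different names.

```r
housing <- data.frame(
  county = c("Adams", "Adams", "Baker", "Baker", "Baker"),
  price  = c(100000, 150000, 200000, 250000, 400000),
  stringsAsFactors = FALSE
)

# One numeric vector of prices per county
prices_by_county <- split(housing$price, housing$county)

# Median of each vector, returned as a named numeric vector
median_price <- sapply(prices_by_county, median)
print(median_price)
```

Using lapply() instead of sapply() gives the same numbers as a list; tapply(housing$price, housing$county, median) is a one-step equivalent.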

Using tidyverse to dynamically mutate one variable from one grouped dataset from another dataset

Let's say I work with different classes (nodes, in my dataset) and I have thousands of students. Each student has their own math score, and I need to compare all individual scores with the group mean/sd. To deal with that, I have two different datasets. The first one is "a table". default table

This data frame is formed of several classes (nodes), their means, and their sd.

I also have another dataset composed of students' results, like this one:

students' results

I want to have another dataset in which I get all individual results (i.e., 11, 6, 10, etc) and subtract this result from all means in the first dataset. In the future, it will be needed to check all results and all nodes together.

In other words, for the first node (number 12 in the image), I will subtract 11 (student's result) from 68 (mean result), 6 (student's result) from 68 (mean), 10 (student's result) from 68 (mean), etc. Then I'll move to the second node (number 7 in the image) and do the same (subtract 11 from 74 (mean result), 6 from 74 (mean result), 10 from 74).

The final output I would like to have: Final output

I searched for other questions but I did not find any solution. Any help is valuable. I use tidyverse, and I would like to remain within the tidyverse environment. Thank you.

To reproduce:

> dput(default_table)
structure(list(node = structure(c(6L, 3L, 5L, 1L, 2L, 4L, 7L), .Label = c("4", 
"5", "7", "8", "10", "12", "13"), class = "factor"), t_mean = c(68.8219178082192, 
74.3260869565217, 83.0178571428571, 92.2108108108108, 98.3304347826087, 
88.6111111111111, 48.4), t_sd = c(14.4351088961341, 16.9448394654941, 
13.0272663858681, 12.2011483603603, 12.1775472144027, 14.5621088567959, 
10.4876948807826), vars = c(1, 1, 1, 1, 1, 1, 1), n = c(121, 
74, 92, 616, 191, 58, 7), mean = c(68, 74.6891891891892, 82.8369565217391, 
91.3944805194805, 97.738219895288, 88.0172413793103, 48.7142857142857
), sd = c(14.0226008048911, 16.1151045250761, 11.0426517498479, 
12.6758935948866, 12.0212336250146, 15.9169901273025, 8.63547500554709
), min = c(32, 32, 58, 36, 56, 44, 39), max = c(97, 113, 104, 
123, 128, 124, 60), range = c(65, 81, 46, 87, 72, 80, 21), se = c(1.27478189135374, 
1.87334284914993, 1.15127602962793, 0.510726307415094, 0.869825937534791, 
2.09000319547951, 3.26390275965596), q0_25 = c(59, 64, 74.75, 
84, 90, 80, 41.5), q0_5 = c(68, 73.5, 81.5, 92, 98, 87, 47), 
    q0_75 = c(80, 87.75, 92.25, 100, 106, 98.75, 56)), class = "data.frame", row.names = c(NA, 
-7L))


test_result <- data.frame(x = rnorm(100,10,2))
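
What is described amounts to a cross join: every node paired with every student result, followed by a subtraction. A base R sketch uses merge() with `by = NULL`, which returns the Cartesian product of the two tables. The toy values below (means 68 and 74, results 11, 6 and 10) are the rounded numbers from the description, not the full dput() output.

```r
default_table <- data.frame(node = c("12", "7"),
                            mean = c(68, 74),
                            stringsAsFactors = FALSE)
test_result <- data.frame(x = c(11, 6, 10))

# Cartesian product: one row per (node, student result) combination
all_pairs <- merge(default_table, test_result, by = NULL)

# Group mean minus the individual result
all_pairs$diff <- all_pairs$mean - all_pairs$x
print(all_pairs)

# Within the tidyverse, tidyr::crossing(default_table, test_result) or
# dplyr::cross_join(default_table, test_result) should give the same pairing,
# followed by mutate(diff = mean - x).
```

With 7 nodes and 100 students this yields 700 rows, so the long format stays manageable.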

Shiny App Date widget- start with blank field

I have a Shiny app date picker widget (dateInput), and I want the starting date to be blank- no date at all. Essentially I want the form to save nothing, or NA unless a user selects a date with the widget. I sort of got this to work by putting value = "" in the code, but I am having problems with a blank date writing to data on the server side, and I get warnings because "" is not in yyyy-mm-dd format. Can you help?

I did see this post: Setting date range picker start date to blank but this is a different widget than I am using. Thank you

dateInput("coap_injail_infodissemination_date", label = "Information Dissemination - Date Referred:", value = "", format = "mm/dd/yyyy", startview = "decade")
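
For the server side, one defensive option is to normalise whatever the widget returns into a Date before saving, mapping a blank field to NA. This is only a sketch: which "blank" representation actually arrives (NULL, a zero-length value, or "") is an assumption, so the hypothetical helper `blank_to_na_date` covers all three.

```r
# Convert a possibly blank date input into a Date, using NA for "no date".
blank_to_na_date <- function(x) {
  if (is.null(x) || length(x) == 0) return(as.Date(NA))
  first <- as.character(x[1])
  if (is.na(first) || first == "") return(as.Date(NA))
  as.Date(first)
}

print(blank_to_na_date(NULL))
print(blank_to_na_date(""))
print(blank_to_na_date(as.Date(character(0))))
print(blank_to_na_date("2020-02-17"))

# In the server function this could be used as, for example:
#   saved_date <- blank_to_na_date(input$coap_injail_infodissemination_date)
```

Storing NA rather than "" keeps the column a proper Date when the form data is written out.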


Installed R packages in Dockerfile won't be found when running container

I'm trying to install several R packages in a Python Docker image. I have this small Dockerfile:

# Python 3.7.5
FROM python:3.7.5
ENV PYTHONUNBUFFERED 1

# Install R 3.6
RUN echo 'deb http://cran.rstudio.com/bin/linux/debian buster-cran35/'>> /etc/apt/sources.list
RUN apt install dirmngr
RUN apt-key adv --keyserver keys.gnupg.net --recv-key 'E19F5F87128899B192B1A2C2AD5F960A256A04AF'
RUN apt update
RUN apt install -y r-base

# Install R dependencies
RUN R -e "install.packages('BiocManager', dependencies=TRUE, repos='http://cran.rstudio.com/')"

It doesn't throw any error. But when I run docker container exec -it <my container> bash and do:

Rscript -e 'installed.packages()' | grep BiocManager

There aren't any results. I don't know if this applies, but during the build it prints:

The downloaded source packages are in

'/tmp/Rtmp7jBLWQ/downloaded_packages'

Maybe the problem is that it's installing the packages in a temp folder. Is there any way to install R packages without basing the image on the r-base image and using install2.r?
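
Two points may help narrow this down. The "downloaded source packages" message refers to where the downloaded archives were saved, not to where the package gets installed. Also, install.packages() only issues a warning when an installation fails, so the Docker build can finish without an error even if nothing was installed. A small diagnostic sketch to run inside the container (the helper name `is_installed` is made up):

```r
# TRUE if the package is present in any library on the search path
is_installed <- function(pkg) pkg %in% rownames(installed.packages())

# Libraries this R session actually searches
print(.libPaths())

# Check the package from the question, plus one that always ships with R
print(is_installed("BiocManager"))
print(is_installed("stats"))

# To make the image build fail loudly when the install did not work,
# the Dockerfile step could end with a check such as:
#   RUN R -e "install.packages('BiocManager', repos='http://cran.rstudio.com/'); if (!requireNamespace('BiocManager', quietly = TRUE)) quit(status = 1)"
```

If `.libPaths()` differs between the build step and the running container, or the check reports FALSE right after the install step, the build log above the "downloaded source packages" line should show the underlying warning.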
