Quantcast
Channel: Active questions tagged r - Stack Overflow
Viewing all 206673 articles
Browse latest View live

Rank computation considering time stamp in grouped data

$
0
0

In my dataset, I have observations for several game players for several points in time. For each observation, I want to compute a rank for this user based on the number of points compared to the number of points of other users at this point in time. Therefore, it has to compare the points of this user of this observation with the number of points of all other users at their last (or time-wise closest in the past or exactly at the same second) observation of each of the others users.

Example data including the expected rank:

da = data.frame(player = c(1,1,1,2,2,2,3,3,3), date_sec = c(1451665633,1451665693,1451665721,1451665627,1451665692,1451665738,1451665626,1451665684,1451665765), points = c(100,150,200,130,140,230,80,90,100), rank = c(2,1,1,1,2,2,3,3,3))da  player   date_sec points rank1      1 1451665633    100    22      1 1451665693    150    13      1 1451665721    200    14      2 1451665627    130    15      2 1451665692    140    26      2 1451665738    230    27      3 1451665626     80    38      3 1451665684     90    39      3 1451665765    100    3

I know how to rank within groups, but I don't find a way to take the exact point in time into account here. A way would be to extract the day out of the timestamp and to group by day and player, but this is not not as accurate as I would like to have, because the rank can change several times within one day.

library(dplyr)da2 = mutate(da, day = as.Date(as.POSIXct(date_sec, origin="1970-01-01"))) %>%   group_by(player, day) %>%   mutate(my_ranks = order(order(points, day, decreasing=TRUE)))da2A tibble: 9 x 6# Groups:   player, day [3]  player   date_sec points  rank day        my_ranks<dbl>      <dbl>  <dbl> <dbl> <date>        <int>1      1 1451665633    100     2 2016-01-01        32      1 1451665693    150     1 2016-01-01        23      1 1451665721    200     1 2016-01-01        14      2 1451665627    130     1 2016-01-01        35      2 1451665692    140     2 2016-01-01        26      2 1451665738    230     2 2016-01-01        17      3 1451665626     80     3 2016-01-01        38      3 1451665684     90     3 2016-01-01        29      3 1451665765    100     3 2016-01-01        1

Next to the rank, I would also like to have a percentile rank based on the number of points, because there are different numbers of players active in different time periods.

Does anyone have an idea?


How do I fix a summarise function in dplyr package error?

$
0
0

I have some problems summarise function in "dplyr" package. This is the code.

library("dplyr")a <- read.csv("Number of subway passengers.csv",header = T, stringsAsFactor = F)a <- a[,c(-2,-3,-4,-5)]colnames(a)=c("Date","4-5","5-6","6-7","7-8","8-9","9-10","10-11","11-12","12-13","13-14","14-15","15- 16","16-17","17-18","18-19","19-20","20-21","21-22","22-23","23-24","0-1","1-2","2-3","3-4","Total")b <- summarise(a,mean_passenger=mean("Total",na.rm=TRUE)) 

After running the last code I have some error in summarise.

In mean.default("Total", na.rm = TRUE) : argument is not numeric or logical:returning NA

Why does this error occur?

I attach the result of using the function str.

> str(a)'data.frame':   16501 obs. of  26 variables: $ Date        : chr  "2019-11-01""2019-11-01""2019-11-01""2019-11-01" ... $ 4-5         : int  32 2 3 0 5 0 11 1 2 0 ... $ 5-6         : int  438 353 89 182 143 211 187 127 83 175 ... $ 6-7         : int  529 2019 152 852 161 1078 154 477 115 622 ... $ 7-8         : int  1612 4520 289 2926 288 4395 302 1044 219 1817 ... $ 8-9         : int  3405 9906 435 9348 482 13000 386 3662 366 5234 ... $ 9-10        : int  2360 6525 481 4124 631 6669 550 3510 494 3292 ... $ 10-11       : int  2377 3571 716 2064 768 2964 841 2593 843 2292 ... $ 11-12       : int  2853 2951 1090 1889 1359 2501 1686 2813 1262 2349 ... $ 12-13       : int  3334 3190 1073 1538 1531 2127 1781 2646 1583 2160 ... $ 13-14       : int  3545 3348 1367 1751 1937 2108 2059 2718 1868 2159 ... $ 14-15       : int  2850 3179 1782 1403 2466 1926 2405 2579 2303 2071 ... $ 15-16       : int  4606 3265 2235 1431 2821 1718 3125 2103 2479 1559 ...  $ 16-17       : int  4915 3575 2345 1218 3403 1778 3241 2010 2656 1777 ... $ 17-18       : int  7472 4191 3627 1249 5807 2396 3796 2033 3583 1599 ... $ 18-19       : int  11107 5445 7462 1486 10738 3746 4836 2582 5246 1776 ... $ 19-20       : int  5754 3882 2943 816 4680 2557 3192 1682 2709 1261 ... $ 20-21       : int  3920 2596 2249 439 3670 935 2107 675 1782 548 ... $ 21-22       : int  3799 2177 2199 288 4495 510 2452 512 1565 341 ... $ 22-23       : int  3369 1624 1460 296 4118 384 2407 380 1094 260 ... $ 23-24       : int  1678 912 640 202 2366 299 1394 323 596 153 ... $ 0-1         : int  228 478 62 47 271 75 236 143 66 73 ... $ 1-2         : int  2 39 0 1 1 0 6 10 1 1 ... $ 2-3         : int  0 0 0 0 0 0 0 0 0 0 ... $ 3-4         : int  0 0 0 0 0 0 0 0 0 0 ... $ Total       : int  70185 67748 32699 33550 52141 51377 37154 34623 30915 31519 ...

R - group data in dataframe by two columns and rearrange into new dataframe

$
0
0

I have a dataframe foo:

   A  B  C1  x  m  22  x  n  33  x  o  114  y  m  55  y  o  8

Edit: thanks to @akrun for reminding me, I add the code to produce this df:

foo <- structure(list(A = c("x", "x", "x", "y", "y"),        B = c("m", "n", "o", "m", "o"), C = c(2L, 3L, 11L, 5L, 8L)),        class = "data.frame", row.names = c("1", "2", "3", "4", "5"))

and from it I want to create a new data frame, with the values in A as row names, the values in B as column names, and as entries the values in C, like so:

   m  n  ox  2  3  11y  5  NA 8

So far I found the aggregate() function

aggregate(.~A, foo, FUN=toString)

which gives me sth like this (it converts the letters in B to numbers though) and is not what I need here:

   A  B      C1  x  1,2,3  2,3,112  y  1,3    5,8

I found answers on SO suggesting the dplyr package and group_by for similar problems, but I couldn't wrap my head around how to adapt this to my problem. Sorry if I overlooked something, thanks for any suggestions!

Using values from slider in javascript DataTables calculations

$
0
0

I've got a nested DataTable in my shiny app that is made from the data below. There are two sliders that I have which make up a percentage of 100. If one slider is 50 the other sider is 50. These two numbers from the two sliders help make up the Spot:30(%) and the Spot:15(%) columns of the child table.

There is another column, Mix(%), where the user is able to go in and edit the numbers. When the user edits this column the numbers in the Spot:30(%) and the Spot:15(%) columns are suppose to be updated accordingly.

The equations are:

Spot:30(%) = (Mix(%) * slider_value1)/100

Spot:15(%) = (Mix(%) * slider_value2)/100

Is there a way to use slider values from the Shiny app in the JS callback script to update the Spot:30(%) and the Spot:15(%) columns when the Mix(%) column is edited by the user??

I've attempted to try a solution like this one, example1, as well as trying to follow this, communicating with shiny via javascript, but can't seem to wrap my head around this.

Data

Parentstructure(list(Market = c("ABILENE-SWEETWATER", "ALBANY-SCHENECTADY-TROY, NY"), `Gross CPP` = c("$0", "$0"), `Gross CPM` = c("$0", "$0"),     `Historical Composite Gross CPP` = c("$0", "$0"), `Historical Composite Gross CPM` = c("$0", "$0")), .Names = c("Market", "Gross CPP", "Gross CPM", "Historical Composite Gross CPP", "Historical Composite Gross CPM"), row.names = c(NA, -2L), class = "data.frame")Childstructure(list(Market = c("ABILENE-SWEETWATER", "ABILENE-SWEETWATER", "ABILENE-SWEETWATER", "ABILENE-SWEETWATER", "ABILENE-SWEETWATER", "ABILENE-SWEETWATER", "ABILENE-SWEETWATER", "ABILENE-SWEETWATER", "ABILENE-SWEETWATER", "ABILENE-SWEETWATER", "ALBANY-SCHENECTADY-TROY, NY", "ALBANY-SCHENECTADY-TROY, NY", "ALBANY-SCHENECTADY-TROY, NY", "ALBANY-SCHENECTADY-TROY, NY", "ALBANY-SCHENECTADY-TROY, NY", "ALBANY-SCHENECTADY-TROY, NY", "ALBANY-SCHENECTADY-TROY, NY", "ALBANY-SCHENECTADY-TROY, NY", "ALBANY-SCHENECTADY-TROY, NY", "ALBANY-SCHENECTADY-TROY, NY"), Daypart = c("Daytime", "Early Fringe", "Early Morning", "Early News", "Late Fringe", "Late News", "Prime Access", "Prime Time", "tv_2", "tv_cross_screen", "Daytime", "Early Fringe", "Early Morning", "Early News", "Late Fringe", "Late News", "Prime Access", "Prime Time", "tv_2", "tv_cross_screen"), `Mix (%)` = c(15, 10, 15, 10, 5, 5, 10, 10, 0, 0, 15, 10, 15, 10, 5, 5, 10, 10, 0, 0), `Spot:30 (%)` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `Spot:15 (%)` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `Gross CPP ($)` = c(18, 18, 16, 23, 24, 40, 26, 44, 0, 0, 77, 71, 61, 78, 109, 145, 93, 213, 0, 0), `Gross CPM ($)` = c(0, 0, 0, 0, 0, 0, 0, 0, 23, 13, 0, 0, 0, 0, 0, 0, 0, 0, 23, 13), `Historical Override CPP ($)` = c(18, 18, 16, 23, 24, 40, 26, 44, 0, 0, 77, 71, 61, 78, 109, 145, 93, 213, 0, 0), `Historical Override CPM ($)` = c(0, 0, 0, 0, 0, 0, 0, 0, 23, 13, 0, 0, 0, 0, 0, 0, 0, 0, 23, 13)), .Names = c("Market", "Daypart", "Mix (%)", "Spot:30 (%)", "Spot:15 (%)", "Gross CPP ($)", "Gross CPM ($)", "Historical Override CPP ($)", "Historical Override CPM ($)"), class = "data.frame", row.names = c(NA, -20L))

Code

# The datatable callbackparentRows <- which(Dat[,1] != "")callback_js = JS("function onUpdate(updatedCell, updatedRow, oldValue) {};",  sprintf("var parentRows = [%s];", toString(parentRows-1)),  sprintf("var j0 = %d;", colIdx),"var nrows = table.rows().count();","for(var i=0; i < nrows; ++i){","  if(parentRows.indexOf(i) > -1){","    table.cell(i,j0).nodes().to$().css({cursor: 'pointer'});","  }else{","    table.cell(i,j0).nodes().to$().removeClass('details-control');","  }","}","","// make the table header of the nested table","var format = function(d, childId){","  if(d != null){","    var html = ","'<table class=\"display compact hover\"'+","'style=\"padding-left: 30px;\" id=\"'+ childId +'\"><thead><tr>';","    for(var key in d[d.length-1][0]){","      html += '<th>'+ key +'</th>';","    }","    html += '</tr></thead><tfoot><tr>'","    for(var key in d[d.length-1][0]){","      html += '<th></th>';","    }","    return html +'</tr></tfoot></table>';","  } else {","    return '';","  }","};","","// row callback to style the rows of the child tables","var rowCallback = function(row, dat, displayNum, index){","  if($(row).hasClass('odd')){","    $(row).css('background-color', 'white');","    $(row).hover(function(){","      $(this).css('background-color', 'lightgreen');","    }, function() {","      $(this).css('background-color', 'white');","    });","  } else {","    $(row).css('background-color', 'white');","    $(row).hover(function(){","      $(this).css('background-color', 'lightblue');","    }, function() {","      $(this).css('background-color', 'white');","    });","  }","};","","// header callback to style the header of the child tables","var headerCallback = function(thead, data, start, end, display){","  $('th', thead).css({","'border-top': '3px solid green',","'color': 'black',","'background-color': 'white'","  });","};","","// make the datatable","var format_datatable = function(d, childId, rowIdx){","  // footer callback to display the totals","  // and update the parent row","  var footerCallback = function(tfoot, data, start, end, display){","    $('th', tfoot).css('background-color', '#F5F2F2');","    var api = this.api();","    api.columns().eq(0).each(function(index){","      if(index == 0) return $(api.column(index).footer()).html('Mix Total');","      var coldata = api.column(index).data();","      var total = coldata","          .reduce(function(a, b){return parseFloat(a) + parseFloat(b)}, 0);","      if(index == 3 || index == 4 ||index == 5 || index == 6 || index==7 || index==8) {","        $(api.column(index).footer()).html('');","      } else {","        $(api.column(index).footer()).html(total);","      }","      if(total == 100) {","        $(api.column(index).footer()).css({'color': 'green'});","      } else {","        $(api.column(index).footer()).css({'color': 'red'});","      }","    })","  // update the parent row","    var col_share = api.column(2).data();","    var col_CPP = api.column(5).data();","    var col_CPM = api.column(6).data();","    var col_Historical_CPP = api.column(7).data();","    var col_Historical_CPM = api.column(8).data();","    var CPP = 0, CPM = 0, Historical_CPP = 0, Historical_CPM = 0;","    for(var i = 0; i < col_share.length; i++){","      CPP += (parseFloat(col_share[i])*parseFloat(col_CPP[i]).toFixed(2));","      CPM += (parseFloat(col_share[i])*parseFloat(col_CPM[i]).toFixed(2));","      Historical_CPP += (parseFloat(col_share[i])*parseFloat(col_Historical_CPP[i]).toFixed(2));","      Historical_CPM += (parseFloat(col_share[i])*parseFloat(col_Historical_CPM[i]).toFixed(2));","    }","    table.cell(rowIdx, j0+2).data((CPP/100).toFixed(2));","    table.cell(rowIdx, j0+3).data((CPM/100).toFixed(2));","    table.cell(rowIdx, j0+4).data((Historical_CPP/100).toFixed(2));","    table.cell(rowIdx, j0+5).data((Historical_CPM/100).toFixed(2));","  }","  var dataset = [];","  var n = d.length - 1;","  for(var i = 0; i < d[n].length; i++){","    var datarow = $.map(d[n][i], function (value, index) {","      return [value];","    });","    dataset.push(datarow);","  }","  var id = 'table#'+ childId;","  if (Object.keys(d[n][0]).indexOf('_details') === -1) {","    var subtable = $(id).DataTable({","'data': dataset,","'autoWidth': true,","'deferRender': true,","'info': false,","'lengthChange': false,","'ordering': d[n].length > 1,","'order': [],","'paging': true,","'scrollX': false,","'scrollY': false,","'searching': false,","'sortClasses': false,","'pageLength': 50,","'rowCallback': rowCallback,","'headerCallback': headerCallback,","'footerCallback': footerCallback,","'columnDefs': [{targets: '_all', className: 'dt-center'}]","               });","  } else {","    var subtable = $(id).DataTable({","'data': dataset,","'autoWidth': true,","'deferRender': true,","'info': false,","'lengthChange': false,","'ordering': d[n].length > 1,","'order': [],","'paging': true,","'scrollX': false,","'scrollY': false,","'searching': false,","'sortClasses': false,","'pageLength': 50,","'rowCallback': rowCallback,","'headerCallback': headerCallback,","'footerCallback': footerCallback,","'columnDefs': [","              {targets: -1, visible: false},","              {targets: 0, orderable: false, className: 'details-control'},","              {targets: '_all', className: 'dt-center'}","             ]","          }).column(0).nodes().to$().css({cursor: 'pointer'});","  }","  subtable.MakeCellsEditable({","    onUpdate: onUpdate,","    inputCss: 'my-input-class',","    columns: [2, 7, 8],","    confirmationButton: {","      confirmCss: 'my-confirm-class',","      cancelCss: 'my-cancel-class'","    }","  });","};","","// display the child table on click","var children = [];", # array to store the id's of the already created child tables"table.on('click', 'td.details-control', function(){","  var tbl = $(this).closest('table'),","      tblId = tbl.attr('id'),","      td = $(this),","      row = $(tbl).DataTable().row(td.closest('tr')),","      rowIdx = row.index();","  if(row.child.isShown()){","    row.child.hide();","    td.html('&oplus;');","  } else {","    var childId = tblId +'-child-'+ rowIdx;","    if(children.indexOf(childId) === -1){", # this child table has not been created yet"      children.push(childId);","      row.child(format(row.data(), childId)).show();","      td.html('&CircleMinus;');","      format_datatable(row.data(), childId, rowIdx);","    }else{","      row.child(true);","      td.html('&CircleMinus;');","    }","  }","});")# Module to create the nested structure of the tableNestedData <- function(dat, children){  stopifnot(length(children) == nrow(dat))  g <- function(d){    if(is.data.frame(d)){      purrr::transpose(d)    }else{      purrr::transpose(NestedData(d[[1]], children = d$children))    }  }  subdats <- lapply(children, g)  oplus <- sapply(subdats, function(x) if(length(x)) "&oplus;" else "")  cbind("" = oplus, dat, "_details" = I(subdats), stringsAsFactors = FALSE)}# shiny    # Bind the market level and mix breakout data together for the final table    market_mix_table <- reactive({      markets <- market_costings_gross_net()      mix_breakout <- mix_breakout_digital_elements()      # Make the dataframe      # This must be met length(children) == nrow(dat)      Dat <- NestedData(        dat = markets,        children = split(mix_breakout, mix_breakout$Market)      )      return(Dat)    })    # Render the table    output$daypartTable <- DT::renderDataTable({      Server = FALSE      # Whether to show row names (set TRUE or FALSE)      rowNames <- FALSE      colIdx <- as.integer(rowNames)      # The data      Dat <- market_mix_table()      # Table      table <- DT::datatable(        Dat,        callback = callback_js,        rownames = rowNames,        escape = -colIdx-1,        style = "bootstrap4",          options = list(            lengthMenu = list(              c(-1, 10, 20),              c("All", 10, 20)            ),            columnDefs = list(              list(width = '30px'),              list(width = '100px', targets = 1),              list(visible = FALSE, targets = ncol(Dat)-1+colIdx),              list(orderable = FALSE, className = 'details-control', targets = colIdx),              list(className = "dt-center", targets = "_all")            )          )        )      # Some faancy Java magic      path <- getwd()      # Call the html tools deps (js & css files in this directory)      dep <- htmltools::htmlDependency("CellEdit", "1.0.19", path,        script = "dataTables.cellEdit.js", stylesheet = "dataTables.cellEdit.css")      table$dependencies <- c(table$dependencies, list(dep))      # server = FALSE      return(table)    })

R: possible truncation of >= 4GB file

$
0
0

I have a 370MB zip file and the content is a 4.2GB csv file.

I did:

unzip("year2015.zip", exdir = "csv_folder")

And I got this message:

1: In unzip("year2015.zip", exdir = "csv_folder") :  possible truncation of >= 4GB file

Have you experienced that before? How did you solve it?

Colored Points Disappearing When Creating an Animation in R using Plotly

$
0
0

I'm trying to create an animation in R using plotly where a point can change colors based on its state at the current time. Each point can transition from one phase to another at any time, and I think plotly is losing track of the point transitions due to the changing of colors.

In the code below, you'll see that the animation is smooth without any colors. But when the colors are included in the plot, the points start misbehaving during the animation - either they get stuck, they disappear, or they jump all over the plot.

Do you have any suggestions to fix this, or perhaps an alternate way to animate a shifting-state in R?

library(plotly)plot_colors <- c(red = "red", gray = "gray")df <-   data.frame(step = rep(1:10, 2),             id   = c(rep(1, 10), rep(2, 10)),             x    = c(1:10, 1:10),             y    = c(1:10, 10:1),             col  = sample(c("red", "gray"), 20, replace = TRUE)) # This animation is smooth  df %>% plot_ly(x = ~x, y = ~y, frame = ~step, mode = "markers", type = "scatter")# This animation has the points jumping all over the screen, and points disappeardf %>% plot_ly(x = ~x, y = ~y, frame = ~step, color = ~col,                colors = plot_colors, mode = "markers", type = "scatter")

Increase end argument in str_sub up until we get unique values in dataframe

$
0
0

Here is my toy data

df <- tibble::tribble(        ~names,"apple alpha","grapes beta","orange gamma",'graphite alpha',"a bc","apple gamma")

Here a sample code:

df_1 <- df %>%   mutate(first_word = word(names),         first_n_letters = str_sub(names, start = 1L, end = 4L) %>% str_trim(),         small_name = if_else(nchar(first_word) > nchar(first_n_letters), first_word, first_n_letters))

I want to keep increasing the end argument of str_sub in the above code up until the point when

n_distinct(df_1 %>% select(small_name)) == nrow(df_1)

In other words, the nchar(small_name) should be just big enough so that there are no duplicate small_name in the dataframe.

So, in this case, the iterator in the loop should stop at 7 as that will provide unique values of small_name in the df_1.

How to pivot_long pairs of date-var combinations with non-matching dates?

$
0
0

Here is my toy data.

    df <- tibble::tribble(    ~date1,      ~A Equity,  ~date2,          ~B Equity, ~date3,     ~C Equity,"1/29/2016",        35,  "10/31/2017",     67,       NA_character_,  NA_real_,"2/29/2016",        40,  "11/30/2017",     31,       NA_character_,  NA_real_,    NA_character_,NA_real_,  "12/29/2017",     56,       NA_character_,  NA_real_)

The real one has over 1000 columns and many more dates.

I want to long the data so that the desired output has only date, var, and value columns as shown below:

desired_df <- tibble::tribble(         ~date,   ~var,  ~value,"1/29/2016",  "A",      35,"2/29/2016",  "A",      40,"10/31/2017",  "B",      67,"11/30/2017",  "B",      31,"12/29/2017",  "B",      56)

I tried this, but am not getting the desired result:

df2 <- df %>%   pivot_longer(cols = contains("date"), names_to = "dates", values_to = "date") %>%   pivot_longer (cols = contains("Equity"), names_to = "var", values_to = "value") %>%   select(-dates) %>%   distinct() %>%   filter(!is.na(date))

Exporting Caret Resamples Results Summary to dataframe/csv

$
0
0

I have run several train objects and collected the resampling results via the resamples() function in caret. However, I would like to export those results to a data frame or ideally to a csv directly. Is there any simple way of doing this?

Thanks!

R Error message "Error in open.connection(con, "rb") : HTTP error 404"

$
0
0

I am using the package parlitools in R. This basically downloads UK election data so you can plot on a Map. I have a lot of code that does not now work as it can't download the data. The error message I get is "Error in open.connection(con, "rb") : HTTP error 404".

Does anyone know why I am now getting this error message. I am using the same computer as before. Alternatively, could someone try downloading the data in R to see if it is my internet connection or another issue potentially?

The following packages are used.
library(leaflet)library(sf)library(htmlwidgets)library(dplyr)library(parlitools)library(ggplot2)

This section of code should download the data needed with no error messages.

`west_hex_map <- parlitools::west_hex_map

party_col <- parlitools::party_colour

mps <- mps_on_date("2019-12-13")

mps_colours <- left_join(mps, party_col, by = "party_id")

Join to current MP data

west_hex_map <- left_join(west_hex_map, mps_colours, by = "gss_code")

Join colours to hexagon map`

Thanks for reading, James.

how can I translate c# binary files reading into R?

$
0
0

I have to translate a project from c# to R. In this c# project i have to handle binary files.I have three problems:

1.I am having some issues to convert this code:

//c#     //this work fineusing (BinaryReader rb = new BinaryReader(archive.Entries[0].Open())){   a = rb.ReadInt32();   b = rb.ReadInt32();   c = rb.ReadDouble();}#R#this work, but it reads different values#I tried to change the size in ReadBin, but it's the same story. The working diretory is the right oneto.read <- "myBinaryFile.tmp"line1<-c(readBin(to.read,"integer",2),          readBin(to.read,"double",1)) 
  1. How can I read float (in c# i have rb.ReadSingle()) in R?
  2. Is there in R a function to memorize the position that you have arrived when you are reading a binary file? So next time you will read it again, you could skip what you have already read (as in c# with BinaryReader)

Create in R a dataframe column that indicates how many times each row value has been sampled

$
0
0

I am using R to draw a random sample of 24 elements (species) out of a vector containing 36 elements. As I have to repeat this process several times, I would like to create a data frame in which the first column is the name of the species and the second is a counter of the number of times in which the species has been sampled. So, at the end of the first sampling i would have something like:

   Plot_1                      Freq   Agrostis castellana         1   Amaranthus hybridus         1   Ambrosia artemisiifolia     1   Bromus secalinus            0   ...                         ...

Then, after the second extraction I would have

   Plot_1                      Freq   Agrostis castellana         2      #extracted both in Plot1 and Plot2   Amaranthus hybridus         1      #extracted only in either Plot1 or Plot2   Ambrosia artemisiifolia     1   Bromus secalinus            0      #not extracted yet   ...                         ...

I've looked in the internet but I can't seem to find a solution :( Please help! Thank you very much!!

Problem installing packages in R "unable to create temporary directory"

$
0
0

I am having problems installing packages in Rstudio, I tried to install "nortest" among others and it keeps saying the following error:

> install.packages("nortest")Installing package into ‘C:/Users/meliv/Documents/R/win-library/3.6’(as ‘lib’ is unspecified)Warning in install.packages :'lib = "C:/Users/meliv/Documents/R/win-library/3.6"' is not writableprobando la URL 'https://cran.rstudio.com/bin/windows/contrib/3.6/nortest_1.0-4.zip'Content type 'application/zip' length 39063 bytes (38 KB)downloaded 38 KBWarning in install.packages :cannot create dir 'C:\Users\meliv\Documents\R\win-library\3.6\file408c43245306', reason 'No such file or directory'Error in install.packages : unable to create temporary directory ‘C:\Users\meliv\Documents\R\win-library\3.6\file408c43245306’

I tried unchecking the folder's "read-only" feature but made no difference.

Hope someone can help me. Thanks

Ps. I have Windows 10

How do I incorporate SE in place of SD in my bar chart error bars? Also, how do I change the order of my x-axis groups

$
0
0

I have created a bar chart displaying proportion of time spent on different behaviours for groups of lemurs. However I am placed with two problems.

1) I had hoped to use standard error bars in place of my standard deviation bars. I am unsure in how to incorporate it into my existing code.My current ggplot output is as follows:

 data_summary <- function(data, varname, groupnames){  require(plyr)  summary_func <- function(x, col){    c(mean = mean(x[[col]], na.rm=TRUE),      sd = sd(x[[col]], na.rm=TRUE),)  }  data_sum<-ddply(data, groupnames, .fun=summary_func,                  varname)  data_sum <- rename(data_sum, c("mean" = varname))  return(data_sum)}df4 <- data_summary(mydata_bc, varname="Time",                     groupnames=c("Group", "Behaviour"))p <- ggplot(df4, aes(x=Behaviour, y=Time, fill=Group)) +  geom_bar(stat="identity", position=position_dodge()) +  geom_errorbar(aes(ymin=Time-sd, ymax=Time+sd), width=.2,                position=position_dodge(0.9))

2) I also had hoped to change the order of my behaviours on the x axis.

Any help would be greatly appreciated.

Current bar chartCurrent bar chart

My csv data: https://drive.google.com/file/d/1UWJoluv3MWwXoQg2zcDORDJiWuIA8j4f/view?usp=sharing

Add Table Caption to RHandsontable in R

$
0
0

I have the following dataframe in R

  library(rhandsontable)  library(data.table) ###LIBRARIES IMPORTED

We next create the dataset

  DF = data.frame(val = 1:10, big = LETTERS[1:10],                            small = letters[1:10],                         stringsAsFactors = FALSE)

We now render a datatable with a caption

  datatable(DF,caption = htmltools::tags$caption(   style = 'caption-side: top; text-align: center;','Table 1: ', htmltools::em('Trial Data')))

Can a similar table caption be created using RHandsontable

  rhandsontable(DF, colHeaders = c("A", "B", "C")) %>%  hot_cols(colWidths = 75)

I Request someone to take a look.


Reorganize data frame elements depending on the content of the rows in R

$
0
0

I have this dataset:

df <- structure(list(V1 = c("B1D01", "B1D01", "B1D01", "B1D01", "B1D01", "B1D01", "U0155"), V2 = c("U0155", "U0155", "U0155", "U0155", "U0155", "U0155", "U3003"), V3 = c("U3003", "U3003", "C1B00", "U3003", "U3003", "U3003", "C1B00"), V4 = c("C1B00", "C1B00", "U0073", "C1B00", "C1B00", "C1B00", "P037D"), V5 = c("P037D", "P037D", NA, "P037D", "P037D", "P037D", "P0616"), V6 = c("P0616", "P0616", NA, "P0616", "P0616", "P0616", "P0562"), V7 = c("P0562", "P0562", NA, "P0562", "P0562", "P0562", "U0073"), V8 = c("U0073", "U0073", NA, "U0073", "U0073", "U0073", NA)), .Names = c("V1", "V2", "V3", "V4", "V5", "V6", "V7", "V8"), row.names = 1719:1725, class = "data.frame")

When I print(df):

        V1    V2    V3    V4    V5    V6    V7    V81719 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U00731720 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U00731721 B1D01 U0155 C1B00 U0073  <NA>  <NA>  <NA>  <NA>1722 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U00731723 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U00731724 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U00731725 U0155 U3003 C1B00 P037D P0616 P0562 U0073  <NA>

As you can observe, there is a mix in these codes. For instance, U3003 is primarily in V3, but it can also be shown in V2 (last row).

I would like to reorganize this data frame with these conditions:

  • Each code might be placed in one column.
  • Names of the column should be the name of the codes.
  • If there are more codes than 8 columns, number of columns might reflect number of codes.
  • The cell values might keep the name of the codes.
  • If the code is not present in a row, NA must appear.

Be aware that my original data frame contains much more rows than this small example extracted from the original.

How to identify subnetworks in adjacency matrix in R?

$
0
0

Assume I have a network graph "G" based on the following edges:

edges <- data.frame(   from=c(1,1,4,4,4,5,5,6),  to=  c(2,3,5,6,7,6,7,7))G <- graph_from_data_frame(d=edges,  directed=F) 

This example clearly contains 2 subnetworks, the first with nodes 1,2,3 and the second one with nodes 4,5,6,7. I would like to:

  1. Identify to which subnetwork node "i" belongs to.
  2. The number of nodes in each subnetwork.

Thus, in this example, the function will ideally create an object with as many rows as number of nodes in G, and two columns: the first contains a vector that indicatse the ID of the subnetwork and the second with the size (gsize) of the subnetwork. .

 result <- data.frame(   ID=c(1,1,2,2,2,2,2,2),  gsize=c(3,3,3,4,4,4,4,4))G <- graph_from_data_frame(d=edges,  directed=F) 

I am new using igraphs so maybe there is a function to do this. Thanks very much !!

Problems storing reactives in shiny R session$userData / scoping issue?

$
0
0

I am trying to assign a value inside a . I have two use cases and one seems to work "mod1" and the other "mod2" is not working

library(shiny)mod1UI = function(id){    ns = NS(id)    uiOutput(ns("selector"))}mod1 = function(input, output, session){    ns = session$ns     output$selector <- renderUI({selectInput(        ns("selected_table"),        label = "View Data",        choices = list(one = "one", two = "two"),        selected = "one"    )    })    session$userData$dataChoice = reactive({req(input$selected_table) })}mod2UI = function(id){    ns = NS(id)    DT::dataTableOutput(ns("mainTable_choice2"))}mod2 = function(input, output, session){    ns = session$ns    data <- reactive({        choice = session$userData$dataChoice()        #print(choice)        if(choice == "one"){            dta = iris        }else if(choice == "two"){            dta = Orange        }    })    output$mainTable_choice2 = DT::renderDataTable({        options = list(autoWidth = TRUE, searching = TRUE)        #browser()        data_ = data()        datatable(data_)    })    reactive({        data = data()        choice = session$userData$dataChoice()        if(choice == "one"){            session$userData$Var1 <- reactive({req(data[input$mainTable_choice2_rows_selected,1])})        }    })}# Define UI for applicationui <- fluidPage(    # Application title    titlePanel("Minimum Example"),    # Sidebar with a slider input for number of bins     sidebarLayout(        sidebarPanel(            mod1UI("mod1")        ),        mainPanel(            mod2UI("mod2")        )    ))server <- function(input, output, session) {    callModule(mod1, "mod1")    callModule(mod2, "mod2")    #swap Var1 for dataChoice and it works !!!    observeEvent(session$userData$Var1(), {showModal(modalDialog(        title = "Are you sure",        textInput("text",label = "test" , value = "testHL02"),        easyClose = TRUE,        footer = tagList(            modalButton("Cancel"),            actionButton("delete", "Delete")        )))})}# Run the application shinyApp(ui = ui, server = server)

I can access session$userData$dataChoice() inside an observer and a reactive in another module and obtain a result.

I cant call session$userData$Var1() in another observer. Instead I get the following error:

Error in observeEventExpr: attempt to apply non-function

This is exemplified when swapping dataChoice with Var1 in the observeEvent in the server function.

Does anyone know why?

ThanksJP

heatmaply: ajust the colour

$
0
0

I am using heatmaply to visualise single cell expression data.
I want to use custom colours and to assign "yellow" to expression levels > 2 , "magenta" to expression levels < -2. The range between [-2,2] should be represented by smooth spectre "magenta"-"black"-"yellow".

Here, my sample file is a matrix 10 rows x 5 columns:Matrix

my_palette<-PurpleAndYellow(50) heatmaply(test,col = my_palette, dendrogram = "none")

It produces correct "PurpleAndYellow" colour heatmap, however it scaled over whole range [-5, +5]

Heatmap1

If I specify scale_fill_gradient_fun, the colour of heatmap becomes "RdBu" spectre (Heatmap2):

heatmaply(ab, col = my_palette, dendroram="none",          scale_fill_gradient_fun = ggplot2::scale_fill_gradient2(            low = "magenta", mid="black",high = "yellow", midpoint = 0,             limits = c(-2, 2)))

Heatmap2

Could you let me know how to change colours and adjust brightness of heatmap using heatmaply ?

R 3.5.2: Pipe inside custom function - object 'column' not found

$
0
0

I am having issues with pipes inside a custom function. Based on the previous posts, I understand that a pipe inside a function creates another level(?) which results in the error I'm getting (see below).

I'm hoping to write a summary function for a large data set with hundreds of numeric and categorical variables. I would like to have the option to use this on different data frames (with similar structure), always group by a certain factor variable and get summaries for multiple columns.

library(tidyverse)data(iris)iris %>% group_by(Species) %>% summarise(count = n(), mean = mean(Sepal.Length, na.rm = T))# A tibble: 3 x 3  Species    count  mean<fct>      <int> <dbl>1 setosa        50  5.012 versicolor    50  5.943 virginica     50  6.59

I'm hoping to create a function like this:

sum_cols <- function (df, col) { df %>% group_by(Species) %>% summarise(count = n(), mean = mean(col, na.rm = T)) }

And this is the error I'm getting:

sum_cols(iris, Sepal.Length)Error in mean(col, na.rm = T) : object 'Petal.Width' not foundCalled from: mean(col, na.rm = T)

I have had this problem for a while and even though I tried to get answers in a few previous posts, I haven't quite grasped why the problem occurs and how to get around it.

Any help would be greatly appreciated, thanks!

Viewing all 206673 articles
Browse latest View live


<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>