The problem
I am trying to wrap old C code into an R package.
So far, I have managed to get RStudio to compile the code, export the function, and everything in the function's body executes properly right until the end of the execution.
Then the R session aborts without an error message (other than RStudio's restart-session buttons).
My question is: why does this happen, and how do I fix it?
EDIT: I found a way to make it work (it's at the end of this post), but I still have no answer to either question.
Code bits
The wrapper looks like this:
#' @useDynLib mypackage main_
#' @export
cellid <- function(args) {
# Example input:
# args <- "cell -p ~/Projects/Colman/HD/scripts/cellMagick/data/images/parameters.txt -b /tmp/Rtmp7fjlFo/file2b401093d715 -f /tmp/Rtmp7fjlFo/file2b402742f6ef -o ~/Projects/Colman/HD/uscope/20200130_screen_act1_yfp/1/Position001/out"
argv <- strsplit(args, " ")[[1]] # Split arguments on single spaces
argc <- length(argv) # Get length
.C(main_, as.integer(argc), as.character(argv))
}
The actual function looks like:
int main_(int* argc, char* argv[]){
... DO A LOT OF STUFF ...
return 1;
}
What I've tried
As I said before, if I look for the function's effects on the expected system files, I find everything there. The only way I have found to prevent the R session from crashing is to raise an error before the return:
int main_(int* argc, char* argv[]){
... DO A LOT OF STUFF ...
error("everything works up to this point");
return 1;
}
I have tried many things, quite blindly. For example, changing int to void at the function definition and omitting the return statement does not help.
I used lldb as suggested here, and out of it got the following:
Process 7900 stopped
* thread #1, name = 'R', stop reason = signal SIGSEGV: invalid address (fault address: 0x0)
frame #0: 0x00007ffff7a946f5 libc.so.6`__strlen_avx2 + 21
libc.so.6`__strlen_avx2:
-> 0x7ffff7a946f5 <+21>: vpcmpeqb (%rdi), %ymm0, %ymm1
0x7ffff7a946f9 <+25>: vpmovmskb %ymm1, %eax
0x7ffff7a946fd <+29>: testl %eax, %eax
0x7ffff7a946ff <+31>: jne 0x7ffff7a947f0 ; <+272>
I am not able to interpret this; my C skills are basically zero.
Narrowing down the cause
I have only been able to narrow the problem down to the point where the C function ends and R must continue (I guess).
Perhaps the problem is in the pointer objects (argv and argc). I have noticed that argc and argv change their contents throughout the execution of the function (i.e. they are shortened, see below).
At the beginning of the function:
Input argument number (argc[0]): 9
Input argument (argv[i] for printf): cell -p ~/Projects/Colman/HD/scripts/cellMagick/data/images/parameters.txt -b /tmp/Rtmplr4Q90/file26f12a94a144 -f /tmp/Rtmplr4Q90/file26f175cd02e5 -o ~/Projects/Colman/HD/uscope/20200130_screen_act1_yfp/1/Position001/out
At the end of the function:
Input argument number (argc[0]): 1
Input argument (argv[i] for printf): cell
I am not sure whether this can be a problem, though: this side effect is a normal result of argument parsing by glib's g_option_context_parse.
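The printouts above come from a loop of roughly the following shape. This is a minimal sketch rather than the code in cell.c: the helper name print_args and its parameter n are hypothetical, and n is assumed to be the argument count saved before parsing, because argc[0] itself is shortened by the parser. Passing NULL to printf's %s is undefined behaviour, so the sketch checks for it explicitly.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical debugging helper (not part of cell.c).
   Prints argc[0] and the first n slots of argv to `out`.
   NULL slots are printed as "(null)" instead of being handed to %s.
   Returns the number of NULL slots found. */
static int print_args(FILE *out, const int *argc, char *const argv[], int n)
{
    int nulls = 0;

    assert(argc != NULL && argv != NULL);

    fprintf(out, "Input argument number (argc[0]): %d\n", argc[0]);
    fprintf(out, "Input argument (argv[i] for printf):");
    for (int i = 0; i < n; i++) {
        if (argv[i] == NULL) {
            fputs(" (null)", out);
            nulls++;
        } else {
            fprintf(out, " %s", argv[i]);
        }
    }
    fputc('\n', out);
    return nulls;
}
```

Looping up to the saved count n, rather than up to argc[0], is what makes slots that were emptied by the parser visible at all.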
Suspiciously, commenting out everything after the g_option_context_parse line still causes the segfault, but commenting out that line and everything after it does not.
Is R expecting to find argc and argv unchanged? Would it crash when it doesn't?
Unfortunately this is as far as I have been able to get on my own.
I have tried to keep this question short, but the complete source of the C program is available (main is defined at cell.c).
I'd appreciate your help. Let me know how I can improve my question, if it would help.
EDIT: a hacky solution
I found that g_option_context_parse changes argv in such a way that, when printed, it shows null values in every position after the first:
cell (null) (null) (null) (null) (null) (null) (null) (null)
This is not in conflict with glib's documentation.
But if, during the printing loop at the end of the function, I assign a non-null value to each position in the array (i.e. argv[i] = "";), the R session no longer aborts after the function finishes.
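A minimal sketch of that workaround as a standalone helper. The name fill_null_args and the parameter n are hypothetical and not taken from cell.c; n is assumed to be the original argument count, saved before parsing, since argc[0] is shortened by the parser.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of cell.c).
   Replaces every NULL entry in the first n slots of argv with a pointer
   to an empty string, and returns how many entries were replaced. */
static int fill_null_args(char *argv[], int n)
{
    static char empty[] = "";   /* static storage: still valid after returning */
    int replaced = 0;

    assert(argv != NULL);

    for (int i = 0; i < n; i++) {
        if (argv[i] == NULL) {
            argv[i] = empty;
            replaced++;
        }
    }
    return replaced;
}
```

Calling something like fill_null_args(argv, n) just before main_ returns would have the same effect as the assignment in the printing loop. The lldb trace above stops inside strlen with fault address 0x0, which is what a strlen call on a NULL pointer looks like. One plausible reading, which is an assumption and not something verified here, is that R reads each argv string back after .C returns and fails on the NULL entries.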
Even though this does "make it work", I don't really know why this happens, and my fix is much more of a hack than I'm comfortable with.
I'll leave the question open for now, hoping someone will provide a good explanation and a proper solution.