I have a large sparse matrix that I need to compute correlations for, but I haven't been able to because:
- I can't convert the sparse matrix to a dense matrix due to R's memory limitations
- I tried using the packages bigstats and bigmemory, but R froze (on a Windows 10 laptop with 8 GB of RAM)
- There's no correlation function in R's Matrix package
Now, I want to ask whether it's possible to split a sparse matrix into two or three parts, convert each part to a dense matrix, correlate each dense matrix, cbind the two or three results into one matrix, and then export it to a text file.
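For illustration, here is a minimal sketch of a block-wise variant of that workflow. It swaps in a different technique from the one described above: instead of densifying each part, it computes the correlations of one block of columns against all columns from sparse cross-products and column means, so only a block-sized dense result exists at any time. The helper name sparse_cor_block, the block size, and the small random stand-in matrix are assumptions made for the example, not part of the original problem.

```r
library(Matrix)

# Hypothetical helper: correlate the columns in `idx` against every column
# of the sparse matrix `m` without forming a dense copy of `m`.
# Uses cov(x, y) = (sum(x * y) - n * mean(x) * mean(y)) / (n - 1).
sparse_cor_block <- function(m, idx) {
  n    <- nrow(m)
  mu   <- colMeans(m)                              # column means
  sdev <- sqrt((colSums(m^2) - n * mu^2) / (n - 1))  # column standard deviations
  # Cross-products of the block with all columns: length(idx) x ncol(m), dense
  xp   <- as.matrix(crossprod(m[, idx, drop = FALSE], m))
  covb <- (xp - n * tcrossprod(mu[idx], mu)) / (n - 1)
  covb / tcrossprod(sdev[idx], sdev)
}

# Small random stand-in for the real matrix (assumption for the demo)
set.seed(1)
m <- rsparsematrix(50, 7, density = 0.3)

# Process the columns in blocks of 3 and append each block to a text file
out    <- tempfile(fileext = ".txt")
blocks <- split(seq_len(ncol(m)), ceiling(seq_len(ncol(m)) / 3))
for (idx in blocks) {
  write.table(sparse_cor_block(m, idx), out, append = TRUE,
              row.names = FALSE, col.names = FALSE)
}
```

Each block is written as a set of rows rather than bound as columns; because a correlation matrix is symmetric, the file contents are the same either way. A column with zero variance would produce NaN entries in this sketch, and for a matrix with 500232 columns the complete output has roughly 2.5e11 entries, so the block size and the available disk space both need to be chosen with that in mind.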
What function can I use to split a sparse matrix into two or three parts, bearing in mind that the i and p slots of the sparse matrix are the same length and share the same Dim? Here is the structure of the matrix:
Formal class 'dgCMatrix' [package "Matrix"] with 7 slots
..@ i : int [1:73075722] ...
..@ p : int [1:73075722] 0 0 1 1 1 1 1 2 2 2 ...
..@ Dim : int [1:2] 500232 500232
..@ Dimnames:List of 2
.. ..$ : NULL
.. ..$ : NULL
..@ x : num [1:73075722] ...
..@ uplo : chr "L"
..@ factors : list()
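As a possible starting point for the splitting step, column indexing with `[` works directly on sparse matrices from the Matrix package and keeps each piece sparse. The sketch below assumes the split is by contiguous groups of columns; the helper name split_columns and the small random test matrix are made up for the example.

```r
library(Matrix)

# Hypothetical helper: split the columns of `m` into `k` contiguous groups
# (k >= 2) and return a list with one sparse sub-matrix per group.
split_columns <- function(m, k) {
  groups <- cut(seq_len(ncol(m)), breaks = k, labels = FALSE)
  lapply(split(seq_len(ncol(m)), groups),
         function(idx) m[, idx, drop = FALSE])
}

# Small random stand-in for the real matrix (assumption for the demo)
set.seed(42)
m     <- rsparsematrix(20, 10, density = 0.2)
parts <- split_columns(m, 3)

sapply(parts, ncol)                     # number of columns in each part
dense_first <- as.matrix(parts[[1]])    # densify one part at a time
```

Only the part currently being processed needs to be converted with as.matrix(), so the dense memory footprint is bounded by the largest part rather than by the whole matrix.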
The correlation output will be in this format:
            [,1]       [,2]       [,3]        [,4]
[1,]  1.00000000 -0.8343860  0.3612926  0.09678096
[2,] -0.83438600  1.0000000 -0.8154071  0.24611830
[3,]  0.36129256 -0.8154071  1.0000000 -0.51801346
[4,]  0.09678096  0.2461183 -0.5180135  1.00000000
[5,]  0.67411584 -0.3560782 -0.1056124  0.60987601
[6,]  0.23071712 -0.4457467  0.5117711  0.21848068
[7,]  0.49200080 -0.4246502  0.2016633  0.46971736
           [,5]       [,6]       [,7]
[1,]  0.6741158  0.2307171  0.4920008
[2,] -0.3560782 -0.4457467 -0.4246502
[3,] -0.1056124  0.5117711  0.2016633
[4,]  0.6098760  0.2184807  0.4697174
[5,]  1.0000000  0.2007979  0.7198228
[6,]  0.2007979  1.0000000  0.6965899
[7,]  0.7198228  0.6965899  1.0000000