The `tidyr::unnest` method from the R language has an equivalent in pandas called `explode`, as explained in this very detailed answer.
I would like to know if there is an equivalent to the `tidyr::nest` method.
Example R code:
library(tidyr)
iris_nested <- as_tibble(iris) %>% nest(data=-Species)
The data column is a list-column, which contains data frames (this is useful for modelling for example, when running many models).
iris_nested
# A tibble: 3 x 2
Species data
<fct> <list<df[,4]>>
1 setosa [50 × 4]
2 versicolor [50 × 4]
3 virginica [50 × 4]
To access one element inside the data column:
iris_nested[1,'data'][[1]]
[...]
# A tibble: 50 x 4
Sepal.Length Sepal.Width Petal.Length Petal.Width
<dbl> <dbl> <dbl> <dbl>
1 5.1 3.5 1.4 0.2
2 4.9 3 1.4 0.2
3 4.7 3.2 1.3 0.2
4 4.6 3.1 1.5 0.2
5 5 3.6 1.4 0.2
6 5.4 3.9 1.7 0.4
7 4.6 3.4 1.4 0.3
8 5 3.4 1.5 0.2
9 4.4 2.9 1.4 0.2
10 4.9 3.1 1.5 0.1
# … with 40 more rows
Example python code:
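from sklearn import datasets
# as_frame=True returns pandas objects; .frame is the full DataFrame (features plus target)
iris = datasets.load_iris(as_frame=True).frame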
How can I nest this data frame in pandas:
- first, in a simpler form (on par with the pandas explode functionality) where the data column contains a simple list
- second, in a form where the data column contains data frames, as illustrated in the example above
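To make the two target shapes concrete, here is a minimal sketch of what I have in mind. It is only one possible approach, built on `groupby`, and not necessarily the idiomatic one. The toy frame and the column names `species`, `sepal_length` and `petal_length` are assumptions made for illustration; they are not the real iris data.

```python
import pandas as pd

# Toy stand-in for the iris data (illustrative values, hypothetical column names).
df = pd.DataFrame({
    "species": ["setosa", "setosa", "versicolor", "versicolor", "virginica"],
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3],
    "petal_length": [1.4, 1.4, 4.7, 4.5, 6.0],
})

# 1) Simple case: collapse one column into a list per group,
#    which is roughly the inverse of DataFrame.explode.
nested_lists = (
    df.groupby("species")["sepal_length"]
      .agg(list)
      .reset_index(name="data")
)

# 2) Data-frame case: keep one sub-frame per group in an object column.
keys, frames = [], []
for key, group in df.groupby("species"):
    keys.append(key)
    # Drop the grouping column, as nest(data = -Species) does in R.
    frames.append(group.drop(columns="species").reset_index(drop=True))
nested_frames = pd.DataFrame({"species": keys, "data": frames})

# Access the nested frame of the first group (compare iris_nested[1, 'data'] in R).
first = nested_frames.loc[0, "data"]
```

In this sketch `nested_lists.explode("data")` expands the list column back to one row per value, and `first` is an ordinary DataFrame that could be passed to a model-fitting function.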