I am trying to convert a nested list (video_details_t) into a data frame. Most of the information in the nested list shouldn't be in the final data frame, just "tags" (and ideally "id"). The nested list has 252 elements and each element is structured like so:
$ :List of 4
..$ kind : chr "youtube#videoListResponse"
..$ etag : chr "\"Fznwjl6JEQdo1MGvHOGaz_YanRU/wjb97SA5L1u9pjKF_Wa4GYuJoks\""
..$ pageInfo:List of 2
.. ..$ totalResults : int 1
.. ..$ resultsPerPage: int 1
..$ items :List of 1
.. ..$ :List of 4
.. .. ..$ kind : chr "youtube#video"
.. .. ..$ etag : chr "\"Fznwjl6JEQdo1MGvHOGaz_YanRU/fJEMmhh4c330M-HX-dZXcMUN_R0\""
.. .. ..$ id : chr "Dod4hirL4IU"
.. .. ..$ snippet:List of 10
.. .. .. ..$ publishedAt : chr "2019-11-02T13:00:04.000Z"
.. .. .. ..$ channelId : chr "UCa92M881KJO0FqaOUb4xAqg"
.. .. .. ..$ title : chr "Making Hydrogen from Water (Ft: The DIY Science Guy)"
.. .. .. ..$ description : chr "In which JB attempts to make an electrolytic cell for making hydrogen gas after being inspired by The DIY Scien"| __truncated__
.. .. .. ..$ thumbnails :List of 5
.. .. .. .. ..$ default :List of 3
.. .. .. .. .. ..$ url : chr "https://i.ytimg.com/vi/Dod4hirL4IU/default.jpg"
.. .. .. .. .. ..$ width : int 120
.. .. .. .. .. ..$ height: int 90
.. .. .. .. ..$ medium :List of 3
.. .. .. .. .. ..$ url : chr "https://i.ytimg.com/vi/Dod4hirL4IU/mqdefault.jpg"
.. .. .. .. .. ..$ width : int 320
.. .. .. .. .. ..$ height: int 180
.. .. .. .. ..$ high :List of 3
.. .. .. .. .. ..$ url : chr "https://i.ytimg.com/vi/Dod4hirL4IU/hqdefault.jpg"
.. .. .. .. .. ..$ width : int 480
.. .. .. .. .. ..$ height: int 360
.. .. .. .. ..$ standard:List of 3
.. .. .. .. .. ..$ url : chr "https://i.ytimg.com/vi/Dod4hirL4IU/sddefault.jpg"
.. .. .. .. .. ..$ width : int 640
.. .. .. .. .. ..$ height: int 480
.. .. .. .. ..$ maxres :List of 3
.. .. .. .. .. ..$ url : chr "https://i.ytimg.com/vi/Dod4hirL4IU/maxresdefault.jpg"
.. .. .. .. .. ..$ width : int 1280
.. .. .. .. .. ..$ height: int 720
.. .. .. ..$ channelTitle : chr "Good and Basic"
.. .. .. ..$ tags :List of 8
.. .. .. .. ..$ : chr "DIY"
.. .. .. .. ..$ : chr "diyscienceguy"
.. .. .. .. ..$ : chr "diy science guy"
.. .. .. .. ..$ : chr "hydrogen electrolysis"
.. .. .. .. ..$ : chr "water splitting"
.. .. .. .. ..$ : chr "hydrogen generator"
.. .. .. .. ..$ : chr "Good and basic"
.. .. .. .. ..$ : chr "splitting molecules"
.. .. .. ..$ categoryId : chr "22"
.. .. .. ..$ liveBroadcastContent: chr "none"
.. .. .. ..$ localized :List of 2
.. .. .. .. ..$ title : chr "Making Hydrogen from Water (Ft: The DIY Science Guy)"
.. .. .. .. ..$ description: chr "In which JB attempts to make an electrolytic cell for making hydrogen gas after being inspired by The DIY Scien"| __truncated__
The final output should be a data frame with 252 rows (one for each of the 252 elements of video_details_t) and a column for each unique "tag" entry across all 252 elements. Here's what I've entered so far:
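library(purrr)  # provides map(), flatten(), and %>%
just_tags <- video_details_t %>%
  map("items") %>% flatten() %>%   # one "items" entry per video
  map("snippet") %>% map("tags")   # keep only the tags of each video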
This gets me a nested list with 252 elements, where each element contains all the tags for one video. So far so good. Next I use the following to convert it to a data frame:
library(dplyr)  # rbind_all() comes from dplyr (deprecated, see the warnings below)
df <- rbind_all(lapply(just_tags, data.frame))
This gives me a data frame with 2165 columns, one for every tag, exactly what I want. But the data frame only has 238 rows when it should have 252 (one for every element of just_tags). What is going on here? Is it deleting duplicate rows during the conversion?
I also get the following output:
Warning messages:
1: 'rbind_all' is deprecated.
Use 'bind_rows()' instead.
See help("Deprecated")
2: In bind_rows_(x, id = id) :
Unequal factor levels: coercing to character
3: In bind_rows_(x, id = id) :
binding character and factor vector, coercing into character vector
4: In bind_rows_(x, id = id) :
binding character and factor vector, coercing into character vector
5: In bind_rows_(x, id = id) :
binding character and factor vector, coercing into character vector
I'm assuming those don't matter for the output, since I think they're just converting the "tags" elements into characters instead of factors.
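On the missing rows, here is a minimal sketch of one possible cause, which the output above does not confirm: if some videos have no "tags" entry, their element in just_tags is NULL, and data.frame(NULL) has zero rows, so it contributes nothing when the pieces are bound together. The list just_tags_demo is a made-up stand-in for just_tags, not the real data.

```r
# Hypothetical stand-in for just_tags: three videos, one without any tags.
just_tags_demo <- list(
  list("DIY", "water splitting"),
  NULL,                              # a video whose snippet has no "tags" entry
  list("DIY", "hydrogen generator")
)

# Same conversion step as in the question, applied to the stand-in.
frames <- lapply(just_tags_demo, data.frame)

# Number of rows each element would contribute to the bound data frame.
rows_per_element <- vapply(frames, nrow, integer(1))

# Elements that contribute no row at all.
n_empty <- sum(rows_per_element == 0)
print(rows_per_element)
print(n_empty)
```

On the real list, sum(lengths(just_tags) == 0) would give the same count, which could be compared with the 14 rows that are missing (252 minus 238).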
If the conversion is deleting duplicate rows, is there a way to preserve them, say, by identifying each row with the "id" element from the original list? Each of the 252 elements has exactly one "id" element and it's unique so it could be used to delineate each of the 252 final output rows in the data frame.
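To make the target shape concrete, here is a rough sketch in base R of a data frame keyed by "id", with one TRUE/FALSE column per unique tag. The presence/absence layout is an assumption about what would be useful, video_details_demo is a made-up miniature of video_details_t, and the code assumes every element has exactly one entry under "items", as in the structure shown above.

```r
# Hypothetical miniature of video_details_t; the ids and tags are invented.
video_details_demo <- list(
  list(items = list(list(id = "vid_a",
                         snippet = list(tags = list("DIY", "water splitting"))))),
  list(items = list(list(id = "vid_b",
                         snippet = list(title = "no tags here")))),
  list(items = list(list(id = "vid_c",
                         snippet = list(tags = list("DIY")))))
)

# Pull the id and the tags (NULL when absent) out of each element.
ids  <- vapply(video_details_demo, function(x) x$items[[1]]$id, character(1))
tags <- lapply(video_details_demo, function(x) unlist(x$items[[1]]$snippet$tags))

# Every tag that occurs anywhere in the list.
all_tags <- sort(unique(unlist(tags)))

# One row per video, one logical column per tag; videos without tags
# become an all-FALSE row instead of disappearing.
tag_matrix <- do.call(rbind, lapply(tags, function(tg) all_tags %in% tg))
colnames(tag_matrix) <- all_tags

tag_df <- data.frame(id = ids, tag_matrix,
                     check.names = FALSE, stringsAsFactors = FALSE)
print(tag_df)
```

Because each row is built from its own list element, no row is dropped, and the "id" column identifies which video each row came from.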
Thanks so much for your help and please let me know if I can make something clearer!