Quantcast
Channel: Active questions tagged r - Stack Overflow
Viewing all articles
Browse latest Browse all 201839

R Beginner struggling with extremely messy XLSX

$
0
0

I got an XLSX with data from a questionnaire for my master thesis. The questions and answers for an interviewee are in one row in the second column. The first column contains the date.

The data of the second column comes in a form like this:

"age":"52","height":"170","Gender":"Female",...and so on

I started with:

test12 <- read_xlsx("Testdaten.xlsx")

library(splitstackshape)test13 <- concat.split(data = test12, split.col= "age", sep =",")

Then I got the questions and the answers as a column divided by a ":". For e.g. column 1: "age":"52" and column2:"height":"170". But the data is so messy that sometimes in the column of the age question and answer there is a height question and answer and for some questionnaires questions and answers double.

I would need the questions as variables and the answers as observations. But I have no clue how to get there. I could clean the data in excel first, but with the fact that columns are not constant and there are for e.g. some height questions in the age column I see no chance to do it as I will get new data regularly, formated the same way.

Here is an example of the data: A tibble: 5 x 2 partner.createdAt partner.wphg.info
<chr> <chr>
1 2019-11-09T12:13:11.099Z "{\"age_years\":\"50\",\"job_des\":\"unemployed\",\"height_cm\":\"170\",\"Gender\":\"female\",\"born_in\":\"Italy\",\"Alcoholic\":\"false\",\"knowledge_selfass\":\"5\",\"total_wealth\":\"200000\""
2 2019-11-01T06:43:22.581Z "{\"age_years\":\"34\",\"job_des\":\"self-employed\",\"height_cm\":\"158\",\"Gender\":\"male\",\"born_in\":\"Germany\",\"Alcoholic\":\"true\",\"knowledge_selfass\":\"3\",\"total_wealth\":\"10000\""
3 2019-11-10T07:59:46.136Z "{\"age_years\":\"24\",\"height_cm\":\"187\",\"Gender\":\"male\",\"born_in\":\"England\",\"Alcoholic\":\"false\",\"knowledge_selfass\":\"3\",\"total_wealth\":\"150000\""
4 2019-11-11T13:01:48.488Z "{\"age_years\":\"59\",\"job_des\":\"employed\",\"height_cm\":\"167\",\"Gender\":\"female\",\"born_in\":\"United States\",\"Alcoholic\":\"false\",\"knowledge_selfass\":\"2\",\"total_wealth\":\"1000000~ 5 2019-11-08T14:54:26.654Z "{\"age_years\":\"36\",\"height_cm\":\"180\",\"born_in\":\"Germany\",\"Alcoholic\":\"false\",\"knowledge_selfass\":\"5\",\"total_wealth\":\"170000\",\"job_des\":\"employed\",\"Gender\":\"male\""

Thank you so much for your time!


Viewing all articles
Browse latest Browse all 201839

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>