Day Time Numbers
6388 2017-02-01 10:43 R33
7129 2017-02-04 15:32 N39.0, N39.0, N39.0
9689 2017-02-17 08:54 S72.11, S72.11, S72.11, S72.11
6703 2017-02-02 18:55 R11
9026 2017-02-13 17:34 S06.0, S06.0, S06.0
5013 2017-01-25 00:33 J18.1, J18.1, J18.1, J18.1
5849 2017-01-29 17:57 I21.4, I21.4, I21.4
9245 2017-02-14 19:03 J18.0, J18.0, J18.0
1978 2017-01-09 21:23 K59.0
5021 2017-01-25 02:46 I47.1, I47.1, I47.1
9258 2017-02-14 20:19 S42.3
541 2017-01-03 11:44 I63.8, I63.8, I63.8
4207 2017-01-20 19:52 E83.58, E83.58, E83.58
8650 2017-02-11 18:39 R55, R55, S06.0, S06.0, R55
9442 2017-02-15 21:30 K86.1
4186 2017-01-20 18:27 S05.1
4231 2017-01-20 22:10 M17.9
6847 2017-02-03 11:45 L02.4
1739 2017-01-08 21:19 S20.2
3685 2017-01-18 09:56 G40.9
9497 2017-02-16 09:52 S83.6
2563 2017-01-12 20:47 M13.16, M25.56, M25.56
9731 2017-02-17 13:10 B99, B99, N39.0, N39.0
7759 2017-02-07 14:25 R51, G43.0, G43.0
368 2017-01-02 15:05 T83.0, T83.0, T83.0, N13.3, N13.6
I want to aggregate this df in a special way: for each day, I want to count how many rows have a Number starting with a given letter, e.g. "A". I want a new dataframe that looks like this:
Day GroupA GroupB GroupC .....
1 2017-01-01 2 2 0
2 2017-01-02 ..................
GroupA means Numbers starting with A. If there are multiple numbers starting with A in one single row, they should be counted as one. The class of my Number column is character.
> class(df[1,3])
[1] "character"
> df[1,3]
[1] "A41.8, A41.51, A41.51"
My problem is how to combine the aggregate command with the counts. My real df is a lot bigger (it covers more than 2 years), so I need an automated solution.
EDIT: See data down below
structure(list(Day= c("2017-01-07", "2017-01-23", "2017-01-08",
"2017-01-13", "2017-02-10", "2017-01-07", "2017-01-24", "2017-01-02",
"2017-01-03", "2017-01-06", "2017-01-11", "2017-01-21", "2017-01-13",
"2017-01-10", "2017-02-18", "2017-01-10", "2017-01-31", "2017-01-27",
"2017-01-23", "2017-01-13", "2017-02-10", "2017-01-09", "2017-01-23",
"2017-01-09", "2017-01-08"), Time= c("02:02", "14:51", "02:12",
"17:49", "00:00", "21:30", "22:28", "17:27", "12:14", "22:52",
"14:19", "11:40", "19:33", "04:01", "15:59", "14:57", "08:34",
"13:21", "02:01", "14:29", "20:17", "14:30", "02:34", "04:56",
"14:34"), Number= c("H10.9", "K85.80, K85.20, K85.80, K85.20",
"R09.1", "I10.90", "I48.9, I48.0, I48.9, I48.0", "A09.0, A09.0, R42, R42",
"H16.1", "K92.1, K92.1, K92.1", "K40.90, J12.2, J18.0, J96.01, J12.2",
"B99, J15.8, J18.0, J15.8", "S01.55", "M21.33", "I10.01, I10.01, J44.81, J44.81",
"S00.95", "B08.2", "S05.1", "M20.1", "G40.2, S93.40, S93.40",
"M25.51", "J44.19, J44.11, J44.19, J44.11", "G40.9, G40.2, G40.2",
"E87.1, E87.1, J18.0, J18.0", "I10.91", "R22.0", "S06.5, S06.5, S06.5, R06.88, S12.22"
)), .Names = c("Day", "Time", "Number"), row.names = c(1336L,
4687L, 1536L, 2737L, 8272L, 1507L, 4994L, 400L, 550L, 1305L,
2325L, 4292L, 2748L, 2008L, 9974L, 2113L, 6144L, 5433L, 4577L,
2697L, 8468L, 1883L, 4578L, 1783L, 1657L), class = "data.frame")
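To make the counting rule concrete (each starting letter counts at most once per row, and the rows are then summed per day), here is a minimal base R sketch. It is only an illustration of the intended result, not a tested solution for the full data: the four-row `df` is a small sample built from rows of the data above, and the helper names `letters_per_row`, `long` and `counts` are made up for this example.

```r
# Small sample taken from the data above (assumption: same Day/Number columns)
df <- data.frame(
  Day = c("2017-01-07", "2017-01-07", "2017-01-09", "2017-01-09"),
  Number = c("H10.9", "A09.0, A09.0, R42, R42",
             "E87.1, E87.1, J18.0, J18.0", "R22.0"),
  stringsAsFactors = FALSE
)

# Split each row into its codes, keep only the first letter, and drop
# duplicates so that a letter is counted at most once per row
letters_per_row <- lapply(
  strsplit(df$Number, ",\\s*"),
  function(codes) unique(substr(trimws(codes), 1, 1))
)

# Long format: one (Day, Group) pair per row and distinct letter
long <- data.frame(
  Day = rep(df$Day, lengths(letters_per_row)),
  Group = paste0("Group", unlist(letters_per_row)),
  stringsAsFactors = FALSE
)

# Cross-tabulate: one row per day, one column per letter group
counts <- as.data.frame.matrix(table(long$Day, long$Group))
counts <- data.frame(Day = rownames(counts), counts,
                     row.names = NULL, stringsAsFactors = FALSE)
counts
```

In this sample, "R" occurs in two different rows on 2017-01-09, so GroupR is 2 for that day, while the repeated "A09.0" within one row on 2017-01-07 gives GroupA a count of 1. Only letters that actually occur get a column, so groups that never appear would have to be added separately if an all-zero column is wanted.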