I'm a complete beginner, and for a college project I need to analyse film scripts. I want to create a table that matches each character to their lines. My files are all in .txt format and I'd like to convert them to CSV files. I have a lot of scripts to go through, so I'd like to find code that can be easily adapted to the different files.
This is what I have:
THREEPIO
Did you hear that? They've shut
down the main reactor. We'll be
destroyed for sure. This is
madness!
THREEPIO
We're doomed!
THREEPIO
There'll be no escape for the
Princess this time.
THREEPIO
What's that?
And this is what I need to have:
"character" "dialogue"
"1" "THREEPIO" "Did you hear that? They've shut down the main reactor. We'll be destroyed for sure. This is madness!"
"2" "THREEPIO" "We're doomed!"
"3" "THREEPIO" "There'll be no escape for the Princess this time."
"4" "THREEPIO" "What's that?"
This is what I've tried:
# read the script, one element per line
# (assumption: the input file name is a placeholder for your own script)
sw = readLines("EpisodeVI_script.txt")
nlines = length(sw)
# strings of 30 and 15 blanks, used to detect the indentation
# of character names and dialogue lines
b30 = strrep(" ", 30)
b15 = strrep(" ", 15)
# the first 70 lines don't contain dialogues
# so we can start reading at line 70 (for instance)
i = 70
# while loop to extract character and dialogues
# (probably there's a better way to parse the file instead of
# using my crazy nested if-then-elses, but this works for me)
while (i <= nlines)
{
  # if empty line
  if (sw[i] == "") {
    i = i + 1 # next line
    next      # re-check the loop condition before reading sw[i] again
  }
  # if text line
  # if uninteresting stuff (line does not start with a blank)
  if (substr(sw[i], 1, 1) != " ") {
    i = i + 1 # next line
  } else {
    if (nchar(sw[i]) < 10) {
      i = i + 1 # next line
    } else {
      if (substr(sw[i], 1, 5) != " " && substr(sw[i], 6, 6) != " ") {
        i = i + 1 # next line
      } else {
        # if character name (30 blanks, then the name)
        if (substr(sw[i], 1, 30) == b30)
        {
          if (substr(sw[i], 31, 31) != " ")
          {
            tmp_name = substr(sw[i], 31, nchar(sw[i], "bytes"))
            cat("\n", file="EpisodeVI_dialogues.txt", append=TRUE)
            cat(tmp_name, "", file="EpisodeVI_dialogues.txt", sep="\t", append=TRUE)
            i = i + 1
          } else {
            i = i + 1
          }
        } else {
          # if dialogue (15 blanks, then the text)
          if (substr(sw[i], 1, 15) == b15)
          {
            if (substr(sw[i], 16, 16) != " ")
            {
              tmp_diag = substr(sw[i], 16, nchar(sw[i], "bytes"))
              cat("", tmp_diag, file="EpisodeVI_dialogues.txt", append=TRUE)
              i = i + 1
            } else {
              i = i + 1
            }
          } else {
            # anything else: skip it, otherwise the loop never ends
            i = i + 1
          }
        }
      }
    }
  }
}
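For comparison, here is a minimal sketch of the transformation described above. It is not a tested solution for every script. It assumes that a character name is a line written entirely in capitals (optionally with digits, spaces, dots, apostrophes or hyphens), and that every other non-empty line belongs to the most recent name. The function name parse_script, the sample vector and the output path are made up for illustration.

```r
# Hypothetical helper: turn script lines into a character/dialogue table.
# Assumption: an all-capitals line is a character name; any other
# non-empty line is dialogue spoken by the most recent name.
parse_script <- function(lines) {
  lines <- trimws(lines)            # drop indentation and trailing blanks
  lines <- lines[lines != ""]       # ignore empty lines
  is_name <- grepl("^[A-Z][A-Z0-9 .'-]*$", lines)
  # Each name starts a new block; cumsum gives every line its block id.
  block <- cumsum(is_name)
  keep <- block > 0                 # drop anything before the first name
  lines <- lines[keep]
  is_name <- is_name[keep]
  block <- block[keep]
  character <- lines[is_name]
  # Glue the dialogue lines of each block into one string.
  dialogue <- vapply(
    split(lines[!is_name],
          factor(block[!is_name], levels = seq_along(character))),
    paste, character(1), collapse = " "
  )
  data.frame(character = character, dialogue = unname(dialogue),
             stringsAsFactors = FALSE)
}

# Sample input copied from the excerpt above.
script <- c(
  "THREEPIO",
  "Did you hear that? They've shut",
  "down the main reactor. We'll be",
  "destroyed for sure. This is",
  "madness!",
  "THREEPIO",
  "We're doomed!",
  "THREEPIO",
  "There'll be no escape for the",
  "Princess this time.",
  "THREEPIO",
  "What's that?"
)
dialogues <- parse_script(script)

# Hypothetical output path; write.csv adds the row numbers by default.
out_file <- file.path(tempdir(), "dialogues.csv")
write.csv(dialogues, out_file)
```

For a real file, readLines() on the .txt file would supply the lines argument. Under this assumption, a dialogue line written entirely in capitals without other punctuation (a shouted single word, say) would be mistaken for a character name, so the pattern may need adjusting per script.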
Any help would be much appreciated! Thank you!!