I need to delete every occurrence of a symbol (the single quote) within comments in an XML file using R, and then save the result back to XML. I have to do this for thousands of XML files.
In each XML file the single quote appears more than 50 times, at different levels of the hierarchy (some in child nodes and some in sub-child nodes), but always within the comments.
I tried using the XML package in R. I started by reading in a single XML file, but didn't know how to proceed further.
library(XML)
library(xml2)
library(methods)
library(tidyverse)
# Read one XML file, dropping every single quote before parsing
filepath <- "C:/Users/PeriaPr/Desktop/repex1.xml"
onefile <- xmlTreeParse(gsub("'", "", readLines(filepath), fixed = TRUE), asText = TRUE)
xmlroot <- xmlRoot(onefile)
# Extract the text value of each child node
var <- xmlSApply(xmlroot, function(x) xmlSApply(x, xmlValue))
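The attempt above strips the quotes while reading but never writes anything back. A minimal sketch of a purely text-level variant follows. It assumes the single quote never occurs as XML syntax in these files (for example as an attribute delimiter), which holds for my example but is not guaranteed in general. The helper name strip_single_quotes and the folder path in the final comment are hypothetical, introduced only for illustration.

```r
# Hypothetical helper: remove every single quote from one XML file at the
# text level and write the result to `outfile` (defaults to overwriting).
# Assumption: the quote never appears as XML syntax, only inside text content.
strip_single_quotes <- function(infile, outfile = infile) {
  lines <- readLines(infile, warn = FALSE)
  # fixed = TRUE treats the pattern literally; useBytes = TRUE avoids
  # re-encoding problems with the windows-1252 content
  cleaned <- gsub("'", "", lines, fixed = TRUE, useBytes = TRUE)
  writeLines(cleaned, outfile, useBytes = TRUE)
  invisible(outfile)
}

# Small self-contained demonstration on a temporary file
demo_in <- tempfile(fileext = ".xml")
writeLines(c(
  '<?xml version = "1.0" encoding = "windows-1252"?><document id="myrepex.xml">',
  "<title><![CDATA[Part1 - 'Apple']]></title>",
  "</document>"
), demo_in)
demo_out <- tempfile(fileext = ".xml")
strip_single_quotes(demo_in, demo_out)
result <- readLines(demo_out)

# Applying it to a whole folder (the directory name is a placeholder):
# xml_files <- list.files("path/to/xml_dir", pattern = "\\.xml$", full.names = TRUE)
# invisible(lapply(xml_files, strip_single_quotes))
```

Working on the raw text leaves the XML declaration and the CDATA wrappers untouched, which a parse-and-reserialize round trip may not do. I am not sure this is the right approach when the quotes must be removed only inside specific nodes.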
Here is a reproducible example of my XML file. The single quotes (around Orange, Apple and Banana) need to be removed throughout this multi-level hierarchy. The quotes occur almost 50 times within one XML file, and I need to process (delete the single quotes from) thousands of XML files.
<?xml version = "1.0" encoding = "windows-1252"?><document id="myrepex.xml">
<action_step step_no="1.3.1.1">
<step>1</step>
<title><![CDATA[Part1 - 'Apple']]></title>
<start><![CDATA[2019/08/09 7:57:17]]></start>
<duration><![CDATA[0 Hr. 12 Min. 22 Sec.]]></duration>
<status><![CDATA[Passed]]></status>
</action_step>
<action_step step_no="1.4.1.1">
<step>2</step>
<title><![CDATA[Part2 - 'Orange']]></title>
<start><![CDATA[2019/08/09 8:09:39]]></start>
<duration><![CDATA[0 Hr. 32 Min. 55 Sec.]]></duration>
<status><![CDATA[Passed]]></status>
</action_step>
<action_step step_no="1.5.1.1">
<step>68</step>
<title><![CDATA[Part3 - 'Banana']]></title>
<start><![CDATA[2019/08/09 8:42:35]]></start>
<duration><![CDATA[0 Hr. 36 Min. 28 Sec.]]></duration>
<status><![CDATA[Passed]]></status>
</action_step>
<action_step2 secondchild="secondchild">
<action_step2subchild subchild="subchild">
<title><![CDATA[Part3 - 'Banana']]></title>
</action_step2subchild>
</action_step2>