Channel: Active questions tagged r - Stack Overflow

Map through a nested list using regex to remove entries of a character vector

I have a nested list (https://www.filehosting.org/file/details/841630/bribe.RData) which I want to transform into a tibble. The list contains some character vectors that differ in length from the other vectors. This is due to a web-scraping problem that could not be fixed in an earlier question.

Since all extra character strings in these vectors follow a specific pattern, I wrote a regex to find and remove them. These extra strings only appear in the 5th character vector of each sublist. How can I map over all those sublists and vectors to get rid of the extra strings at once?

The mapped version should apply the following removal to the 5th character vector of every sublist. Minimal example for the first sublist:

# Remove the stray block: three CR/LF runs, some text, then two CR/LF runs
bribe.test <- stringr::str_remove(bribe.info[[1]][[5]],
                                  "(\\r\\n\\s*){3}.+(\\r\\n\\s*){2}")

How can I do this? Is transpose() in combination with simplify_all() an option? If so, how exactly would I use it in this case?

In the final tibble, character vector x of every sublist should be combined into one column, i.e. the x-th vectors of all sublists together form column x of the data frame.
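To make the target structure concrete, here is a minimal sketch on a toy list. The object `toy` and the column names `title` and `text` are hypothetical stand-ins, not the real contents of bribe.info, and the sketch uses transpose() followed by unlist() rather than simplify_all(). It assumes the vectors within each sublist already have matching lengths.

```r
library(purrr)
library(tibble)

# Hypothetical stand-in for bribe.info: two sublists, two character vectors each.
toy <- list(
  list(c("t1", "t2"), c("d1", "d2")),
  list(c("t3"),       c("d3"))
)

# transpose() regroups by position: element x now holds vector x of every sublist.
by_position <- transpose(toy)

# Flatten each group into a single character vector, i.e. one future column.
cols <- map(by_position, unlist)

# Column names are assumptions made up for this example.
result <- as_tibble(set_names(cols, c("title", "text")))
```

Each column ends up with one row per scraped entry across all sublists, which only works once every position has the same total length.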

UPDATE: To make this more accessible, here is the output of bribe.info[[1]][[5]]:

[1] "\r\n Hello sir, My uncle just coming india yesterday night at ahmedabad airport from New Zealand. And i gave him 2 iphone , iphone 8 plus and iphone 11 pro...Read more\r\n "
[2] "\r\n Date of the incident: 29th December 2019\nTime of incident: Around 8 PM in the evening\nPlace of incident: ECR road, Pondicherry to Tamil Nadu check pos...Read more\r\n "
[3] "\r\n Dear Sir,\n\nThis is not the first time I am facing this issue with Rohit Gas Agency. I tried to bring it to the notice of Indane. Its of no use. Rohit ...Read more\r\n "
[4] "\r\n \r\n \r\n How to get a LPG gas connection\r\n \r\n "
[5] "\r\n I paid bribe today to a police officer who came for passport verification of my mother. Even after providing all supporting documents and required inf...Read more\r\n "
[6] "\r\n I have asked to pay bribe to avoid huge penalty for putting tent sheet on car windows. Police asked me to pay 1100 rs fine or pay bribe instead of tha...Read more\r\n "
[7] "\r\n Help desk officer prashant who are trapping people to make work done by giving bribes to higher officials at malakpet rto malakpet Hyderabad ...Read more\r\n "
[8] "\r\n Get free shipping when you buy the Revolution the great american electric cigarette machine, within the continental US from https://hardworkingproduct...Read more\r\n "
[9] "\r\n I Would like to Inform you that a lot of corruption is going on in the DC Office Bangalore Urban Dept. I am not paid bribe directly there is lot more...Read more\r\n "
[10] "\r\n Are you interested in selling one of your k1dney for a good amount of 14Crore 7 cr Advance kindly Contact us now 9663960578 .\n...Read more\r\n "
[11] "\r\n Are you interested in selling one of your k1dney for a good amount of 14Crore 7 cr Advance kindly Contact us now, as we are looking for k1dney donor, ...Read more\r\n

The regular expression needs to filter out element 4 of this vector, which works fine. But I have no idea how to do this for bribe.info[[2]][[5]], bribe.info[[3]][[5]], and so on with map.
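For illustration, a minimal sketch of what such a mapping could look like on a toy list. The object `toy` is a hypothetical stand-in for bribe.info, and dropping the emptied strings afterwards is an assumption about the desired result: str_remove() leaves an empty string behind for a fully matched element rather than shortening the vector.

```r
library(purrr)
library(stringr)

# Hypothetical stand-in for bribe.info: the 5th vector of the first
# sublist carries one extra string that matches the pattern.
toy <- list(
  list("a1", "b1", "c1", "d1",
       c("\r\n  text one\r\n  ",
         "\r\n  \r\n  \r\n  How to get a LPG gas connection\r\n  \r\n  ")),
  list("a2", "b2", "c2", "d2",
       c("\r\n  text two\r\n  "))
)

pattern <- "(\\r\\n\\s*){3}.+(\\r\\n\\s*){2}"

# Apply the removal to element 5 of every sublist, leaving the rest untouched.
cleaned <- map(toy, function(sublist) {
  x <- str_remove(sublist[[5]], pattern)
  sublist[[5]] <- x[x != ""]  # assumption: emptied entries should be dropped
  sublist
})

lengths(map(cleaned, 5))
```

After this step the 5th vector of each toy sublist has the same length as the other four, which is the precondition for binding everything into a tibble.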

